Demand for the computing power needed to run AI models keeps climbing, but the industry faces two hurdles: securing the right specialized chips and getting them deployed in data centers so they can start generating revenue.
General Compute is an inference neocloud: it sells AI processing for the operational phase of models, when they are responding to users rather than being trained. Its answers to both problems offer a glimpse of where the AI ecosystem is headed, and they helped the company raise a $15 million seed round at a $60 million post-money valuation. FUSE VC led the round, with participation from Carya Venture Partners and Village Global Ventures.
The first challenge is choosing the right chip. Graphics Processing Units (GPUs) have seen unprecedented demand, but a growing consensus holds that they are not the best fit for running AI models once training is complete. Inference, the phase in which a model generates responses, has different computational requirements from training, and that has prompted a new category of purpose-built chips. Nvidia's $20 billion acquisition of Groq in December and Cerebras' $57 billion IPO last week underscore the shift.
With capacity constrained at those leading specialized chip companies, General Compute's co-founders, CEO Finn Puklowski and CTO Jason Goodison, looked elsewhere. They are using chips from SambaNova, an Intel-backed chipmaker focused on inference that has so far kept a relatively low profile in Silicon Valley.
SambaNova's profile may rise when it releases new chips later this year. The new architecture is more flexible and adds memory for retaining context during inference. SambaNova says the chips outperform not only GPUs but also specialized chips from competitors such as Groq and Cerebras. Puklowski says the new chips will generate 600 to 700 tokens per second, compared with roughly 250 tokens per second typical of GPUs.
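To put the quoted throughput figures in perspective, here is a minimal back-of-the-envelope sketch. The token rates come from Puklowski's statement above; the 10,000-token response size and the function names are hypothetical, chosen only for illustration, and real-world latency depends on much more than raw token rate.

```python
# Throughput figures quoted in the article.
GPU_TOKENS_PER_SEC = 250
NEW_CHIP_TOKENS_PER_SEC = (600, 700)  # low and high end of the quoted range

# Hypothetical response size, assumed purely for illustration.
RESPONSE_TOKENS = 10_000


def generation_seconds(tokens: int, tokens_per_sec: float) -> float:
    """Time to emit `tokens` at a steady rate, ignoring all other latency."""
    return tokens / tokens_per_sec


def speedup(new_rate: float, old_rate: float) -> float:
    """How many times faster the new rate is than the old one."""
    return new_rate / old_rate


if __name__ == "__main__":
    gpu_time = generation_seconds(RESPONSE_TOKENS, GPU_TOKENS_PER_SEC)
    print(f"GPU at {GPU_TOKENS_PER_SEC} tok/s: {gpu_time:.1f} s")
    for rate in NEW_CHIP_TOKENS_PER_SEC:
        chip_time = generation_seconds(RESPONSE_TOKENS, rate)
        factor = speedup(rate, GPU_TOKENS_PER_SEC)
        print(f"New chip at {rate} tok/s: {chip_time:.1f} s ({factor:.1f}x faster)")
```

On these figures alone, the claimed improvement works out to roughly 2.4 to 2.8 times the GPU rate.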
General Compute has ordered $300 million worth of SambaNova’s SN50 chips and expects to be the first neocloud to deploy them.
The chips also address General Compute's second challenge: deployment. They are air-cooled rather than water-cooled and draw less power, so they can be installed in existing data centers without costly new facilities.
Puklowski is pursuing colocation agreements, in which General Compute's hardware is hosted in third-party facilities, not only with traditional data center providers but also with cryptocurrency miners. Miners are increasingly looking to repurpose their infrastructure because the cost of producing Bitcoin has frequently exceeded its market price.
General Compute launched its cloud offering last week and claims it is currently the fastest platform for running MiniMax 2.7, an open-source large language model (LLM).
Joe Hassleman, a venture investor who got into the inference boom early with an investment in Groq in 2021, recently launched Evercrest Partners, a new fund dedicated to the AI sector. General Compute is the fund's first investment. Hassleman compares SambaNova’s partnership with General Compute to CoreWeave’s relationship with Nvidia and to Groq’s earlier integration of its chip business with its own cloud services.
“They do need a healthy mix of customers that are going to put their chips in environments that are going to have high growth to them,” Hassleman said of SambaNova. He added that the bet runs both ways: “As much as General Compute is making a bet on SambaNova, SambaNova is making a bet on General Compute.”
The open question is which computing architecture will capture the most value as AI evolves. Inference clouds are an implicit bet on a future with many models and autonomous agents, in which no single provider dominates and the speed and cost of inference become key competitive differentiators. OpenRouter's $113 million Series B this week supports that view: the company gives customers access to a wide range of models so they can optimize their token spending.
Speed matters here because it affects both pricing and capability. Puklowski wants to cut hour-long coding-agent workloads to five or ten minutes and to make audio agents for customer service more economical, since they need fast inference to hold a natural conversation. “If you use ChatGPT and it gives you 50 tokens per second, that’s still a heck of a lot faster than we can read,” Puklowski told TechCrunch. “Now that things have moved to agent-to-agent, where agents are out there reading on our behalf or pinging databases, they need to go faster.”
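As a rough illustration of what that goal implies, the sketch below computes the end-to-end speedup needed to shrink an hour-long workload to five or ten minutes. It assumes, purely for illustration, that workload time scales inversely with overall speed; real agent workloads also spend time on tool calls, network round trips, and other steps that faster inference alone does not shorten. The function name is hypothetical.

```python
def implied_speedup(before_minutes: float, after_minutes: float) -> float:
    """Speedup factor needed to shrink a workload from one duration to another.

    Assumes run time scales inversely with overall speed, which is a
    simplification: it ignores any time not spent on inference.
    """
    if after_minutes <= 0:
        raise ValueError("after_minutes must be positive")
    return before_minutes / after_minutes


if __name__ == "__main__":
    BEFORE = 60  # the hour-long workload mentioned in the article
    for target in (5, 10):  # the five- and ten-minute targets
        factor = implied_speedup(BEFORE, target)
        print(f"{BEFORE} min -> {target} min needs about a {factor:.0f}x speedup")
```

Under that simplifying assumption, the targets correspond to a speedup of about 6 to 12 times.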
The Editorial Staff at AIChief is a team of professional content writers with extensive experience in AI and marketing. Founded in 2025, AIChief has quickly grown into the largest free AI resource hub in the industry.