Mar 19

Multiverse Computing Unleashes Compressed AI to the Mainstream

Originally reported by TechCrunch

With private company defaults now exceeding 9.2%—a multi-year high—venture capital firm Lux Capital recently advised AI-dependent companies to secure their compute capacity commitments in writing. Lux highlighted that as financial instability ripples through the AI supply chain, informal agreements are no longer sufficient.

However, an entirely different strategy is emerging: eliminating reliance on external compute infrastructure altogether. Smaller AI models, designed to run directly on a user’s device—bypassing data centers, cloud providers, and associated counterparty risks—are rapidly improving to a point where they warrant serious consideration. Multiverse Computing is positioning itself as a key player in this evolving landscape.

This Spanish startup has maintained a relatively low profile compared to some of its counterparts, but its visibility is increasing with the growing demand for AI efficiency. After successfully compressing models from major AI labs, including OpenAI, Meta, DeepSeek, and Mistral AI, the company has launched both an application that demonstrates the capabilities of its compressed models and an API portal, providing developers with a gateway to access and build with these models.

The CompactifAI app, which shares its name with Multiverse’s quantum-inspired compression technology, operates as an AI chat tool, similar to ChatGPT or Mistral’s Le Chat: users pose questions and receive answers from the model. The distinguishing feature is Multiverse’s embedded model, Gilda, which is so compact that it can run locally and entirely offline, according to the company.

For end-users, this offers a tangible experience of AI on the edge, ensuring data remains on their devices and eliminating the need for an internet connection. A crucial caveat, however, is the requirement for sufficient RAM and storage on mobile devices; many older iPhones, for example, may not meet these specifications. If device resources are inadequate, the app automatically switches to cloud-based models via an API. This seamless routing between local and cloud processing is managed by a system Multiverse has named Ash Nazg, a reference that will resonate with “The Lord of the Rings” fans. Yet, when the app routes to the cloud, it inherently forfeits its primary privacy advantage.
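The local-versus-cloud fallback described above can be sketched as a simple resource check. This is an illustrative sketch only, not Multiverse’s actual Ash Nazg implementation; the RAM and storage thresholds are hypothetical placeholders.

```python
# Hypothetical sketch of on-device vs. cloud routing: serve the request
# locally when the device has enough free RAM and storage, otherwise
# fall back to a cloud API. Thresholds below are assumptions.

LOCAL_MODEL_MIN_RAM_GB = 4.0   # assumed memory needed by the compact model
LOCAL_MODEL_MIN_DISK_GB = 2.0  # assumed on-device storage footprint

def route_request(prompt: str, free_ram_gb: float, free_disk_gb: float) -> str:
    """Return which backend would serve the request: 'local' or 'cloud'."""
    if free_ram_gb >= LOCAL_MODEL_MIN_RAM_GB and free_disk_gb >= LOCAL_MODEL_MIN_DISK_GB:
        return "local"  # fully offline; data stays on the device
    return "cloud"      # API fallback; the privacy advantage is forfeited

print(route_request("Summarize this note", free_ram_gb=6.0, free_disk_gb=10.0))  # local
print(route_request("Summarize this note", free_ram_gb=2.0, free_disk_gb=10.0))  # cloud
```

The key design point the article highlights is that the fallback is silent to the user, which is convenient but means the privacy guarantee only holds on sufficiently capable hardware.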

These limitations suggest that CompactifAI is not yet prepared for widespread consumer adoption, though mass market penetration may not be its ultimate goal. Data from Sensor Tower indicates the app recorded fewer than 5,000 downloads in the past month.

The true target audience for Multiverse is businesses. Today, the company is launching a self-serve API portal, granting developers and enterprises direct access to its compressed models without requiring intermediaries like the AWS Marketplace.

“The CompactifAI API portal now gives developers direct access to compressed models with the transparency and control needed to run them in production,” stated CEO Enrique Lizaso in a press release.

Real-time usage monitoring is a deliberate and key feature of the API. Beyond the potential benefits of edge deployment, reduced compute costs stand out as a primary reason why enterprises are increasingly considering smaller models as a compelling alternative to large language models (LLMs).
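Client-side, that kind of usage monitoring typically amounts to accumulating per-call token counts from API responses. The sketch below is a minimal illustration; the response shape (a `usage` object with `total_tokens`) mirrors common chat-completion APIs and is an assumption, not Multiverse’s documented schema.

```python
# Hypothetical client-side usage meter for a pay-per-token model API.
# The "usage"/"total_tokens" response fields are assumed, modeled on
# common chat-completion API conventions.

class UsageMeter:
    """Accumulates call counts and token totals across API responses."""

    def __init__(self) -> None:
        self.calls = 0
        self.total_tokens = 0

    def record(self, response: dict) -> None:
        """Record one API response's reported token usage."""
        self.calls += 1
        self.total_tokens += response.get("usage", {}).get("total_tokens", 0)

    def summary(self) -> str:
        return f"{self.calls} calls, {self.total_tokens} tokens"

meter = UsageMeter()
# Two simulated API responses (no network call is made here):
meter.record({"usage": {"total_tokens": 120}})
meter.record({"usage": {"total_tokens": 80}})
print(meter.summary())  # 2 calls, 200 tokens
```

For enterprises weighing compressed models against full-size LLMs, this sort of metering is what makes the cost comparison concrete.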

The increasing sophistication of small models further bolsters their appeal. Earlier this week, Mistral updated its small model family with the introduction of Mistral Small 4, which it claims is simultaneously optimized for general chat, coding, agentic tasks, and reasoning. The French company also released Forge, a system enabling enterprises to build custom models, including small models where they can precisely define the trade-offs best suited for their specific use cases.

Multiverse’s recent performance also suggests a narrowing gap with traditional LLMs. Its latest compressed model, HyperNova 60B 2602, is built upon gpt-oss-120b, OpenAI’s openly released model. The company asserts that HyperNova now delivers faster responses at a lower cost than its original source, a significant advantage, particularly for agentic coding workflows where AI autonomously completes complex, multi-step programming tasks.

Developing models that are both small enough for mobile devices and sufficiently useful presents a considerable engineering challenge. Apple Intelligence addresses this by combining an on-device model with a cloud model. While Multiverse’s CompactifAI app can also route requests to gpt-oss-120b via API, its core mission is to demonstrate that local models like Gilda and its future iterations offer distinct advantages that transcend mere cost savings.

For professionals in critical fields, a model capable of local operation without cloud connectivity provides enhanced privacy and resilience. However, the greater value lies in the diverse business use cases this technology can unlock—for instance, integrating AI into drones, satellites, and other environments where consistent network connectivity cannot be guaranteed.

The company currently serves over 100 global customers, including prominent entities like the Bank of Canada, Bosch, and Iberdrola. Expanding this customer base could facilitate further funding, particularly after securing a $215 million Series B round last year. Current market rumors suggest Multiverse is now pursuing a fresh €500 million funding round, potentially at a valuation exceeding €1.5 billion.

Editorial Staff, Editor

The Editorial Staff at AIChief is a team of professional content writers with extensive experience in AI and marketing. Founded in 2025, AIChief has quickly grown into the largest free AI resource hub in the industry.
