CelerData Opens Core StarRocks Features to Community at Inaugural Summit

October 6, 2025

editorial_staff

CelerData made a significant strategic shift this week by releasing two of its core commercial features as open source software during the first global StarRocks Summit in Menlo Park, California. The move signals the company's commitment to community-driven development as it positions itself for what executives describe as an AI-driven future.


More than 1,500 developers, enterprise users, and community contributors gathered at the summit to hear about developments in the real-time data analytics platform. The headline announcement centered on CelerData's decision to open source StarOS and multi-warehouse capabilities, features that were previously available only to paying customers.


StarOS serves as the foundational technology that allows StarRocks to separate computing power from data storage, an architectural approach that enables the platform to scale more efficiently. The multi-warehouse feature builds on this foundation by letting organizations create independent computing clusters while sharing the same underlying dataset across their entire StarRocks deployment.


According to Alvin Zhao, who chairs the StarRocks Technical Steering Committee and serves as CTO of CelerData, the decision reflects a fundamental belief about how data platforms should evolve. "We believe the future of data platforms cannot be built by one company alone," Zhao said during his keynote presentation. "It must be built by a community."


The practical implications of this architecture are significant for enterprises dealing with multiple types of data workloads. Organizations can now run real-time analytics alongside customer-facing dashboards and AI inference operations on the same data without creating duplicate copies or compromising performance across different use cases.


Alongside the open source announcement, CelerData previewed StarRocks version 4.0, which the company says will be the most substantial update since the project launched. The new version includes improved handling of JSON data, which has become increasingly important as organizations work with the semi-structured information that powers many artificial intelligence applications.


The update also brings vector search capabilities to general availability, enabling the similarity queries that underpin recommendation engines and generative AI systems. Additionally, StarRocks 4.0 deepens its integration with Apache Iceberg, allowing users to prepare and optimize data directly within the platform rather than relying on separate tools.
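For readers unfamiliar with the technique, a similarity query ranks stored items by how close their embedding vectors are to a query vector. The following is a minimal sketch in plain Python, not StarRocks code or its API: the function names, the toy catalog, and the three-dimensional vectors are all hypothetical, and it scans every item by brute force, whereas production systems typically rely on an approximate index.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector has no direction; treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_similar(query, items, k=2):
    """Score every item against the query and return the k closest (name, score) pairs."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in items.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical embeddings for illustration only; real embeddings come from a
# model and usually have hundreds of dimensions.
catalog = {
    "hiking boots": [0.9, 0.1, 0.0],
    "trail shoes": [0.8, 0.2, 0.1],
    "espresso maker": [0.0, 0.1, 0.9],
}

if __name__ == "__main__":
    for name, score in top_k_similar([1.0, 0.0, 0.0], catalog, k=2):
        print(f"{name}: {score:.3f}")
```

A recommendation engine built on this idea would embed a user's recent activity as the query vector and surface the nearest catalog items.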


CelerData claims these improvements have resulted in performance gains of 60 percent compared to the previous year, though the company attributes much of this progress to contributions from its open source community rather than internal development alone.


The timing of these announcements appears deliberate as the data analytics industry grapples with the growing influence of AI agents. Unlike human users who might run occasional queries, AI agents can process vast amounts of information simultaneously and require immediate access to current data across multiple formats and sources.


Zhao argued that StarRocks was designed with these requirements in mind, noting that the platform already delivers query responses in less than a second while working seamlessly with open data formats including Iceberg, Hudi, and Delta Lake. "Agents need more than just speed," he explained. "They need real-time context, concurrency at scale, and access to unified data sources."


The StarRocks project has grown substantially since its initial release in 2021. The open source community now includes more than 500 contributors and nearly 5,000 members in its Slack workspace. CelerData reports that over 500 enterprises worldwide use the platform, including major technology companies like Pinterest, Tencent, and Expedia.


This growth comes amid an increasingly competitive lakehouse analytics market, where companies are working to eliminate data silos by unifying business intelligence and AI workloads on shared platforms. The sector has attracted significant attention as organizations seek to reduce infrastructure complexity while preparing for AI-driven applications.


By open sourcing core commercial features, CelerData joins other data infrastructure companies that have embraced community-driven development models. The approach can accelerate innovation and adoption, but it also creates new challenges around monetization and competitive differentiation.


For CelerData, the strategy appears to be a bet that broader community adoption will drive demand for the commercial support and managed services the company offers alongside the open source platform. Its success will likely depend on the community's response and on the company's ability to maintain its technical leadership as development becomes more distributed.