Organizations across industries are growing alarmed at the rising cost of artificial intelligence. Uber reportedly exhausted its entire 2026 AI coding budget by April. Microsoft rescinded its developers’ Claude Code licenses only months after deploying them. A Priceline employee told TechCrunch that a standard contract renewal for Cursor carried a price four to five times the previous rate.
Per-token prices have fallen, but pressure to adopt AI and the spread of increasingly autonomous agents have pushed overall token consumption to unprecedented levels. Companies that embraced "all-you-can-eat" subscriptions in early 2025 are now struggling to pinpoint what they are spending, rein it in, and determine whether their overstretched budgets are producing any return on investment.
In response to this emerging challenge, a new market is rapidly taking shape. Startups, established technology vendors, and a newly formed standards body are actively competing to equip businesses with the necessary tools and common language to effectively monitor their AI-related spending.
“Six months ago, I would have a conversation with a customer and it would be all about ‘What can it do? Is it good enough?’” Alexander Embricos, OpenAI’s head of enterprise, shared with TechCrunch at a recent New York City event. “Our conversations are never about that now. Now the conversations are about, ‘hey, we’re spending so much. What visibility do you have? What auditability do you have? What token controls do you have? What is the efficiency of your models?’”
Against this backdrop, the Linux Foundation revealed its plans this week for the Tokenomics Foundation. This new standards body aims to establish the same rigorous cost discipline for AI tokens that the FinOps framework successfully introduced for cloud expenditures.
“In April and May, I started hearing from companies: ‘Oh my god, we are 3x over our entire 2026 token budget and it’s only April,’” J.R. Storment, executive director of the FinOps Foundation, a Linux Foundation project, told TechCrunch. “We started hearing existential crises, and the whole conversation shifted from tokenmaxxing and ‘go fast’ to ‘we need guardrails, how do we control this?’”
These concerns followed aggressive directives from CEOs who pushed their teams to adopt the most advanced AI models quickly, often with little regard for cost. Models introduced in November, such as Anthropic’s Claude Opus 4.5, OpenAI’s GPT-5.1, and Google’s Gemini 3 Pro, significantly improved agentic tools, which in turn drove a substantial increase in token consumption. In one widely cited case, a company reportedly received a $500 million Claude bill after failing to set usage limits for its employees.
“It’s like the crack-cocaine epidemic,” observed Chris Reed, senior director of IT finance at Priceline, when discussing the challenges of AI pricing. “They let you try it to get you hooked on it, and now you’re kind of beholden to it.”
Vitaly Gordon, CEO of the engineering operations platform Faros AI, recounted a recent conversation with a CTO who confessed: “One of my engineers spent $40,000 on tokens last month, and I genuinely don’t know whether I should stop him or should I go and tell everyone else to be like him.”
A March survey conducted by Faros among 20,000 developers indicated a rise in output, but also a corresponding increase in bugs and necessary rewrites. Similarly, Jellyfish, an engineering management platform, discovered that engineers with the highest token usage were approximately twice as productive as those using AI less frequently, yet they consumed ten times the number of tokens to achieve this.
Nicholas Arcolano, head of research at Jellyfish, told TechCrunch by email that AI spending is soaring largely because of agentic features, with per-developer consumption growing 18.6-fold in just nine months. Taken together, these figures make the productivity case weaker than the scale of the spending might suggest.
“Whether extreme spend pays off comes down to the ultimate business value of shipped code (e.g. revenue), which most companies still can’t measure,” Arcolano stated.
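The Jellyfish figures above can be turned into a quick back-of-the-envelope calculation. The sketch below is illustrative only: it takes the reported ratios (roughly twice the productivity for ten times the tokens) at face value, and the baseline values and the idea of a single "unit of output" are placeholder assumptions, not anything Jellyfish published.

```python
# Illustrative arithmetic only: the 2x and 10x ratios come from the Jellyfish
# finding cited in the article; the baseline values are arbitrary placeholders.

def tokens_per_unit_output(tokens: float, output: float) -> float:
    """Tokens consumed for each unit of output (lower is more efficient)."""
    return tokens / output

# Hypothetical baseline for a light AI user: 1 unit of output for 1 unit of tokens.
light_tokens, light_output = 1.0, 1.0

# Heavy user per the reported ratios: about 2x the output for 10x the tokens.
heavy_tokens, heavy_output = 10.0 * light_tokens, 2.0 * light_output

light_ratio = tokens_per_unit_output(light_tokens, light_output)
heavy_ratio = tokens_per_unit_output(heavy_tokens, heavy_output)

# How many times more tokens the heavy user spends per unit of output.
relative_cost = heavy_ratio / light_ratio
print(f"Heavy users spend {relative_cost:.0f}x the tokens per unit of output")
```

Under these assumptions, the heaviest users consume about five times as many tokens for each unit of output. Whether that is worth it depends on what a unit of output earns, which is the measurement gap Arcolano describes.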
A significant factor contributing to this measurement difficulty is the sheer scale at which AI is currently being deployed.
“Tracking cloud costs is a hundreds-of-millions-of-rows-a-month data problem,” Storment explained. “Tracking token costs is a trillions-of-rows-a-month data problem. You can’t just stick that into whatever spreadsheet or even basic tool. You’ve got to fundamentally rethink your tooling, your specs and your accounting systems to do that.”
At Priceline, Reed is already encountering discrepancies, noting inconsistencies between a vendor’s reported usage and Priceline’s internal data.
“I started my career in telecom expense management, and I’m seeing all the same parallels, from telecom to cloud to AI,” he remarked. “Anytime you introduce something new, it’s ripe for billing errors and audit and optimization opportunities.”
The problem has given rise to a dedicated market. It includes pure-play companies such as Pay-i, which specializes in tracking, measuring, and optimizing the cost and performance of generative AI investments, and Paid, which lets developers monitor costs, measure usage, and bill users based on actual value rather than fixed subscription fees.
Additionally, firms such as Jellyfish, Waydev, and Faros AI are offering AI agent monitoring services designed to demonstrate the return on investment for developer tools. Storment notes that a majority of the 180 vendors within the FinOps Foundation are gravitating towards this particular sector.
Companies with established market presence are also integrating new features to capitalize on this burgeoning market. Ramp recently expanded into AI spend management, while Datadog and New Relic have augmented their offerings with services like cloud cost management, token-level observability, and GPU monitoring. Furthermore, AWS is anticipated to unveil new financial management capabilities tailored for enterprise AI spending at the upcoming FinOps X conference.
Tiffany Luck, a partner at NEA, anticipates that token efficiency and observability functionalities will likely be incorporated at the “harness or app layer.” She highlighted Factory, a startup that develops AI agents for enterprises, which this week launched a model router designed to automatically select the most appropriate model for each specific task.
Gordon predicts that frontier labs and other model providers will adopt OpenRouter-style optimization to direct queries to the most cost-effective models, a trend he says is already visible in enterprise Claude billing statements.
“The financial report for how much you spend on Anthropic, even if you call the Opus model, some of the spend will be on Sonnet or Haiku, because they are smart enough to do it,” Gordon explained. “I think this will become more and more of a thing.”
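The routing idea Gordon and Luck describe can be sketched in a few lines. This is a minimal illustration, not how Factory, OpenRouter, or Anthropic actually route requests: the model names, capability tiers, and prices are invented placeholders, and it assumes each task arrives already labeled with the capability tier it needs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    capability: int                # hypothetical tier: higher handles harder tasks
    usd_per_million_tokens: float  # placeholder price, not a real rate card


# Hypothetical catalogue; names and prices are invented for illustration.
CATALOGUE = [
    Model("small", capability=1, usd_per_million_tokens=1.0),
    Model("medium", capability=2, usd_per_million_tokens=5.0),
    Model("large", capability=3, usd_per_million_tokens=25.0),
]


def route(required_capability: int, catalogue: list[Model] = CATALOGUE) -> Model:
    """Return the cheapest model whose capability tier meets the requirement."""
    eligible = [m for m in catalogue if m.capability >= required_capability]
    if not eligible:
        raise ValueError(f"no model meets capability tier {required_capability}")
    return min(eligible, key=lambda m: m.usd_per_million_tokens)


def estimated_cost(model: Model, tokens: int) -> float:
    """Estimated spend in USD for a given token count on a given model."""
    return model.usd_per_million_tokens * tokens / 1_000_000


# Hypothetical workload: (required capability tier, token count) per task.
tasks = [(1, 200_000), (1, 300_000), (3, 500_000)]

routed_total = sum(estimated_cost(route(tier), tokens) for tier, tokens in tasks)
largest = max(CATALOGUE, key=lambda m: m.capability)
unrouted_total = sum(estimated_cost(largest, tokens) for _, tokens in tasks)

print(f"routed: ${routed_total:.2f}, always-largest: ${unrouted_total:.2f}")
```

In practice the hard part is the step this sketch assumes away: deciding how capable a model each request actually needs. The sketch only shows why routing easy tasks to cheaper models changes the bill.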
However, these tools are being built without a common language or standard definitions for token costs, output metrics, or methods of comparing spending across vendors. That is the gap the Tokenomics Foundation aims to fill.
The Foundation is working to establish a canonical definition and framework for “tokenomics,” along with open standards, specifications, and metrics for AI token usage and billing. It also plans to introduce new metrics for AI economics, such as cost-per-intelligence or tokens-per-watt, and to define metrics for token factory effectiveness and consumption efficiency. The group plans to launch formally in July and will announce additional members at the FinOps X conference next week.
“Token economics is fundamentally more abstract and opaque than anything we’ve managed at this scale before,” Nishant Gupta, chief availability officer at Salesforce, stated. “It requires a different operational muscle than the one the industry built for cloud.”
Meanwhile, demand shows no sign of slowing: Goldman Sachs projects that global token usage will grow 24-fold by 2030. Companies already over budget need solutions now, yet the Tokenomics Foundation’s first deliverables are still several months away.
“Maybe we created a steam engine, but we still haven’t figured out the assembly line,” Gordon mused.
According to Arcolano, the most prudent approach involves broad, moderate adoption of AI technologies.
“The best ROI comes from moving the broad middle from low to moderate usage, not pushing heavy users higher,” he concluded.