Apr 23

Prepare for AI's Money Crunch.


Originally reported by The Verge

The era of readily available, free artificial intelligence is drawing to a close, marked by the introduction of advertisements, stricter rate limits, restricted features, and rising prices.

This shift became starkly apparent earlier this month when millions of users of OpenClaw, a widely popular AI agent tool that had rapidly gained traction across the global tech industry, encountered significant restrictions imposed by Anthropic.

Anthropic, mirroring the challenges faced by other leading AI laboratories, was under considerable pressure to alleviate the burden on its infrastructure and commence generating revenue. Consequently, users seeking to leverage its Claude AI for their popular agents are now required to pay a substantial fee for continued access.

Boris Cherny, head of Claude Code, elaborated on this decision on X, stating, “Our subscriptions weren’t built for the usage patterns of these third-party tools. We want to be intentional in managing our growth to continue to serve our customers sustainably long-term. This change is a step toward that.”

This announcement underscores a broader trend. Investors have committed hundreds of billions of dollars to companies like OpenAI and Anthropic to facilitate their scaling and computing infrastructure development, and now they anticipate returns. After years of providing advanced AI systems at minimal or no cost, the financial implications are materializing, and end-users are beginning to experience the impact.

Over the past few years, most prominent AI labs have introduced new subscription tiers designed to cater to power users. Both OpenAI and Anthropic have revised their pricing structures for enterprise clients, with OpenAI also integrating in-platform advertisements, and Anthropic, as noted, restricting third-party tools.

In many respects, this narrative echoes historical patterns, particularly the tech boom of the 2010s. Venture capitalists heavily subsidized rapid growth in various sectors, including ride-hailing, e-commerce, and food and grocery delivery. Once these companies solidified their market position, they typically increased prices, diversified revenue streams, and delivered returns to investors—or they failed spectacularly.

However, AI companies have consumed investor capital at an unprecedented rate compared to any other sector in recent history. They have initiated the construction of data centers globally, pledging billions of dollars with promises of superior models, reduced costs, and AI accessibility for all. Even stemming the current rate of losses presents a significant challenge, let alone achieving the substantial profits investors expect. Will Sommer, a senior director analyst at Gartner specializing in economic forecasting and quantitative modeling, remarked, “When you sink trillions of dollars into data centers, you’re going to expect a return.”

“When you sink trillions of dollars into data centers, you’re going to expect a return.”

Mark Riedl, a professor at the Georgia Tech School of Interactive Computing, posed the question: “Is the era of basically free or close-to-free AI kind of coming to an end here? It’s too soon to say for certain, but there are some signs.”

Gartner's Sommer conducts extensive research into long-term economic market trends related to generative AI, including quantifying the financial stakes involved. He estimates that between 2024 and 2029, capital investment in AI data centers will reach approximately $6.3 trillion, which he describes as a “massive amount of money.”

To avert a write-down of these substantial assets, major AI model providers would ideally need to generate a return on invested capital (ROIC) of around 25 percent, according to Sommer—a figure comparable to what Amazon, Microsoft, and Google typically achieve on their overall capital investments. Conversely, if returns fall below 12 percent, institutional capital tends to disengage, seeking more lucrative opportunities elsewhere. A return below 7 percent, Sommer warns, signifies "write-down territory," which he deems “an unmitigated disaster for all of the investors in this technology.”

To achieve even that minimal 7 percent ROIC, Gartner forecasts that large AI companies would collectively need to generate nearly $7 trillion in AI-driven revenue through 2029, equating to approximately $2 trillion per year by the end of that period. For "historic returns," providers would need to earn nearly $8.2 trillion within the same timeframe.
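The thresholds Sommer describes reduce to simple arithmetic: a target ROIC applied to the projected capital base implies a required annual return. A minimal sketch of that calculation, using only the Gartner figures quoted above (the helper name and the per-band labels are my own):

```python
# Back-of-envelope sketch of the Gartner figures quoted above.
# Dollar amounts are in trillions; the ROIC bands are those Sommer describes.

CAPEX = 6.3  # projected AI data-center capital investment, 2024-2029 ($T)

def annual_return_required(capex_trillions: float, roic: float) -> float:
    """Annual profit needed to hit a given return on invested capital."""
    return capex_trillions * roic

bands = [
    ("historic returns (Amazon/Microsoft/Google-level)", 0.25),
    ("floor before institutional capital disengages", 0.12),
    ("write-down territory below", 0.07),
]
for label, roic in bands:
    print(f"{label}: ${annual_return_required(CAPEX, roic):.3f}T per year")
```

Note this sketch treats the $6.3 trillion as a single lump of invested capital; the article's cumulative revenue figures ($7–8.2 trillion through 2029) additionally fold in operating costs and the multi-year ramp, which is why they are far larger than the implied annual return alone.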

OpenAI, as reported in February, has already committed $600 billion in spending through 2030, a figure Sommer notes is a “massive step down” from its initial plan of $1.4 trillion. Based on OpenAI’s revenue projections and potential compound annual growth, Sommer’s best-case scenario predicts that the lab would still only achieve a fraction of the total expenditure required to meet the 7 percent ROIC target.

So, how do model providers like OpenAI generate this revenue? Primarily by selling access to what are known as "tokens." A token represents a fundamental unit of data input that an AI model can comprehend and process, encompassing text, images, audio, or other modalities. Typically, one token corresponds to about four characters in the English language; for example, the word “bathroom” would likely be processed as two tokens. An average English paragraph generally comprises around 100 tokens, while a 1,500-word essay might require about 2,050 tokens, as per an OpenAI estimate.

To satisfy investors’ revenue expectations, providers would need to process a “mind-bending” quantity of tokens, Sommer stated.

Current processing volumes are already substantial. Google, for instance, announced in October that it was processing 1.3 quadrillion tokens a month. Aggregating estimates from all providers, Sommer suggests a range of 100 to 200 quadrillion tokens annually. However, to reach the $2 trillion in annual revenue Gartner calculated, providers would conservatively need to generate on the order of 10 sextillion tokens per year. (For perspective, a quadrillion has 15 zeros, while a sextillion has 21.) Even assuming a very generous profit margin of 10 percent per token, this implies a 50,000–100,000x increase in token consumption between now and 2030.
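The 50,000–100,000x figure falls directly out of the numbers above: it is simply the 10-sextillion-token target divided by Sommer's 100–200 quadrillion estimate of current annual volume. A quick sanity check:

```python
# Sanity check on the scale-up factor implied by Sommer's estimates.
QUADRILLION = 10 ** 15
SEXTILLION = 10 ** 21

current_low = 100 * QUADRILLION   # low end of estimated tokens/year today
current_high = 200 * QUADRILLION  # high end
target = 10 * SEXTILLION          # tokens/year implied by $2T annual revenue

print(f"{target // current_high:,}x to {target // current_low:,}x increase needed")
```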

To hit investors’ revenue expectations, providers would need to process a “mind-bending” number of tokens.

For now, still chasing more data centers and constrained by computing power, companies lack the capacity to process such enormous volumes of tokens. Even if they could, they would likely be operating at a loss. Sommer estimates that if only direct infrastructure and electricity costs are considered, “every company is making very reasonable margins on every token.” Yet that margin likely shrinks or disappears with newer, more token-intensive models, and it is completely absorbed by indirect operational costs, such as expanding computing infrastructure and the “ungodly” expense of continuously training the next generation of models.

“As soon as you then add all of the infrastructure that needs to be built for the next generation of model, and you look at how these models are going to scale, it becomes increasingly untenable,” Sommer explained.

Sommer forecasts that many companies “won’t be able to sustain their burn rate,” predicting that market consolidation is virtually inevitable, with no more than two large language model providers likely to survive in any given regional market. Furthermore, the era where nearly every service offers a generous unpaid tier is unlikely to persist.

Jay Madheswaran, cofounder of legal AI startup Eve, a client of both OpenAI and Anthropic, told The Verge, “For the [labs] that have a lot of users that were free, I think the question was never really if you’d monetize the free tier but it was when, and how badly do you do it.”

Even if the financial equations can be balanced, cultivating customer loyalty presents its own complexities. Top AI labs are in a constant race, leapfrogging one another with new model releases, feature rollouts, strategic shifts, and hiring announcements. It proves challenging to maintain a dominant position long enough to capture a significant market share, especially given that engineers and developers are known for frequently switching between models, a process that is remarkably straightforward.

Consequently, labs are increasingly emphasizing the importance of locking users into their specific platforms and tools. Anthropic, which primarily serves enterprise clients, has been intensely focused on its coding initiatives. OpenAI has recently committed to mirroring Anthropic’s emphasis on coding and enterprise solutions, as both companies are reportedly vying to launch an IPO by the end of 2026.

For now, this intense competition is proving beneficial for end-users. Soham Mazumdar, cofounder and CEO of Wisdom AI, observed, “It’s an arms race where you cannot let up at all because the switching cost is zero,” adding, “As a common man, I’m going to be the winner longer-term.”

In the nascent stages of AI development, the majority of computing costs were allocated to training initial models, while inference (the process of performing tasks) was comparatively inexpensive. However, as models have advanced and systems have incorporated more features, inference has become significantly more resource-intensive. AI agents, designed to autonomously complete complex, multi-step tasks without constant human intervention, now consume vastly more tokens than the basic chatbot models of a few years ago.

Reasoning models, which increasingly power these AI agents, are particularly expensive on the inference side, according to Georgia Tech’s Riedl. These agents—such as the popular open-source platform OpenClaw—are typically more efficient and effective than non-reasoning counterparts, but they also expend considerably more tokens performing behind-the-scenes computations that are invisible to the end-user. This might involve "thinking through" numerous potential pathways, deploying sub-agents to handle parts of a task, or verifying the accuracy of various steps in a process.

Riedl explained, “You put in your one-sentence prompt… and it’ll talk out loud to itself for thousands and thousands of tokens, thousands and thousands of words, maybe even tens of thousands when you get into coding.” He added, “If you have thousands or millions of people using these things every single day, the inference costs of just the users generating tons and tons of tokens all the time really outweighs the training side of things.” If model providers were consistently profiting from these tokens and possessed ample computing capacity, this wouldn't be an issue; however, under current conditions, it represents a substantial strain.
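Riedl's point about inference outweighing training is, at heart, a scaling argument: a training bill is paid once, while inference spend grows with users and tokens every day. A toy sketch makes the crossover visible; every number here is invented purely for illustration and reflects no real provider's costs:

```python
# Illustrative sketch of Riedl's argument: with enough daily users, cumulative
# inference spend overtakes a one-time training cost. All figures are invented.

TRAINING_COST = 100_000_000   # hypothetical one-time training bill, $
COST_PER_M_TOKENS = 2.0       # hypothetical inference cost, $ per 1M tokens

def daily_inference_cost(users: int, tokens_per_user_per_day: int) -> float:
    """Total inference spend per day across the whole user base."""
    return users * tokens_per_user_per_day / 1_000_000 * COST_PER_M_TOKENS

daily = daily_inference_cost(5_000_000, 50_000)  # 5M users, 50k tokens each/day
days_to_overtake = TRAINING_COST / daily
print(f"${daily:,.0f}/day; inference exceeds training cost after {days_to_overtake:.0f} days")
```

The qualitative takeaway matches the article: once agents push per-user token counts into the tens of thousands per day, recurring inference cost dominates.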

“The use cases have exploded, and we’re out of capacity.”

Aaron Levie, CEO of Box, commented, “Anybody who was building agents in the past couple of years sort of saw this coming,” adding, “The use cases have exploded, and we’re out of capacity.”

Leading AI labs have recently revised their policies regarding API usage and third-party tools—for instance, Anthropic essentially prohibiting the use of OpenClaw unless subscribers pay an additional fee—primarily due to the increased strain. Riedl noted the behavior of these background processes: “You’ve got these tools that are basically just sitting as background processors on everyone’s laptops and desktops, just continuously waking themselves up, generating some tokens, doing some stuff, and putting themselves back to sleep.”

Furthermore, regardless of the specific application of a reasoning-model-powered AI agent, there will likely be instances of "wasted tokens." This includes times when an AI model pursues a non-useful path before backtracking, performs checks without initiating changes, or even pauses to generate a poem. In an environment where labs may be losing money on certain tokens and companies face compute limitations, the industry is striving to reduce wasted tokens and develop more focused and targeted models.

While making models more token-efficient could benefit both paying customers and AI labs, it ironically conflicts with the overarching goal of significantly increasing token usage. As Gartner's Sommer succinctly puts it, although pricing models may evolve considerably in the future, there is currently a “narrow space on the treadmill” balancing short-term and long-term objectives.

Cumulatively, these factors place major AI companies at a critical juncture: they have attracted a vast user base by offering free access, and now they must retain these users while implementing significantly higher charges. Riedl observed, “On one hand, they want to see more tokens being generated but they have to either suck up the costs, which they can sort of do as long as venture capital is flowing, or pass the costs back on to [customers]… Maybe the economics are a little upside down right now.”

Currently, OpenAI and Anthropic are constantly weighing traditional flat-rate subscription plans against metered fee structures. Both companies' enterprise plans are now token-based, a response to the "uneven" usage patterns described by Andrew Filev, founder of Zencoder: one person might use a service infrequently for brief periods, while another runs five agents continuously in the background.
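The flat-versus-metered tension Filev describes is easy to see numerically: under a flat rate, light users subsidize heavy ones, and metered pricing reverses that. A toy comparison, where every price and usage figure is invented for illustration and reflects no provider's actual pricing:

```python
# Illustrative flat-rate vs. metered billing for the two usage profiles
# described above. All prices and token counts are invented.

FLAT_RATE = 200.0         # hypothetical $/month subscription
PER_MILLION_TOKENS = 5.0  # hypothetical metered rate, $ per 1M tokens

def monthly_cost_metered(tokens: int) -> float:
    """What a metered plan would bill for a month's token usage."""
    return tokens / 1_000_000 * PER_MILLION_TOKENS

profiles = {
    "light user": 2_000_000,            # occasional short sessions
    "agent runner": 5 * 400_000_000,    # five agents running continuously
}
for name, tokens in profiles.items():
    metered = monthly_cost_metered(tokens)
    cheaper = "metered" if metered < FLAT_RATE else "flat"
    print(f"{name}: metered ${metered:,.2f} vs flat ${FLAT_RATE:,.2f} "
          f"({cheaper} plan is cheaper for this user)")
```

On these made-up numbers, the light user is far cheaper on metered billing while the agent runner racks up a bill fifty times the flat rate, which is exactly the asymmetry pushing providers toward token-based enterprise plans.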

For consumer-facing chatbots, some monetization strategies involve advertising.

OpenAI recently integrated ads into ChatGPT, appearing as a separate sidebar, and is reportedly developing tools to monitor their effectiveness. (Anthropic famously criticized this move in its 2026 Super Bowl advertisements.)

However, for companies that build tools on top of foundational models like GPT-5 or Claude Opus, the cost of tokens is increasing, with the additional expense largely being passed down to their customers. Multiple tech companies interviewed by The Verge indicated that they, or their clients, are adjusting their strategies to mitigate the impact of new pricing. Some are contemplating a complete or partial migration to open-source models, while others are dedicating substantial time and resources to evaluate how high-end models perform on specific tasks compared to more affordable alternatives.

David DeSanto, CEO of software company Anaconda, recently concluded a five-week global trip engaging with customers. He reported that many are transitioning to self-hosting AI models—deploying their own within Amazon Bedrock or Google’s Vertex AI to gain greater control over the supply chain—or shifting to open-source or open-weight models for a significant portion of their needs, given the recent improvements in benchmarks for many such models. Some companies also express concerns about the security of transmitting intellectual property to commercial frontier labs, opting to use ChatGPT or Claude models exclusively for “mission-critical applications,” he noted.

DeSanto elaborated, “Everyone I spoke to had some version of this problem — their token usage has gone up, so their usage-based billing cost has gone up, or the tier they were on no longer has the same cap, and now they’re having to go to a more expensive tier to try to keep the same amount of usage per month as part of their flat rate.”

Eve, a company providing software to plaintiff lawyers, constantly navigates the balance between quality and token costs, according to Madheswaran—especially as Eve's token usage has escalated by 100x year-over-year. Consequently, it frequently switches between open-source models and various offerings from Anthropic and OpenAI.

However, even a one percent decline in output quality negatively impacts Eve’s customers “quite significantly,” Madheswaran emphasized, which is why Eve invests considerable internal resources into meticulously tracking model quality. The company typically finds itself utilizing the newer, more expensive reasoning models for approximately 25–30% of its operations, distributing the remaining usage among other options.

Editorial Staff, Editor

The Editorial Staff at AIChief is a team of professional content writers with extensive experience in AI and marketing. Founded in 2025, AIChief has quickly grown into the largest free AI resource hub in the industry.
