As advanced agentic capabilities become a fundamental expectation for foundation model developers, Anthropic has unveiled Claude Sonnet 5. This new release represents a more potent and autonomous iteration of the company’s popular mid-tier model.
In a recent blog post, Anthropic stated, “It can make plans, use tools like browsers and terminals, and run autonomously at a level that, just a few months ago, required larger and more expensive models.” This signifies a substantial leap in making sophisticated autonomous functions more accessible.
The release follows similar announcements from OpenAI and Google. OpenAI’s GPT-5.6 Sol, launched in preview last week, is also touted as its most agentic model, enabling users to distribute complex tasks across multiple subagents for extended autonomous operation. Similarly, Google positioned its Gemini 3.5 Flash, released in May, as a transition from a conversational chatbot to an agentic tool capable of planning, constructing, and refining real-world tasks with minimal human intervention.
Sonnet 5’s introduction reinforces the notion that agentic functionality is now the expected baseline across all price points. The competitive question is no longer which model can perform agentic work, but how efficiently and reliably that work can be executed without constant human oversight.
Anthropic promises that Sonnet 5 delivers performance comparable to its premium Opus 4.8 model, but at a significantly reduced cost. Effective immediately, Claude Sonnet 5 is the default model on both the free and Pro subscription plans, making it available to all subscribers.
For its launch period through August 31, Sonnet 5 is priced at $2 per million input tokens and $10 per million output tokens. Following this, the price will adjust to $3 per million input tokens while output tokens remain at $10 per million. This pricing positions Sonnet 5 as a more economical option compared to Opus 4.8, OpenAI’s GPT-5.5, and Gemini 3.1 Pro, though it remains slightly more expensive than Gemini 3.5 Flash.
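To make the price change concrete, here is a minimal sketch of the arithmetic in Python. The per-million-token rates are the ones quoted above; the workload of 5 million input tokens and 1 million output tokens is a made-up example, and the function and variable names are illustrative, not part of any Anthropic API.

```python
from decimal import Decimal

# USD per million tokens, as quoted in the article.
LAUNCH_PRICES = {"input": Decimal("2"), "output": Decimal("10")}    # through August 31
STANDARD_PRICES = {"input": Decimal("3"), "output": Decimal("10")}  # after the launch period

TOKENS_PER_MILLION = Decimal(1_000_000)


def cost_usd(input_tokens: int, output_tokens: int, prices: dict) -> Decimal:
    """Return the cost in USD of a workload at the given per-million-token prices."""
    total = (Decimal(input_tokens) * prices["input"]
             + Decimal(output_tokens) * prices["output"])
    return total / TOKENS_PER_MILLION


# Hypothetical workload: 5M input tokens and 1M output tokens.
launch_cost = cost_usd(5_000_000, 1_000_000, LAUNCH_PRICES)
standard_cost = cost_usd(5_000_000, 1_000_000, STANDARD_PRICES)

print(f"Launch pricing:   ${launch_cost}")
print(f"Standard pricing: ${standard_cost}")
```

Because only the input rate changes, the size of the increase depends on how input-heavy a workload is; a job dominated by output tokens would see almost no difference.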
Anthropic highlights that the new model also boasts considerable enhancements over its predecessor, Sonnet 4.6 (released in February), particularly in agentic performance areas such as reasoning, tool utilization, software coding, and general knowledge work.
Illustrating these improvements, Sonnet 5 achieved a 63.2% score on an agentic coding benchmark, placing it ahead of Sonnet 4.6’s 58.1% but slightly behind Opus 4.8’s 69.2%. Intriguingly, on a knowledge work benchmark, Sonnet 5 marginally outperformed Opus 4.8, a model renowned for its prowess in tackling complex problems requiring nuanced judgment and extensive research.
“Opus 4.8 is still the model of choice for higher accuracy on these tasks, but Sonnet 5 provides developers with lower-priced options that are of much higher quality than what was previously available,” Anthropic clarified. The company added, “Between Sonnet 5 and Opus 4.8, users can adjust the effort level to find the right balance of cost and performance.”
Feedback from testers, as detailed in the blog post, indicates that Sonnet 5 excels at completing intricate tasks where earlier model versions would have faltered, and notably, it “checks its own output without explicitly being asked.”
Daniel Shepard, a senior engineer at Zapier, provided a compelling example: “We handed Claude Sonnet 5 a two-part job—update Salesforce account tiers, send a launch announcement to enterprise contacts—and it finished end to end. That used to stall halfway. For day-to-day automation, it’s a no-brainer.”
Regarding safety, Sonnet 5 demonstrates a reduced incidence of “undesirable behaviors,” such as complicity in misuse and deception, compared to its predecessor, making it a safer choice for agentic applications. It exhibits enhanced capabilities in refusing malicious requests and thwarting prompt injection attacks. Furthermore, it shows lower rates of hallucination and sycophantic behavior than Sonnet 4.6.
However, Sonnet 5 does not yet match the safety levels of Opus 4.8 and Claude Mythos Preview with respect to misaligned behavior. On cybersecurity risk, the blog post states, “Evaluations also show that it has a much lower ability to perform dangerous cybersecurity tasks than our current Opus models.”
Fabian Hedin, co-founder of Lovable, affirmed Sonnet 5’s safety, stating that the model “refuses unsafe requests cleanly and consistently.”
Hedin further emphasized the broader implications, remarking, “At Lovable, we’re putting powerful tools in the hands of millions of builders. A model that knows when to say no is just as important as one that knows how to build.”
The Editorial Staff at AIChief is a team of professional content writers with extensive experience in AI and marketing. Founded in 2025, AIChief has quickly grown into the largest free AI resource hub in the industry.
