Anthropic’s Claude Sonnet 4.5 pushes 30-hour autonomous coding

October 21, 2025

editorial_staff

Anthropic has unveiled Claude Sonnet 4.5, a new AI model built to act more independently and code at scale. In an internal test, the system ran by itself for 30 hours to build a chat app similar to Slack or Microsoft Teams, producing about 11,000 lines of code before stopping when the job was done. That runtime is more than four times that of Anthropic’s earlier Opus 4 model, which drew attention in May for working autonomously for seven hours.


The company is casting Sonnet 4.5 as its strongest model yet for real-world agents, software development, and hands-on computer use. It builds on Anthropic’s Computer Use capability introduced nearly a year ago, which lets models navigate and operate apps. Anthropic says Sonnet 4.5 is especially effective in cybersecurity, financial services, and research. Beta testers include Canva, which reported the model helped with long, complex tasks across its codebase, product features, and research work.


The release lands amid a fast-moving race to make AI assistants that handle everyday chores for consumers and heavy lifting for teams. Tech giants and startups alike are shipping steady upgrades designed to research topics, plan schedules, generate slides, help write and review code, and analyze spreadsheets. Just days earlier, OpenAI announced Pulse, a new ChatGPT feature meant to fit into users’ daily routines by surfacing timely information.

To speed up agent development, Anthropic is pairing the model with access to virtual machines, memory, context management, and multi-agent support. Together, these tools aim to help developers design agents that can stay on task longer, coordinate steps, and work across different apps and data without constant human input.


Anthropic positions Claude Sonnet 4.5 as a leading option for anyone building production-ready AI agents or coding tools. The pitch is clear: longer autonomous runs, practical computer control, and a toolkit aimed at turning experimental demos into reliable, day-to-day software.