Anthropic claims that Claude Opus 4 is the best coding model in the world and that it operated autonomously for 7 hours during customer tests.
Claude Opus 4 is Anthropic's most powerful AI model to date, according to the company's announcement, and is capable of continuously working on long-running tasks for "several hours," The Verge reports.
Tests showed that Opus 4 outperformed Google's Gemini 2.5 Pro, OpenAI's o3 reasoning model, and GPT-4.1 both in coding tasks and in using "tools" such as web search.
Claude Sonnet 4 is a more accessible and performance-oriented model that is better suited for general tasks and replaces the 3.7 Sonnet model released in February. Anthropic claims that Sonnet 4 delivers “excellent coding and reasoning” along with more accurate answers.
The company adds that both models are 65% less likely than 3.7 Sonnet to use shortcuts and loopholes to complete tasks, and that they are better at preserving key information during long-running tasks when developers give Claude local access to files.
Source: The Verge
A new feature for both Claude 4 models is “thinking summaries,” which condense the chatbots’ reasoning process into easy-to-understand insights. Also launching in beta is an “extended thinking” feature that lets users switch the models between modes for reasoning and for tool use, to improve performance and response accuracy.
Claude Opus 4 and Sonnet 4 are available on the Anthropic API, Amazon Bedrock, and Google Cloud’s Vertex AI platform, and both models are included in Claude’s paid plans along with the “extended thinking” beta feature. Free users only have access to Claude Sonnet 4 for now.
In addition to the new models, Anthropic’s Claude Code command-line agent tool is now publicly available after a limited preview in February. Anthropic also says it is moving to “more frequent model updates” as it tries to keep up with competition from OpenAI, Google, and Meta.