Anthropic Unveils Cost-Effective, Powerful Opus 4.5 Model
Anthropic has introduced its latest model, Opus 4.5, aiming to improve both everyday user interactions and coding performance. The release brings advancements that sharpen its competitiveness, particularly against OpenAI’s recent offerings.
Key Features of Opus 4.5
The most notable upgrade accompanying Opus 4.5 is an improved user experience across Anthropic’s web, mobile, and desktop apps. Conversations are now far less likely to end abruptly, a common frustration with previous models.
Improved Conversation Continuity
- Claude can now manage longer conversations without stopping unexpectedly.
- Each conversation now involves a behind-the-scenes summarization process, retaining key points while discarding less important details.
- This upgrade is applicable not just to Opus 4.5 but also to existing Claude models.
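The compaction flow described above can be sketched roughly as follows. This is a minimal illustration only: the `summarize()` helper, the token counter, and the budget thresholds are all placeholder assumptions, not Anthropic’s actual server-side implementation.

```python
def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: one token per whitespace word.
    return len(text.split())

def summarize(messages: list[str]) -> str:
    # Placeholder summarizer: a real system would call a model here.
    # We keep only the first sentence of each message as its "key point".
    points = [m.split(".")[0] for m in messages]
    return "Summary of earlier turns: " + "; ".join(points) + "."

def compact(history: list[str], budget: int = 50, keep_recent: int = 2) -> list[str]:
    """Collapse older turns into one summary once the history exceeds `budget` tokens,
    keeping the most recent `keep_recent` turns verbatim."""
    total = sum(count_tokens(m) for m in history)
    if total <= budget or len(history) <= keep_recent:
        return history  # under budget: nothing to compact
    older, recent = history[:-keep_recent], history[-keep_recent:]
    return [summarize(older)] + recent
```

In this sketch, a conversation that grows past the budget is shortened to a single summary line plus its latest turns, so the model keeps the gist of the discussion without carrying every earlier message forward.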
Performance Metrics
Opus 4.5 scored 80.9% on the SWE-bench Verified benchmark, the highest result reported for any model to date, surpassing OpenAI’s GPT-5.1-Codex-Max at 77.9% and Google’s Gemini 3 Pro at 76.2%.
Benchmark Comparisons
| Model | SWE-bench Verified (%) | Noted strengths |
|---|---|---|
| Opus 4.5 | 80.9 | Agentic coding, tool use |
| GPT-5.1-Codex-Max | 77.9 | N/A |
| Gemini 3 Pro | 76.2 | Visual reasoning |
Although Opus 4.5 excels at coding benchmarks, it does not lead everywhere: on visual reasoning tasks, specifically the MMMU benchmark, it trails GPT-5.1.
Conclusion
Opus 4.5 marks a significant step forward for Anthropic, improving both everyday usability and coding accuracy. As users demand more robust and reliable AI tools, it positions itself as a serious contender in a fast-moving field.