Claude Sonnet 4.5
Claude Sonnet 4.5 is a coding model from Anthropic that scored 77.2% on SWE-bench Verified and 61.4% on OSWorld (computer use), sustains agentic coding sessions of 30+ hours, and delivers substantial gains across coding, reasoning, math, and domain-specific expertise.
import { streamText } from 'ai'
const result = streamText({ model: 'anthropic/claude-sonnet-4.5', prompt: 'Why is the sky blue?' })
Playground
Try out Claude Sonnet 4.5 by Anthropic. Usage is billed to your team at API rates. Free users (those who haven't made a payment) get $5 of credits every 30 days.
Providers
Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.
| Provider |
|---|
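As a rough sketch of how a copied provider slug might be applied: the AI SDK accepts gateway-specific settings under `providerOptions.gateway`. The `order` field, its assumed semantics, and the slugs `'anthropic'` and `'bedrock'` are illustrative assumptions, so confirm them against the AI Gateway docs. The helper `buildGatewayRequest` is hypothetical and only assembles an options object, so it runs without the `ai` package or network access.

```typescript
// Hypothetical helper: assembles the options you would pass to streamText().
// The `order` field under providerOptions.gateway is an assumption drawn from
// the AI Gateway docs; verify the exact shape before relying on it.
type GatewayRequest = {
  model: string
  prompt: string
  providerOptions: { gateway: { order: string[] } }
}

function buildGatewayRequest(prompt: string, preferredProviders: string[]): GatewayRequest {
  if (preferredProviders.length === 0) {
    throw new Error('Provide at least one provider slug')
  }
  return {
    model: 'anthropic/claude-sonnet-4.5',
    prompt,
    // Assumed semantics: providers are attempted in the listed order.
    providerOptions: { gateway: { order: [...preferredProviders] } },
  }
}

// Placeholder slugs; copy the real ones from the provider table.
const gatewayRequest = buildGatewayRequest('Why is the sky blue?', ['anthropic', 'bedrock'])
console.log(JSON.stringify(gatewayRequest.providerOptions))
```

With the `ai` package installed, the resulting object could be passed straight to `streamText(gatewayRequest)`.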
P50 throughput on live AI Gateway traffic, in tokens per second (TPS). Visit the docs for more info.
P50 time to first token (TTFT) on live AI Gateway traffic, in milliseconds. View the docs for more info.
Direct request success rate on AI Gateway and per-provider. Visit the docs for more info.
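To make these metric definitions concrete, the sketch below computes a P50 (median) TTFT, a P50 throughput, and a success rate from a few made-up request records. The `RequestSample` shape and the sample values are illustrative assumptions. They are not AI Gateway data, and this is not necessarily how the gateway aggregates its traffic.

```typescript
// Illustrative record shape; not an AI Gateway type.
type RequestSample = {
  ttftMs: number        // time to first token, in milliseconds
  outputTokens: number  // tokens generated
  durationSec: number   // generation time, in seconds
  ok: boolean           // whether the request succeeded
}

// P50 is the median of the observed values.
function p50(values: number[]): number {
  if (values.length === 0) throw new Error('no samples')
  const sorted = [...values].sort((a, b) => a - b)
  const mid = Math.floor(sorted.length / 2)
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2
}

function summarize(samples: RequestSample[]) {
  // Latency and throughput are computed over successful requests only.
  const succeeded = samples.filter((s) => s.ok)
  return {
    p50TtftMs: p50(succeeded.map((s) => s.ttftMs)),
    p50Tps: p50(succeeded.map((s) => s.outputTokens / s.durationSec)),
    successRate: succeeded.length / samples.length,
  }
}

// Made-up samples for demonstration only.
const samples: RequestSample[] = [
  { ttftMs: 800, outputTokens: 600, durationSec: 10, ok: true },
  { ttftMs: 1200, outputTokens: 500, durationSec: 10, ok: true },
  { ttftMs: 1000, outputTokens: 700, durationSec: 10, ok: true },
  { ttftMs: 0, outputTokens: 0, durationSec: 0, ok: false },
]
const summary = summarize(samples)
console.log(summary)
```

A median is used here because a few very slow requests skew it less than they would skew a mean.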
About Claude Sonnet 4.5
Claude Sonnet 4.5 launched on September 29, 2025. Its OSWorld result backed Anthropic's computer use claims directly: 61.4%, up from Sonnet 4's 42.2% four months earlier. The model scored 77.2% on SWE-bench Verified. Separately, Anthropic reported that it maintained focus for 30+ hours on complex multi-step tasks, a duration that changes what's architecturally feasible for autonomous engineering work.
Domain expert evaluation reinforced the benchmark numbers. Finance, law, medicine, and STEM specialists found substantially better domain-specific knowledge and reasoning compared to older models, including Opus 4.1. Devin reported an 18% increase in planning performance and a 12% increase in end-to-end scores, the biggest jump since Claude Sonnet 3.6. Cursor, GitHub Copilot, and Figma Make reported significant gains in their specific domains. Alongside this model, Claude Code shipped checkpoints and rollback, a native VS Code extension, and a refreshed terminal interface.
At release, Sonnet 4.5 included substantial alignment improvements over prior Claude models: reductions in sycophancy, deception, power-seeking, and the tendency to encourage delusional thinking. Defenses against prompt injection in computer use and agentic settings also improved considerably. Anthropic released the model under ASL-3 (AI Safety Level 3) protections, a level it first activated for Claude Opus 4, with CBRN (chemical, biological, radiological, and nuclear) classifiers active.
The Claude Agent SDK launched alongside Sonnet 4.5, giving you access to the same infrastructure that powers Claude Code: memory management, permission systems, and subagent coordination for building custom agents.
What To Consider When Choosing a Provider
- Configuration: Sonnet 4.5 runs under ASL-3 (AI Safety Level 3) safeguards, which include classifiers that screen inputs and outputs for potentially dangerous content. These classifiers may occasionally flag normal content. Anthropic reports reducing false positive rates by a factor of 10 since the classifiers were first deployed.
- Zero Data Retention: AI Gateway supports Zero Data Retention for this model via direct gateway requests (BYOK is not included). To configure this, check the documentation.
- Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
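The authentication point above can be sketched as a small credential lookup. This is a minimal illustration, not the gateway's actual resolution logic. The environment variable names `AI_GATEWAY_API_KEY` and `VERCEL_OIDC_TOKEN`, the precedence between them, and the helper `resolveGatewayCredential` are assumptions to check against the documentation.

```typescript
// Hypothetical helper showing the two credential types the gateway accepts.
// Variable names and precedence are assumptions; confirm them in the docs.
type GatewayCredential =
  | { kind: 'api-key'; token: string }
  | { kind: 'oidc'; token: string }

function resolveGatewayCredential(env: Record<string, string | undefined>): GatewayCredential {
  // Prefer an explicit API key when one is configured (illustrative choice).
  const apiKey = env['AI_GATEWAY_API_KEY']
  if (apiKey) return { kind: 'api-key', token: apiKey }

  // Otherwise fall back to an OIDC token, if present.
  const oidcToken = env['VERCEL_OIDC_TOKEN']
  if (oidcToken) return { kind: 'oidc', token: oidcToken }

  throw new Error('No gateway credential found: set an API key or an OIDC token')
}

// Example with an in-memory environment (placeholder value, not a real key).
const credential = resolveGatewayCredential({ AI_GATEWAY_API_KEY: 'example-key' })
console.log(credential.kind)
```

Either way, only a gateway credential is involved; no per-provider keys appear in the lookup.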
When to Use Claude Sonnet 4.5
Best For
- Computer use and real-world browser/software automation: Scored 61.4% on OSWorld at release, up from Sonnet 4's 42.2%
- Extended autonomous coding sessions: Documented 30+ hour capability for complex multi-step engineering tasks
- Complex agent workflows: Anthropic explicitly positioned it for agent workloads at release
- Domain-specific applications in finance, law, medicine, and STEM: Expert evaluation showed substantial gains in domain knowledge and reasoning compared to Opus 4.1
- Production deployments requiring strong alignment properties: Showed reduced sycophancy and deception at release compared to earlier Claude models
Consider Alternatives When
- Primary cost constraint: Haiku 4.5 may offer sufficient capability-per-cost for lighter workloads
- Simple latency-sensitive tasks: Sonnet 4.5's capability depth comes with a higher per-token cost than lighter models, which may be sufficient for simple requests
- Sonnet-tier large context: Check whether Claude Sonnet 4.6 offers both the 1M-token context window and Sonnet pricing
- Earlier-model parity: If an earlier model already handles your specific computer use or coding tasks equally well, upgrading may not be necessary
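The trade-offs in the two lists above could be encoded as a simple routing rule. The sketch below is one hypothetical way to do that: the `Workload` fields, the decision order, and the `'anthropic/claude-haiku-4.5'` slug are assumptions for illustration, not guidance from Anthropic or AI Gateway.

```typescript
// Hypothetical routing helper; the fields and the Haiku slug are assumptions.
type Workload = {
  needsComputerUse: boolean
  longRunningAgent: boolean
  costSensitive: boolean
}

function pickModel(w: Workload): string {
  // Computer use and long agentic sessions are where this page positions Sonnet 4.5.
  if (w.needsComputerUse || w.longRunningAgent) return 'anthropic/claude-sonnet-4.5'
  // Lighter, cost-constrained workloads may be served by a smaller model.
  if (w.costSensitive) return 'anthropic/claude-haiku-4.5'
  return 'anthropic/claude-sonnet-4.5'
}

const lightJob = pickModel({ needsComputerUse: false, longRunningAgent: false, costSensitive: true })
const agentJob = pickModel({ needsComputerUse: true, longRunningAgent: true, costSensitive: true })
console.log(lightJob, agentJob)
```

Capability needs are checked before cost here, so a cost-sensitive agent workload still routes to Sonnet 4.5. Reverse the order if budget is the hard constraint.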
Conclusion
Claude Sonnet 4.5 represents a generational step in several capability areas at once: computer use, agentic duration, domain expertise, and safety alignment all advanced in the same release. For teams building agents that do real work in real software environments over extended periods, this is the model where those capabilities came together.
Frequently Asked Questions
What was Claude Sonnet 4.5's OSWorld score and why does it matter?
61.4%, up from Sonnet 4's 42.2% four months earlier. OSWorld measures AI performance on real-world computer tasks: navigating software, filling forms, and clicking UI elements. It focuses on operational computer-use scenarios rather than abstract reasoning alone.
How long can Claude Sonnet 4.5 maintain focus on a single agentic coding task?
More than 30 hours on complex, multi-step tasks. Anthropic noted this duration changes what's architecturally feasible for autonomous engineering work. Individual results vary by task structure.
What is ASL-3 and why does it apply to Sonnet 4.5?
ASL-3 (AI Safety Level 3) is the level in Anthropic's safety framework for models requiring additional safeguards. Sonnet 4.5 was released under ASL-3 protections, a level first activated for Claude Opus 4, which include classifiers screening inputs and outputs for CBRN-related content. False positive rates have decreased by a factor of 10 since initial deployment.
What is the Claude Agent SDK and how does it relate to this model?
The Claude Agent SDK launched alongside Sonnet 4.5. It gives you access to the same agent infrastructure that powers Claude Code: memory management across long tasks, permission systems, and subagent coordination. Use it to build custom agents on the same foundation.
What alignment improvements came with Sonnet 4.5?
Substantial reductions in sycophancy, deception, power-seeking, encouragement of delusional thinking, and compliance with harmful system prompts, measured via an automated behavioral auditor. The model also improved defenses against prompt injection attacks for computer use and agentic capabilities.
Why did specialists in finance, law, medicine, and STEM find Sonnet 4.5 significantly better than previous models?
Professionals assessed domain-specific knowledge and reasoning in Anthropic's expert evaluations. Results showed substantially better performance compared to older models, including Opus 4.1. The intelligence improvements extend beyond coding benchmarks.
Is Sonnet 4.5 priced differently from Sonnet 4?
Current pricing is shown on this page. AI Gateway routes across providers, and rates may vary by provider.