Claude Sonnet 4.5
Claude Sonnet 4.5 is a coding-focused model from Anthropic. It scores 77.2% on SWE-bench Verified and 61.4% on OSWorld for computer use, sustains agentic coding sessions of 30+ hours, and delivers substantial gains across coding, reasoning, math, and domain-specific expertise.
import { streamText } from 'ai'
const result = streamText({ model: 'anthropic/claude-sonnet-4.5', prompt: 'Why is the sky blue?' })
Playground
Try out Claude Sonnet 4.5 by Anthropic. Usage is billed to your team at API rates. Free users (those who haven't made a payment) get $5 of credits every 30 days.
Providers
Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.
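A provider preference can also be expressed in code. The following is a minimal sketch, assuming the gateway reads an ordered list of provider slugs from `providerOptions.gateway.order`. The helper `buildProviderPreference` and the slugs `bedrock` and `anthropic` are illustrative and not taken from this page; copy the real slugs from the provider table and confirm the option shape in the docs.

```typescript
// Hypothetical helper: builds the options object that expresses a provider preference.
// Assumption: the gateway accepts an ordered list of provider slugs under
// providerOptions.gateway.order. Verify the exact shape in the gateway docs.
type GatewayProviderOptions = { gateway: { order: string[] } };

function buildProviderPreference(slugs: string[]): GatewayProviderOptions {
  // Trim each slug, then drop blanks and duplicates while keeping priority order.
  const seen = new Set<string>();
  const order: string[] = [];
  for (const raw of slugs) {
    const slug = raw.trim();
    if (slug.length > 0 && !seen.has(slug)) {
      seen.add(slug);
      order.push(slug);
    }
  }
  return { gateway: { order } };
}

// Placeholder slugs; the duplicate is removed and the first position wins.
const providerOptions = buildProviderPreference(['bedrock', 'anthropic', 'bedrock']);
console.log(JSON.stringify(providerOptions));
```

Under the same assumption, the resulting object would be passed as `providerOptions` next to `model` and `prompt` in the `streamText` call.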
The provider table reports the following metrics. Visit the docs for more info.
- Throughput: P50 on live AI Gateway traffic, in tokens per second (TPS).
- Time to first token (TTFT): P50 on live AI Gateway traffic, in milliseconds.
- Success rate: direct request success rate on AI Gateway and per provider.
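To make the P50 and success-rate metrics described above concrete, here is a small sketch of how such figures could be computed from raw samples. This page does not document how the gateway itself computes them, so the median method (averaging the two middle values for even-sized samples), the function names, and the sample numbers are all assumptions for illustration.

```typescript
// P50 (median) of a set of measurements, e.g. TPS or TTFT samples.
// Assumption: even-sized samples average the two middle values;
// the gateway's actual percentile method may differ.
function p50(samples: number[]): number {
  if (samples.length === 0) {
    throw new Error('p50 needs at least one sample');
  }
  // Copy before sorting so the caller's array is left untouched.
  const sorted = [...samples].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Fraction of requests that succeeded; defined as 0 when there was no traffic.
function successRate(succeeded: number, total: number): number {
  return total === 0 ? 0 : succeeded / total;
}

// Made-up samples, not measurements from any provider.
const tpsSamples = [80, 60, 70];
const ttftSamplesMs = [400, 300];
console.log(p50(tpsSamples), p50(ttftSamplesMs), successRate(3, 4));
```

A median is a natural summary here because a few very slow or very fast requests shift it far less than they would shift a mean.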
About Claude Sonnet 4.5
Claude Sonnet 4.5 launched on September 29, 2025. Its OSWorld score of 61.4%, up from Sonnet 4's 42.2% four months earlier, backed Anthropic's computer use claims directly. On SWE-bench Verified, Claude Sonnet 4.5 scored 77.2%. Separately, it maintained focus for 30+ hours on complex multi-step tasks, a duration that makes long-running autonomous engineering work feasible.
Domain expert evaluation reinforced the benchmark numbers. Finance, law, medicine, and STEM specialists found substantially better domain-specific knowledge and reasoning than in older models, including Opus 4.1. In Devin, planning performance rose by 18% and end-to-end scores by 12%, the biggest jump since Claude Sonnet 3.6. Cursor, GitHub Copilot, and Figma Make reported significant gains in their specific domains. Alongside this model, Claude Code shipped checkpoints and rollback, a native VS Code extension, and a refreshed terminal interface.
At release, Claude Sonnet 4.5 included substantial alignment improvements over prior Claude models: reduced sycophancy, deception, power-seeking, and tendency to encourage delusional thinking. Defenses against prompt injection, which matter for computer use and agentic work, also improved considerably. Anthropic released Claude Sonnet 4.5 under ASL-3 (AI Safety Level 3) protections, the first Sonnet model at that safety level, with CBRN (chemical, biological, radiological, and nuclear) classifiers active.
The Claude Agent SDK launched alongside Claude Sonnet 4.5, giving you access to the same infrastructure that powers Claude Code: memory management, permission systems, and subagent coordination for building custom agents.
What To Consider When Choosing a Provider
- Configuration: Claude Sonnet 4.5 runs under ASL-3 (AI Safety Level 3) safeguards, which include classifiers that screen for potentially dangerous inputs and outputs. These classifiers may occasionally flag normal content, though Anthropic has substantially reduced false positive rates since first deploying them.
- Zero Data Retention: AI Gateway supports Zero Data Retention for this model via direct gateway requests (BYOK is not included). To configure this, check the documentation.
- Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
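The authentication point above can be sketched as a small credential lookup. This is a minimal sketch under stated assumptions: the environment variable names `AI_GATEWAY_API_KEY` and `VERCEL_OIDC_TOKEN`, the helper `resolveGatewayCredential`, and the choice to prefer the API key over the OIDC token are illustrative and not confirmed by this page.

```typescript
// Which kind of credential was found, plus its value.
type GatewayCredential = { kind: 'api-key' | 'oidc'; token: string };

// Hypothetical helper: picks a gateway credential from an environment map.
// Assumptions: the variable names below, and preferring the API key when both exist.
function resolveGatewayCredential(
  env: Record<string, string | undefined>,
): GatewayCredential {
  const apiKey = env['AI_GATEWAY_API_KEY']?.trim();
  if (apiKey) {
    return { kind: 'api-key', token: apiKey };
  }
  const oidcToken = env['VERCEL_OIDC_TOKEN']?.trim();
  if (oidcToken) {
    return { kind: 'oidc', token: oidcToken };
  }
  // Fail early with a clear message instead of sending an unauthenticated request.
  throw new Error('No gateway credential found: set an API key or provide an OIDC token');
}

// Placeholder value for illustration only; never hard-code real credentials.
const credential = resolveGatewayCredential({ AI_GATEWAY_API_KEY: 'example-key' });
console.log(credential.kind);
```

In a Node.js application the map would typically be `process.env`. Either way, only a gateway credential is involved, and no provider-specific keys are needed.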
When to Use Claude Sonnet 4.5
Best For
- Computer use and software automation: Scored 61.4% on OSWorld at release, up from Sonnet 4's 42.2%
- Extended autonomous coding sessions: Documented 30+ hour capability for complex multi-step engineering tasks
- Complex agent workflows: Anthropic explicitly positioned Claude Sonnet 4.5 for agent workloads at release
- Finance, law, medicine, and STEM applications: Expert evaluation showed substantial gains in domain knowledge and reasoning compared to Opus 4.1
- Strong alignment properties: Reduced sycophancy and deception compared to earlier Claude releases
Consider Alternatives When
- Cost is the primary constraint: Haiku 4.5 may offer sufficient capability per dollar for lighter workloads
- Simple latency-sensitive tasks: Claude Sonnet 4.5's capability depth comes with a higher per-token cost than lighter models
- Large context at Sonnet tier: Claude Sonnet 4.6 offers a 1M-token context window at Sonnet pricing
- Earlier-model parity: For some specific computer use or coding tasks, earlier models perform equivalently
Conclusion
Claude Sonnet 4.5 advanced computer use, agentic duration, domain expertise, and safety alignment in a single release. For teams building agents that do real work in real software environments over extended periods, this is the release where those capabilities came together.