Claude Sonnet 4.6

by Anthropic · workhorse · also free in Claude.ai

Near-Opus quality at Sonnet pricing. The default Claude for most production work.

Input $3 / 1M

Output $15 / 1M

Context 200K (1M β)

Tier Mid / default

Anthropic Console ↗ · Updated May 20, 2026

§ API pricing

Per-token rates.

Input

$3/1M tokens

Prompt tokens

60% of Opus 4.7's input rate
Vision inputs billed as tokens
Prompt caching available

Output

$15/1M tokens

Completion tokens

60% of Opus 4.7's output rate
Matches GPT-5.4 on output exactly
Includes extended thinking tokens
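
The per-token rates above reduce to simple arithmetic. The following is a minimal sketch, assuming the listed $3 and $15 per 1M token rates apply uniformly with no caching or beta-tier adjustments; the function name `estimate_cost_usd` and the example token counts are hypothetical.

```python
# Rates are copied from this page; they are not fetched from any API.
INPUT_USD_PER_MTOK = 3.00
OUTPUT_USD_PER_MTOK = 15.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in USD at the listed rates.

    output_tokens should include extended thinking tokens, since the
    page states that those are billed as output.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = input_tokens * INPUT_USD_PER_MTOK / 1_000_000
    output_cost = output_tokens * OUTPUT_USD_PER_MTOK / 1_000_000
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical request: 10,000 prompt tokens, 2,000 completion tokens.
    print(f"${estimate_cost_usd(10_000, 2_000):.4f}")
```

In this hypothetical request the 2,000 output tokens cost as much as the 10,000 input tokens, which is the practical consequence of the 5:1 output-to-input rate ratio.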

Context

200K · 1M in beta

Window

200K stable, 1M in opt-in beta
Beta tier may have separate pricing
Largest Claude context available
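
One way to decide whether a request needs the beta window is sketched below. It assumes "200K" and "1M" mean exactly 200,000 and 1,000,000 tokens, and that the output budget counts against the same window as the prompt; both are assumptions, and the name `window_tier` is hypothetical.

```python
STABLE_WINDOW = 200_000    # "200K" read as 200,000 tokens (assumption)
BETA_WINDOW = 1_000_000    # "1M" read as 1,000,000 tokens (assumption)


def window_tier(prompt_tokens: int, max_output_tokens: int) -> str:
    """Classify a request by the smallest window that could hold it.

    Assumes prompt and output share one window. Returns "stable",
    "beta", or "too_large".
    """
    if prompt_tokens < 0 or max_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = prompt_tokens + max_output_tokens
    if total <= STABLE_WINDOW:
        return "stable"
    if total <= BETA_WINDOW:
        return "beta"
    return "too_large"
```

Because the page notes that the beta tier may have separate pricing, a check like this is also a natural place to flag requests whose cost may differ from the headline rates.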

Caching

~90% discount on cache hits

Prompt caching

Reuse large system prompts cheaply
Big wins for chat with long instructions
5-min and 1-hour TTLs supported
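
To see why caching matters for long system prompts, here is a rough sketch of the input-side cost with and without a cache hit. It treats the "~90%" discount as exactly 90% and models only cache reads; any surcharge for writing to the cache and the choice of TTL are left out, and the function name and token counts are hypothetical.

```python
INPUT_USD_PER_MTOK = 3.00
CACHE_HIT_DISCOUNT = 0.90  # "~90%" on this page, treated as exact here


def input_cost_usd(cached_tokens: int, fresh_tokens: int, cache_hit: bool) -> float:
    """Input-side cost of one request in USD.

    cached_tokens: the reusable prefix (for example a long system prompt).
    fresh_tokens:  the part that changes per request (for example the user turn).
    Only the read discount is modeled; cache-write pricing is ignored.
    """
    rate = INPUT_USD_PER_MTOK / 1_000_000
    prefix_rate = rate * (1 - CACHE_HIT_DISCOUNT) if cache_hit else rate
    return cached_tokens * prefix_rate + fresh_tokens * rate


if __name__ == "__main__":
    # Hypothetical chat turn: 20,000-token system prompt, 500-token user message.
    miss = input_cost_usd(20_000, 500, cache_hit=False)
    hit = input_cost_usd(20_000, 500, cache_hit=True)
    print(f"miss: ${miss:.4f}  hit: ${hit:.4f}")
```

Under these assumptions the saving grows with the share of the prompt that is reusable, which is why chats with long fixed instructions benefit most.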

Why Sonnet 4.6 is the default

Sonnet 4.6 is the model Anthropic expects most developers and most Claude.ai users to run. As of April 2026 it is also the free-tier model in Claude.ai, a deliberate move to broaden the install base while Opus 4.7 keeps the quality crown. At $3 input and $15 output per 1M tokens, it costs 60% of Opus 4.7's rates and matches GPT-5.4 on output, which makes it one of the more price-competitive frontier-class models on the market.

The Sonnet pitch has not really changed since 4.0: you get most of Opus's capability at a meaningful discount, plus better latency. What did change in 4.6 is that the gap to Opus narrowed again on coding and structured output, while the 1M context beta finally pushes Sonnet out of the "small-window" tradeoff that used to favor GPT-5.4 for repo-scale work.

Capabilities

Sonnet 4.6 is a strong all-rounder: writing, code, vision, and structured extraction. It's the right model for production chat agents because answers feel composed rather than scattered, and the failure modes tend to be "too cautious" rather than "confidently wrong." Tool use is reliable, and the model now handles multi-turn agent loops with fewer dropped instructions than 4.5.

Where it trails Opus 4.7: novel reasoning, long-document synthesis, and the hardest code refactors. Where it trails GPT-5.5: outright token throughput and the full 1M context outside the beta.

Typical use cases

Production chat assistants and customer support agents
Code generation, review, and IDE integrations
Document Q&A and summarization (1M β for repo-scale work)
Vision tasks: charts, screenshots, document OCR
Structured extraction and form filling

Sibling and rival comparison

Model	Input / 1M	Output / 1M	Context
Claude Sonnet 4.6	$3	$15	200K (1M β)
Claude Opus 4.7	$5	$25	200K
Claude Haiku 4.5	$1	$5	200K
GPT-5.4	$2.50	$15	272K

Opus 4.7 costs about 67% more ($5 vs. $3 input, $25 vs. $15 output) and is aimed at the highest-stakes work. Haiku 4.5 is a third of the price for narrower tasks. The closest rival is GPT-5.4: Sonnet costs $0.50 more per 1M input tokens and the same on output, with a different style and the 1M beta as a tiebreaker.
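
The comparison can be checked against a concrete workload. The sketch below copies the rates from the table and expresses each model's cost relative to Sonnet 4.6; the helper names and the example token mix are hypothetical, and context limits, caching, and beta pricing are ignored.

```python
# (input USD, output USD) per 1M tokens, copied from the comparison table.
RATES = {
    "Claude Sonnet 4.6": (3.00, 15.00),
    "Claude Opus 4.7": (5.00, 25.00),
    "Claude Haiku 4.5": (1.00, 5.00),
    "GPT-5.4": (2.50, 15.00),
}


def workload_cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of a workload on one model at the table rates."""
    input_rate, output_rate = RATES[model]
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


def relative_to_sonnet(input_tokens: int, output_tokens: int) -> dict[str, float]:
    """Each model's cost as a multiple of the Sonnet 4.6 cost."""
    baseline = workload_cost_usd("Claude Sonnet 4.6", input_tokens, output_tokens)
    if baseline == 0:
        raise ValueError("workload must contain at least one token")
    return {
        model: workload_cost_usd(model, input_tokens, output_tokens) / baseline
        for model in RATES
    }


if __name__ == "__main__":
    # Hypothetical workload: 1M input tokens and 1M output tokens.
    for model, ratio in relative_to_sonnet(1_000_000, 1_000_000).items():
        print(f"{model}: {ratio:.2f}x")
```

Opus 4.7 and Haiku 4.5 come out at about 1.67x and 0.33x for any token mix, because both of their rates scale by the same factor relative to Sonnet. The GPT-5.4 ratio depends on the mix: it approaches 1.0x as the workload becomes output-heavy and drops toward 0.83x as it becomes input-heavy.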
