
Claude Sonnet 5

by Anthropic · new Sonnet flagship · released Jun 30, 2026

Near-Opus quality at Sonnet pricing — now noticeably stronger on coding and agentic work. The new default Claude for most production work.

Input $3 / 1M
Output $15 / 1M
Context 1M
Tier Mid / flagship
Updated July 1, 2026

API pricing

Per-token rates.

Input
$3/1M tokens
Prompt tokens
  • Intro $2 / 1M through Aug 31, 2026
  • Vision inputs billed as tokens
  • Prompt caching available
Output
$15/1M tokens
Completion tokens
  • Intro $10 / 1M through Aug 31, 2026
  • Matches GPT-5.4 on output exactly
  • Includes extended thinking tokens
Context
1M tokens
Window
  • 1M context window
  • Beta tier may have separate pricing
  • Largest Claude context available
Caching
~90% discount on cache hits
Prompt caching
  • Reuse large system prompts cheaply
  • Big wins for chat with long instructions
  • 5-min and 1-hour TTLs supported
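
To make the caching discount concrete, here is a minimal sketch of the input-cost arithmetic. It treats the "~90%" figure as exactly 90% and ignores any cache-write surcharge or TTL-specific pricing, since neither is listed on this page. The function and constant names are hypothetical.

```python
INPUT_RATE = 3.00          # USD per 1M input tokens (standard rate listed above)
CACHE_HIT_DISCOUNT = 0.90  # "~90% discount on cache hits", treated as exact here


def input_cost_with_cache(cached_tokens: int, fresh_tokens: int) -> float:
    """Estimate USD input cost when `cached_tokens` are served from the cache.

    Assumption: cache writes cost the same as ordinary input. Any write
    surcharge is not stated on this page and is ignored.
    """
    hit_rate = INPUT_RATE * (1 - CACHE_HIT_DISCOUNT)
    return (cached_tokens * hit_rate + fresh_tokens * INPUT_RATE) / 1_000_000


# A 50,000-token system prompt reused from cache plus a 500-token user turn.
with_cache = input_cost_with_cache(cached_tokens=50_000, fresh_tokens=500)
without_cache = input_cost_with_cache(cached_tokens=0, fresh_tokens=50_500)
print(f"with cache: ${with_cache:.4f}  without: ${without_cache:.4f}")
```

Under these assumptions the cached request costs about $0.0165 in input against about $0.1515 uncached, which is why long, stable system prompts benefit most.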

Introductory pricing of $2 / 1M input and $10 / 1M output runs through August 31, 2026, after which standard $3 / $15 rates apply.
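
As a rough illustration of how the introductory window affects a bill, the sketch below estimates the cost of a single request on a given date. It assumes the introductory rates apply through the end of August 31, 2026 inclusive, and that extended thinking tokens are counted in the output total, as the pricing list states. The names `request_cost` and `INTRO_END` are hypothetical.

```python
from datetime import date

# Rates in USD per 1M tokens, taken from the pricing listed above.
STANDARD = {"input": 3.00, "output": 15.00}
INTRO = {"input": 2.00, "output": 10.00}
INTRO_END = date(2026, 8, 31)  # assumed to be the last day of intro pricing


def request_cost(input_tokens: int, output_tokens: int, on: date) -> float:
    """Estimate the USD cost of one request made on date `on`.

    `output_tokens` should include extended thinking tokens, which the
    page lists as billed at the output rate.
    """
    rates = INTRO if on <= INTRO_END else STANDARD
    total = input_tokens * rates["input"] + output_tokens * rates["output"]
    return total / 1_000_000


# The same 200K-in / 4K-out request, before and after the intro window ends.
print(request_cost(200_000, 4_000, date(2026, 8, 1)))
print(request_cost(200_000, 4_000, date(2026, 9, 1)))
```

For this example request the estimate moves from roughly $0.44 to roughly $0.66 once standard rates apply, a 50% increase that is worth budgeting for ahead of September.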

What's new in Sonnet 5

Released June 30, 2026, Sonnet 5 is the model Anthropic expects most developers to run by default, and it's now the free-tier model in Claude.ai. It holds the same headline API rate the Sonnet line has carried — $3 input and $15 output per 1M tokens — while closing much of the remaining gap to Opus 4.8 on coding and agentic work, the areas where earlier Sonnets trailed most. Through August 31, 2026, introductory pricing drops that to $2 / $10 per 1M, making it one of the cheapest frontier-class models on the market.

The Sonnet pitch is unchanged: most of Opus's capability at a meaningful discount, with better latency. What moved in Sonnet 5 is the quality ceiling. Long-horizon agent loops, tool use, and structured code edits are materially better, and the 1M context window (against GPT-5.4's 272K) keeps Sonnet competitive for repo-scale work.

Capabilities

Sonnet 5 is a strong all-rounder: writing, code, vision, and structured extraction. It's the right model for production chat agents and coding assistants because answers stay composed under load, and tool use is reliable across multi-turn loops. Compared with the previous Sonnet generation, the biggest gains are in agentic execution — fewer dropped instructions and cleaner self-verification on longer tasks.

Where it still trails Opus 4.8: the hardest novel reasoning and the most demanding long-document synthesis. Where it trails GPT-5.5: outright token throughput.

Typical use cases

  • Production chat assistants and customer support agents
  • Code generation, review, and IDE integrations
  • Coding agents and long-horizon tool-use loops
  • Document Q&A and summarization (1M context for repo-scale work)
  • Vision tasks: charts, screenshots, document OCR
  • Structured extraction and form filling

Sibling and rival comparison

Model              Input / 1M   Output / 1M   Context
Claude Sonnet 5    $3           $15           1M
Claude Opus 4.8    $5           $25           1M
Claude Haiku 4.5   $1           $5            200K
GPT-5.4            $2.50        $15           272K

Sonnet 5 holds the Sonnet line's list price while raising the quality ceiling, so for new builds it's the obvious mid-tier pick. Opus 4.8 costs about 67% more on both input and output and is reserved for the highest-stakes work; Haiku 4.5 is a third of the price for narrower tasks. The closest rival is GPT-5.4, which is slightly cheaper on input ($2.50 vs $3) and equal on output, so Sonnet 5's 1M context (vs 272K) is the tiebreaker.
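
The ratios quoted above can be checked with a few lines of arithmetic. This is only a sketch over the list prices in the comparison table; the `PRICES` dictionary and `relative_to_sonnet` helper are hypothetical names, and introductory pricing is not considered.

```python
# (input, output) list prices in USD per 1M tokens, from the comparison table.
PRICES = {
    "Claude Sonnet 5": (3.00, 15.00),
    "Claude Opus 4.8": (5.00, 25.00),
    "Claude Haiku 4.5": (1.00, 5.00),
    "GPT-5.4": (2.50, 15.00),
}


def relative_to_sonnet(model: str) -> tuple[float, float]:
    """Return (input, output) prices of `model` as multiples of Sonnet 5's."""
    base_in, base_out = PRICES["Claude Sonnet 5"]
    model_in, model_out = PRICES[model]
    return model_in / base_in, model_out / base_out


for name in PRICES:
    ratio_in, ratio_out = relative_to_sonnet(name)
    print(f"{name:<18} input x{ratio_in:.2f}  output x{ratio_out:.2f}")
```

Opus 4.8 comes out at about 1.67x on both axes, Haiku 4.5 at about 0.33x, and GPT-5.4 at about 0.83x on input and 1.00x on output.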
