Claude Haiku 4.5

by Anthropic · budget tier · released October 2025

The fastest, cheapest Claude. Near Sonnet-4 quality for high-volume, latency-sensitive work.

Input $1 / 1M

Output $5 / 1M

Context 200K tokens

Released Oct 2025

Updated July 23, 2026

§ API pricing

Per-token rates.

Input

$1/1M tokens

Prompt tokens

A third of Sonnet 5's rate
Vision inputs billed as tokens
Prompt caching reduces re-sent inputs

Output

$5/1M tokens

Completion tokens

Lowest output rate in the Claude family
64K max output tokens
10× cheaper than Fable 5 output

Context

200K tokens

Window

Smaller than the 1M window Sonnet and Opus now offer
Smaller than the budget rivals compared below (272K to 1M)
~150K words of practical input

Speed

Fastest Claude model

Latency tier

Built for real-time products
Separate rate-limit pool from Haiku 3.x
Pairs well as a sub-agent model

Why Haiku 4.5 exists

Every model family needs a workhorse, and Haiku 4.5 is Anthropic's. At $1 input and $5 output per 1M tokens, it costs a fifth of Opus 5 and a third of Sonnet 5, and it ships with a 200K-token context window. That is smaller than the 1M-token window Opus and Sonnet now offer, and smaller than the 272K to 1M listed for the budget rivals in the comparison table, but ample for the chat, extraction, and routing tasks Haiku is built for.
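To make the per-token rates concrete, here is a minimal Python sketch of the arithmetic. The rates come from this page; the function name `request_cost` and the example workload (2,000 prompt tokens, 300 completion tokens) are illustrative assumptions, and the sketch ignores prompt caching and any other discounts.

```python
# Rates listed on this page, in USD per 1M tokens.
HAIKU_INPUT_PER_MTOK = 1.00
HAIKU_OUTPUT_PER_MTOK = 5.00


def request_cost(input_tokens: int, output_tokens: int,
                 input_rate: float = HAIKU_INPUT_PER_MTOK,
                 output_rate: float = HAIKU_OUTPUT_PER_MTOK) -> float:
    """Return the USD cost of one request at per-million-token rates."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


# Hypothetical workload: 2,000 prompt tokens and 300 completion tokens per call.
per_call = request_cost(2_000, 300)
per_million_calls = per_call * 1_000_000

print(f"per call: ${per_call:.4f}")
print(f"per 1M calls: ${per_million_calls:,.2f}")
```

Passing other rates as `input_rate` and `output_rate` lets the same function price a sibling model for comparison.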

The quality positioning is honest: near Sonnet-4 level — the 2025 Sonnet, not the current 4.6. For classification, extraction, routing, summarization, and templated generation, that's more than enough, and the speed is the actual product.

Capabilities

Haiku 4.5 handles structured tasks reliably: tagging and classification at volume, JSON extraction from messy text, first-pass document triage, chat where sub-second response matters more than depth. It supports vision and the standard Claude tool-use surface, so it slots into agent stacks as the fast inner-loop model while a bigger sibling does the planning.

The honest weakness: hard, multi-step reasoning. When a task needs the model to be right in a way that takes deliberation — legal nuance, subtle code review, research synthesis — step up to Sonnet 5 or beyond. Haiku will give you a fast answer, not always a deep one.

Typical use cases

High-volume classification, tagging, and routing pipelines
Structured extraction (JSON, tables) from documents at scale
Latency-sensitive chat and autocomplete features
Sub-agent / inner-loop calls inside larger agent systems
First-pass triage before escalating to Sonnet or Opus
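The triage-then-escalate pattern in the last item can be sketched roughly as follows. This is an illustration under stated assumptions, not Anthropic's API: the `Answer` type, the `confidence` score, the 0.8 threshold, and the stand-in models `fake_haiku` and `fake_sonnet` are all hypothetical. In a real system the two callables would wrap API calls, and the confidence signal would come from your own heuristic or validator.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Answer:
    text: str
    confidence: float  # hypothetical score in [0, 1]; not something the API returns


Model = Callable[[str], Answer]


def triage(prompt: str, fast: Model, strong: Model,
           threshold: float = 0.8) -> Tuple[str, Answer]:
    """Try the fast model first; escalate only when its confidence is low."""
    first = fast(prompt)
    if first.confidence >= threshold:
        return "fast", first
    return "strong", strong(prompt)


# Stand-in models for illustration only; real ones would wrap API calls.
def fake_haiku(prompt: str) -> Answer:
    # Pretend short prompts are easy and long ones are hard.
    return Answer("haiku answer", 0.95 if len(prompt) < 40 else 0.4)


def fake_sonnet(prompt: str) -> Answer:
    return Answer("sonnet answer", 0.9)


tier, answer = triage("Tag this ticket: refund request", fake_haiku, fake_sonnet)
print(tier, answer.text)
```

The design choice is that the expensive model runs only on the fraction of traffic the cheap model cannot handle confidently, so the blended cost stays close to the fast tier's rate.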

Sibling and rival comparison

Model	Input / 1M	Output / 1M	Context
Claude Haiku 4.5	$1	$5	200K
Claude Sonnet 5	$3	$15	1M
GPT-5.4 mini	$0.25	$2	272K
Gemini 3.1 Flash-Lite	$0.25	$1.50	1M
DeepSeek V4-Flash	$0.14	$0.28	1M

Cross-family, Haiku is the premium budget model: GPT-5.4 mini, Gemini 3.1 Flash-Lite, and DeepSeek V4-Flash all undercut it on price, from 2.5× on GPT-5.4 mini's output rate to roughly 18× on DeepSeek V4-Flash's. You pay Haiku's rate for the Claude qualities: instruction-following, long-context reliability, and drop-in compatibility with a Sonnet/Opus/Fable stack. If your pipeline is provider-agnostic and purely cost-driven, the rivals win; if it's a Claude shop, Haiku is the obvious cheap tier.
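To see how the table's rates play out on a concrete workload, here is a small Python sketch that ranks the listed models by cost. The rates are copied from the comparison table; the function names and the example month (1,000M input tokens, 100M output tokens) are assumptions for illustration, and the result says nothing about quality or latency.

```python
# (input, output) rates in USD per 1M tokens, copied from the comparison table.
RATES = {
    "Claude Haiku 4.5": (1.00, 5.00),
    "Claude Sonnet 5": (3.00, 15.00),
    "GPT-5.4 mini": (0.25, 2.00),
    "Gemini 3.1 Flash-Lite": (0.25, 1.50),
    "DeepSeek V4-Flash": (0.14, 0.28),
}


def workload_cost(model: str, input_mtok: float, output_mtok: float) -> float:
    """USD cost for a workload measured in millions of tokens."""
    in_rate, out_rate = RATES[model]
    return input_mtok * in_rate + output_mtok * out_rate


def rank_by_cost(input_mtok: float, output_mtok: float) -> list:
    """Return (model, cost) pairs sorted from cheapest to most expensive."""
    costs = {m: workload_cost(m, input_mtok, output_mtok) for m in RATES}
    return sorted(costs.items(), key=lambda kv: kv[1])


# Hypothetical month: 1,000M input tokens and 100M output tokens.
for model, usd in rank_by_cost(1_000, 100):
    print(f"{model:<24} ${usd:,.2f}")
```

Changing the input-to-output ratio shifts the gaps between models, since the rivals' discounts differ between input and output rates.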
