
Claude Haiku 4.5

by Anthropic · budget tier · released October 2025

The fastest, cheapest Claude. Near Sonnet-4 quality for high-volume, latency-sensitive work.

Input $1 / 1M
Output $5 / 1M
Context 200K tokens
Released Oct 2025
Updated June 10, 2026
API pricing

Per-token rates.

Input
$1/1M tokens
Prompt tokens
  • A third of Sonnet 4.6's rate
  • Vision inputs billed as tokens
  • Prompt caching reduces re-sent inputs
Output
$5/1M tokens
Completion tokens
  • Lowest output rate in the Claude family
  • 64K max output tokens
  • One-tenth of Fable 5's output rate
Context
200K tokens
Window
  • Same window as Sonnet & Opus
  • Unusually large for a budget model
  • ~150K words of practical input
Speed
Fastest Claude model
Latency tier
  • Built for real-time products
  • Separate rate-limit pool from Haiku 3.x
  • Pairs well as a sub-agent model

Why Haiku 4.5 exists

Every model family needs a workhorse, and Haiku 4.5 is Anthropic's. At $1 input and $5 output per 1M tokens it costs a fifth of Opus 4.8 and a third of Sonnet 4.6, while keeping the same 200K context window — which is the unusual part. Most budget models cut the window first; Haiku keeps it, so the cheap tier can still read the same long documents the expensive tiers do.
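As a rough illustration of what the 200K window and the $1 input rate mean for long documents, here is a minimal sketch. It assumes about 0.75 words per token, which is only the ratio implied by this page's "~150K words" figure, not a tokenizer measurement. It also assumes that tokens reserved for output count against the same window. The function names are hypothetical.

```python
import math

CONTEXT_TOKENS = 200_000     # window size quoted on this page
INPUT_USD_PER_MTOK = 1.00    # Haiku 4.5 input rate quoted on this page
WORDS_PER_TOKEN = 0.75       # assumption: implied by ~150K words per 200K tokens


def estimate_tokens(word_count: int) -> int:
    """Rough token estimate from a word count (heuristic, not a tokenizer)."""
    return math.ceil(word_count / WORDS_PER_TOKEN)


def fits_in_window(word_count: int, reserved_output_tokens: int = 0) -> bool:
    """True if the estimated input plus the reserved output stays within the window."""
    return estimate_tokens(word_count) + reserved_output_tokens <= CONTEXT_TOKENS


def input_cost_usd(word_count: int) -> float:
    """Estimated input cost in USD for one pass over the document."""
    return estimate_tokens(word_count) * INPUT_USD_PER_MTOK / 1_000_000
```

Under these assumptions a 75,000-word document is roughly 100,000 tokens and costs about $0.10 to read once. Real token counts vary with language and formatting, so treat the result as an order-of-magnitude check.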

The quality positioning is honest: near Sonnet-4 level — the 2025 Sonnet, not the current 4.6. For classification, extraction, routing, summarization, and templated generation, that's more than enough, and the speed is the actual product.

Capabilities

Haiku 4.5 handles structured tasks reliably: tagging and classification at volume, JSON extraction from messy text, first-pass document triage, chat where sub-second response matters more than depth. It supports vision and the standard Claude tool-use surface, so it slots into agent stacks as the fast inner-loop model while a bigger sibling does the planning.

The honest weakness: hard, multi-step reasoning. When a task needs the model to be right in a way that takes deliberation — legal nuance, subtle code review, research synthesis — step up to Sonnet 4.6 or beyond. Haiku will give you a fast answer, not always a deep one.

Typical use cases

  • High-volume classification, tagging, and routing pipelines
  • Structured extraction (JSON, tables) from documents at scale
  • Latency-sensitive chat and autocomplete features
  • Sub-agent / inner-loop calls inside larger agent systems
  • First-pass triage before escalating to Sonnet or Opus
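The triage-then-escalate pattern in the list above can be sketched as follows. This is an illustration under stated assumptions, not Anthropic's API. `Verdict`, `ModelFn`, `triage`, and the two stand-in models are hypothetical names, and the confidence score is something your own wrapper would have to supply, since the page does not describe one.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Verdict:
    label: str
    confidence: float  # 0.0 to 1.0; hypothetical score supplied by your own wrapper


# A model is any callable from text to Verdict (hypothetical interface).
ModelFn = Callable[[str], Verdict]


def triage(text: str, fast: ModelFn, strong: ModelFn,
           threshold: float = 0.8) -> Tuple[Verdict, str]:
    """Try the fast tier first; escalate when its confidence is below threshold."""
    first = fast(text)
    if first.confidence >= threshold:
        return first, "fast"
    return strong(text), "strong"


# Stand-in models for illustration only; real ones would wrap API calls.
def fake_fast(text: str) -> Verdict:
    # Pretend short inputs are easy and long ones are uncertain.
    return Verdict("billing", 0.95 if len(text) < 40 else 0.5)


def fake_strong(text: str) -> Verdict:
    return Verdict("legal", 0.9)
```

Calling `triage("refund please", fake_fast, fake_strong)` stays on the fast tier, while a longer input falls below the threshold and is handed to the stronger model. The threshold is a tuning knob: raising it sends more traffic to the expensive tier.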

Sibling and rival comparison

Model                 | Input / 1M | Output / 1M | Context
Claude Haiku 4.5      | $1         | $5          | 200K
Claude Sonnet 4.6     | $3         | $15         | 200K (1M β)
GPT-5.4 mini          | $0.25      | $2          | 272K
Gemini 3.1 Flash-Lite | $0.25      | $1.50       | 1M
DeepSeek V4-Flash     | $0.14      | $0.28       | 1M

Cross-family, Haiku is the premium budget model: GPT-5.4 mini, Gemini Flash-Lite, and DeepSeek V4-Flash all undercut it on price, some by a lot. You pay Haiku's rate for the Claude qualities — instruction-following, long-context reliability, and drop-in compatibility with a Sonnet/Opus/Fable stack. If your pipeline is provider-agnostic and purely cost-driven, the rivals win; if it's a Claude shop, Haiku is the obvious cheap tier.
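To put numbers on the price gap, here is a minimal sketch that prices one workload at the list rates from the comparison table. It ignores caching, batch discounts, and any quality differences. The workload size is a made-up example and the function names are hypothetical.

```python
# (input USD, output USD) per 1M tokens, copied from the comparison table.
RATES = {
    "Claude Haiku 4.5": (1.00, 5.00),
    "Claude Sonnet 4.6": (3.00, 15.00),
    "GPT-5.4 mini": (0.25, 2.00),
    "Gemini 3.1 Flash-Lite": (0.25, 1.50),
    "DeepSeek V4-Flash": (0.14, 0.28),
}


def workload_cost(model: str, input_mtok: float, output_mtok: float) -> float:
    """Cost in USD for a workload measured in millions of tokens."""
    input_rate, output_rate = RATES[model]
    return input_mtok * input_rate + output_mtok * output_rate


def rank_by_cost(input_mtok: float, output_mtok: float) -> list:
    """All models sorted cheapest first, with cost rounded to cents."""
    costs = [(m, round(workload_cost(m, input_mtok, output_mtok), 2)) for m in RATES]
    return sorted(costs, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Hypothetical monthly workload: 100M input tokens, 20M output tokens.
    for model, usd in rank_by_cost(100, 20):
        print(f"{model:<22} ${usd:,.2f}")
```

For that example workload Haiku 4.5 comes to $200 against $600 for Sonnet 4.6, and every rival in the table comes in under $100.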
