
DeepSeek V4-Flash

by DeepSeek · budget tier

The market's price floor for a 1M-token window: $0.14 input / $0.28 output per 1M tokens, with real reasoning ability.

Input $0.14 / 1M
Output $0.28 / 1M
Context 1M tokens
Specialty Cheap reasoning
DeepSeek Platform · Updated June 10, 2026

API pricing

Per-token rates.

Input
$0.14/1M tokens
Prompt tokens
  • Second-lowest input rate in the comparison below, behind only GPT-5.4 nano ($0.05)
  • A third of V4-Pro's $0.435
  • 1M window at this price is unmatched
Output
$0.28/1M tokens
Completion tokens
  • Cheapest output on our ledger
  • A third of V4-Pro's $0.87
  • 50× cheaper than Sonnet 4.6
Context
1M tokens
Window
  • Full-size window, budget price
  • Whole codebases in one call
  • Same window as V4-Pro
Hosting
China-based infrastructure
Consideration
  • Hosted version routes via China
  • Open-weights self-hosting available
  • Check your compliance needs
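
To make the per-token rates concrete, here is a minimal sketch of the cost arithmetic for a single call. The function name `request_cost` is hypothetical, and the sketch assumes flat list pricing with no cache discounts or volume tiers, since none are listed on this page.

```python
# List prices from the pricing cards above, in USD per 1M tokens.
INPUT_RATE_PER_M = 0.14
OUTPUT_RATE_PER_M = 0.28


def request_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Return the USD cost of one call at V4-Flash list prices.

    Assumes flat per-token billing (hypothetical simplification).
    """
    if prompt_tokens < 0 or completion_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (prompt_tokens * INPUT_RATE_PER_M
             + completion_tokens * OUTPUT_RATE_PER_M)
    return total / 1_000_000


# A prompt that fills the whole 1M window, plus a 2,000-token answer.
print(f"${request_cost(1_000_000, 2_000):.5f}")
```

Under these assumptions, filling the entire window costs about 14 cents of input, and the completion adds well under a cent.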

Why V4-Flash exists

V4-Flash has the cheapest output on our API table, trails only GPT-5.4 nano on input, and it isn't a toy: it inherits the V4 family's reasoning lineage, keeps the full 1M context window, and costs $0.14 input / $0.28 output per 1M tokens. The next-cheapest US-hosted rival with a 1M window (Gemini 3.1 Flash-Lite) charges more than 5× as much on output ($1.50 vs. $0.28).

As with its big sibling V4-Pro, the main considerations are non-technical: the hosted API routes through Chinese infrastructure, which is a hard blocker for some compliance regimes and the reason open-weights self-hosting is part of DeepSeek's pitch.

Capabilities

For the price class, capability is absurd: usable reasoning on math, code, and logic, a 1M window, and throughput suited to volume pipelines. Most tasks that teams route to mini-tier US models run fine here at a fraction of the cost.

The honest weakness: polish and ecosystem. Tooling, SDK maturity, rate-limit headroom, and English prose quality all trail the US providers, and the hardest reasoning belongs to V4-Pro or a frontier model.

Typical use cases

  • Absolute-cost-floor volume pipelines
  • Long-document processing on a budget (1M window)
  • Math/code/logic tasks too hard for nano-tier models
  • Self-hosted deployments via open weights
  • Cheap tier under V4-Pro in a routing stack
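
The routing-stack use case can be sketched as below. This is only an illustration: the difficulty score, the 0.7 threshold, and the function names are all hypothetical, and how a score is produced is left to the caller. The second function shows how the effective output rate moves as a larger share of traffic goes to V4-Pro.

```python
CHEAP_TIER = "DeepSeek V4-Flash"
HARD_TIER = "DeepSeek V4-Pro"

# Output list prices from this page, USD per 1M tokens.
OUTPUT_RATE = {CHEAP_TIER: 0.28, HARD_TIER: 0.87}


def pick_tier(difficulty: float, threshold: float = 0.7) -> str:
    """Route by a caller-supplied difficulty score in [0, 1].

    The threshold is an arbitrary placeholder, not a recommendation.
    """
    if not 0.0 <= difficulty <= 1.0:
        raise ValueError("difficulty must be in [0, 1]")
    return HARD_TIER if difficulty >= threshold else CHEAP_TIER


def blended_output_rate(share_hard: float) -> float:
    """Average output rate when `share_hard` of tokens go to V4-Pro."""
    if not 0.0 <= share_hard <= 1.0:
        raise ValueError("share_hard must be in [0, 1]")
    return ((1.0 - share_hard) * OUTPUT_RATE[CHEAP_TIER]
            + share_hard * OUTPUT_RATE[HARD_TIER])


print(pick_tier(0.2), pick_tier(0.9))
print(round(blended_output_rate(0.10), 3))
```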

Sibling and rival comparison

Model                   Input / 1M   Output / 1M   Context
DeepSeek V4-Flash       $0.14        $0.28         1M
DeepSeek V4-Pro         $0.435       $0.87         1M
Gemini 3.1 Flash-Lite   $0.25        $1.50         1M
GPT-5.4 nano            $0.05        $0.40         272K
Mistral Small 3.1       $0.20        $0.60         128K

Only GPT-5.4 nano beats it on input price, with a quarter of the window and less reasoning. If Chinese hosting is acceptable (or you self-host), V4-Flash is the rational default for cheap volume; if not, Gemini Flash-Lite and Mistral Small are the compliant runners-up.
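
The selection logic in that paragraph can be sketched as a small lookup over the table. The names `Model` and `cheapest` are hypothetical. The sketch assumes "1M", "272K", and "128K" mean 1,000,000, 272,000, and 128,000 tokens, that prompt and completion must fit in the window together, and that only the two DeepSeek models are China-hosted, as this page describes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Model:
    name: str
    input_per_m: float   # USD per 1M prompt tokens
    output_per_m: float  # USD per 1M completion tokens
    context: int         # window size in tokens (rounded, assumed)
    china_hosted: bool   # hosted API routes via China, per this page


MODELS = [
    Model("DeepSeek V4-Flash", 0.14, 0.28, 1_000_000, True),
    Model("DeepSeek V4-Pro", 0.435, 0.87, 1_000_000, True),
    Model("Gemini 3.1 Flash-Lite", 0.25, 1.50, 1_000_000, False),
    Model("GPT-5.4 nano", 0.05, 0.40, 272_000, False),
    Model("Mistral Small 3.1", 0.20, 0.60, 128_000, False),
]


def cost(m: Model, prompt_tokens: int, completion_tokens: int) -> float:
    """USD cost of one call at list price."""
    return (prompt_tokens * m.input_per_m
            + completion_tokens * m.output_per_m) / 1_000_000


def cheapest(prompt_tokens: int, completion_tokens: int,
             allow_china_hosted: bool = True) -> Optional[Model]:
    """Cheapest model that fits the request and the hosting constraint."""
    candidates = [
        m for m in MODELS
        if prompt_tokens + completion_tokens <= m.context
        and (allow_china_hosted or not m.china_hosted)
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda m: cost(m, prompt_tokens, completion_tokens))


for args in [(500_000, 5_000, True), (500_000, 5_000, False), (100_000, 1_000, True)]:
    pick = cheapest(*args)
    print(args, "->", pick.name if pick else None)
```

At these list prices, GPT-5.4 nano is cheaper than V4-Flash for requests that fit in its window until completion tokens exceed 75% of prompt tokens. That follows from setting 0.14·p + 0.28·c equal to 0.05·p + 0.40·c. V4-Flash's price advantage therefore shows up on output-heavy jobs and on anything too long for nano's 272K window.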
