DeepSeek V4-Flash

by DeepSeek · budget tier

The cheapest output rate on our table: $0.14 input / $0.28 output per 1M tokens, with a 1M-token context window and real reasoning ability.

Input $0.14 / 1M

Output $0.28 / 1M

Context 1M tokens

Specialty Cheap reasoning

DeepSeek Platform · Updated July 23, 2026

§ API pricing

Per-token rates.

Input

$0.14/1M tokens

Prompt tokens

Lowest input rate here among 1M-window models
A third of V4-Pro's $0.435
1M window at this price is unmatched

Output

$0.28/1M tokens

Completion tokens

Cheapest output on our ledger
A third of V4-Pro's $0.87
50× cheaper than Sonnet 5

Context

1M tokens

Window

Full-size window, budget price
Whole codebases in one call
Same window as V4-Pro

Hosting

China-based infrastructure

Consideration

Hosted version routes via China
Open-weights self-hosting available
Check your compliance needs

Why V4-Flash exists

V4-Flash has the cheapest output rate on our API table, and it isn't a toy: it inherits the V4 family's reasoning lineage, keeps the full 1M context window, and costs $0.14 input / $0.28 output per 1M tokens. The next-cheapest US-hosted rival with a 1M window (Gemini 3.1 Flash-Lite) charges more than 5× as much on output ($1.50 vs. $0.28).
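To make the per-token rates concrete, here is a minimal sketch of the arithmetic in Python. The helper name and the example token counts are illustrative assumptions; only the $0.14 and $0.28 rates come from this page.

```python
# Rates quoted on this page, in USD per 1M tokens.
INPUT_RATE_PER_M = 0.14
OUTPUT_RATE_PER_M = 0.28


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the list-price cost of one call (hypothetical helper, not an SDK function)."""
    total = input_tokens * INPUT_RATE_PER_M + output_tokens * OUTPUT_RATE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Illustrative workload: an 800K-token prompt with a 4K-token answer.
    print(f"${request_cost_usd(800_000, 4_000):.4f}")
```

At these rates, a prompt that fills most of the 1M window costs roughly 11 cents, and the output side barely registers unless completions are long.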

As with its big sibling V4-Pro, the main considerations are non-technical: the hosted API routes through Chinese infrastructure, which is a hard blocker for some compliance regimes and the reason open-weights self-hosting is part of DeepSeek's pitch.

Capabilities

For the price class, capability is absurd: usable reasoning on math, code, and logic, a 1M window, and throughput suited to volume pipelines. Most tasks that teams route to mini-tier US models run fine here at a fraction of the cost.

The honest weakness: polish and ecosystem. Tooling, SDK maturity, rate-limit headroom, and English prose quality all trail the US providers, and the hardest reasoning belongs to V4-Pro or a frontier model.

Typical use cases

Absolute-cost-floor volume pipelines
Long-document processing on a budget (1M window)
Math/code/logic tasks too demanding for nano-tier models
Self-hosted deployments via open weights
Cheap tier under V4-Pro in a routing stack
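
One way to read the routing-stack use case is as a cascade: send every request to the cheap tier first and escalate only when a check rejects the answer. The sketch below is an assumption about how such a stack might look, not a description of any DeepSeek tooling. The `cheap` and `strong` callables stand in for calls to V4-Flash and V4-Pro, and `is_good_enough` is a hypothetical validator you would supply.

```python
from typing import Callable, Tuple


def cascade(
    prompt: str,
    cheap: Callable[[str], str],
    strong: Callable[[str], str],
    is_good_enough: Callable[[str], bool],
) -> Tuple[str, str]:
    """Try the cheap tier first; call the strong tier only if the check fails.

    Returns (tier_used, answer).
    """
    answer = cheap(prompt)
    if is_good_enough(answer):
        return "cheap", answer
    return "strong", strong(prompt)


if __name__ == "__main__":
    # Stub models for illustration only; real code would call an API here.
    tier, answer = cascade(
        "2 + 2 = ?",
        cheap=lambda p: "4",
        strong=lambda p: "4 (checked)",
        is_good_enough=lambda a: a.strip().isdigit(),
    )
    print(tier, answer)
```

The design choice is that the strong tier is paid for only on rejected requests, so the savings depend on how often the validator accepts the cheap answer.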

Sibling and rival comparison

Model	Input / 1M	Output / 1M	Context
DeepSeek V4-Flash	$0.14	$0.28	1M
DeepSeek V4-Pro	$0.435	$0.87	1M
Gemini 3.1 Flash-Lite	$0.25	$1.50	1M
GPT-5.4 nano	$0.05	$0.40	272K
Mistral Small 3.1	$0.20	$0.60	128K

Only GPT-5.4 nano beats it on input price, with roughly a quarter of the window (272K) and less reasoning. If Chinese hosting is acceptable (or you self-host), V4-Flash is the rational default for cheap volume; if not, Gemini 3.1 Flash-Lite and Mistral Small 3.1 are the compliant runners-up.
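
The comparison can be checked against a specific workload with a short Python sketch. The rates and windows are copied from the table above; the function name, the reading of 128K and 272K as 128,000 and 272,000 tokens, and the assumption that prompt plus completion must fit inside the context window are illustrative choices, not provider specifications.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    input_per_m: float   # USD per 1M input tokens
    output_per_m: float  # USD per 1M output tokens
    context: int         # context window in tokens


# Values copied from the comparison table.
MODELS = [
    Model("DeepSeek V4-Flash", 0.14, 0.28, 1_000_000),
    Model("DeepSeek V4-Pro", 0.435, 0.87, 1_000_000),
    Model("Gemini 3.1 Flash-Lite", 0.25, 1.50, 1_000_000),
    Model("GPT-5.4 nano", 0.05, 0.40, 272_000),
    Model("Mistral Small 3.1", 0.20, 0.60, 128_000),
]


def rank_by_cost(input_tokens, output_tokens, requests=1):
    """Return (name, total USD) pairs, cheapest first, for models whose window fits the request."""
    # Assumption: prompt and completion share one context window.
    fits = [m for m in MODELS if input_tokens + output_tokens <= m.context]
    costs = [
        (
            m.name,
            requests
            * (input_tokens * m.input_per_m + output_tokens * m.output_per_m)
            / 1_000_000,
        )
        for m in fits
    ]
    return sorted(costs, key=lambda pair: pair[1])


if __name__ == "__main__":
    for name, usd in rank_by_cost(500_000, 2_000, requests=1_000):
        print(f"{name}: ${usd:.2f}")
```

For 500K-token prompts only the three 1M-window models qualify and V4-Flash ranks first. For short, input-heavy prompts GPT-5.4 nano can come out cheaper because of its lower input rate, while output-heavy requests favor V4-Flash again.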
