Top-tier reasoning at pennies per million tokens. The outlier in the frontier price chart.
DeepSeek V4-Pro is the model that keeps the rest of the market honest. At $0.435 input and $0.87 output per 1M tokens, with a 1M context window and frontier-grade reasoning, it sits roughly an order of magnitude under what GPT-5.5 and Claude Opus 4.7 charge; against GPT-5.5's $5 input and $30 output, that works out to about 11x cheaper on input and about 34x cheaper on output. There is no asterisk: the published rates are real, the API is open, and DeepSeek hosts a free consumer chat using the same model.
The honest counterweight: DeepSeek is a Chinese company hosting in China. For US and EU enterprises, data residency and procurement review are real obstacles. Open weights mitigate this, since V4-Pro can be self-hosted by anyone with the GPUs, but the hosted API is the cheapest path and not everyone can use it.
V4-Pro is strongest at math, logic, and chain-of-thought reasoning, which has been DeepSeek's calling card since R1 in early 2025. Coding is competitive with GPT-5.4 on most public benchmarks, especially on algorithmic and competition-style problems. The model emits long reasoning traces by default — useful for inspection, expensive in output tokens if you forget to bound them.
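To make the cost of unbounded reasoning traces concrete, here is a rough back-of-the-envelope sketch. Only the $0.87 output rate comes from the pricing table; the request count and the trace lengths (20,000 tokens uncapped versus a 4,000-token cap) are illustrative assumptions, not measured figures for V4-Pro.

```python
# Output price for V4-Pro, taken from the pricing table: $0.87 per 1M tokens.
OUTPUT_PRICE_PER_M = 0.87


def output_cost(tokens_per_request: int, requests: int,
                price_per_m: float = OUTPUT_PRICE_PER_M) -> float:
    """Dollar cost of generated tokens across a batch of requests."""
    return tokens_per_request * requests * price_per_m / 1_000_000


# Hypothetical workload: these three numbers are assumptions for illustration.
REQUESTS = 100_000
UNCAPPED_TOKENS = 20_000   # assumed average length of an unbounded trace
CAPPED_TOKENS = 4_000      # assumed output-token cap per request

unbounded = output_cost(UNCAPPED_TOKENS, REQUESTS)
capped = output_cost(CAPPED_TOKENS, REQUESTS)

print(f"uncapped: ${unbounded:,.2f}")           # uncapped: $1,740.00
print(f"capped:   ${capped:,.2f}")              # capped:   $348.00
print(f"saved:    ${unbounded - capped:,.2f}")  # saved:    $1,392.00
```

Even at these prices, the gap scales linearly with trace length, so an output-token limit is worth setting deliberately rather than leaving at the default.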
Where it trails the western frontier: nuanced writing voice, tool-use polish, and instruction-following on edge cases. The developer experience around the API (rate limits, observability, SLAs) is also less mature than what OpenAI or Anthropic offer.
| Model | Input / 1M | Output / 1M | Context |
|---|---|---|---|
| DeepSeek V4-Pro | $0.435 | $0.87 | 1M |
| DeepSeek V4-Flash | $0.14 | $0.28 | 1M |
| Gemini 3.5 Flash | $1.50 | $9 | 1M |
| GPT-5.5 | $5 | $30 | 1M |
V4-Flash undercuts even V4-Pro for routine work. After I/O 2026, Gemini 3.5 Flash is no longer in the same budget tier as V4-Pro; the closest non-Chinese rivals on price are now Gemini 3.1 Flash-Lite at $0.25 input / $1.50 output and GPT-5.4 mini at $0.25 input / $2 output per 1M tokens. Set against GPT-5.5, V4-Pro is by a wide margin the cheaper 1M-context reasoning option in this comparison.
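The table is easier to read as a bill for a concrete job. The sketch below prices one workload across the four models listed; the per-token rates are copied from the table, while the workload itself (50M input tokens, 10M output tokens) is a made-up example, and real bills may differ with caching, batching, or tiered pricing that this section does not cover.

```python
# (input $/1M, output $/1M), copied from the pricing table above.
PRICES = {
    "DeepSeek V4-Pro": (0.435, 0.87),
    "DeepSeek V4-Flash": (0.14, 0.28),
    "Gemini 3.5 Flash": (1.50, 9.00),
    "GPT-5.5": (5.00, 30.00),
}


def job_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of a job at the listed per-1M-token rates."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


# Hypothetical workload, chosen only for illustration.
INPUT_TOKENS = 50_000_000
OUTPUT_TOKENS = 10_000_000

costs = {m: job_cost(m, INPUT_TOKENS, OUTPUT_TOKENS) for m in PRICES}
baseline = costs["DeepSeek V4-Pro"]

# Print cheapest first, with each cost as a multiple of V4-Pro's.
for model, cost in sorted(costs.items(), key=lambda kv: kv[1]):
    print(f"{model:<18} ${cost:>8.2f}  ({cost / baseline:.1f}x V4-Pro)")
```

On this mix, V4-Pro comes to $30.45 against $550.00 for GPT-5.5, about an 18x gap. The multiple moves with the input/output ratio, because the output price gap (about 34x) is wider than the input price gap (about 11x).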