
DeepSeek V4-Pro

by DeepSeek AI · Hangzhou, China · 2026 flagship

Top-tier reasoning at pennies per million tokens. The outlier in the frontier price chart.

Input $0.435 / 1M
Output $0.87 / 1M
Context 1M tokens
Weights Open
Updated May 20, 2026
§ API pricing

Per-token rates.

Input
$0.435/1M tokens
Prompt tokens
  • ~11× cheaper than GPT-5.5 input
  • ~8.7% of Opus 4.7's input rate
  • Chain-of-thought billed in output
Output
$0.87/1M tokens
Completion tokens
  • ~34× cheaper than GPT-5.5 output
  • ~29× cheaper than Opus 4.7 output
  • Includes reasoning trace tokens
Context
1M tokens
Window
  • Same window as GPT-5.5 and Gemini 3.1 Pro
  • No tiered pricing above 200K
  • Long context at long-context-friendly prices
Free chat
$0 at chat.deepseek.com
Consumer
  • V4-Pro is free in DeepSeek chat
  • No ads, no paywall, no daily cap
  • Web, iOS, Android

The price story

DeepSeek V4-Pro is the model that keeps the rest of the market honest. At $0.435 input and $0.87 output per 1M tokens, with a 1M context window and frontier-grade reasoning, it undercuts GPT-5.5 and Claude Opus 4.7 by roughly 11× on input and 29× to 34× on output. There is no asterisk: the published rates are real, the API is open, and DeepSeek hosts a free consumer chat using the same model.
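
How large the gap is in practice depends on the mix of prompt and completion tokens. The sketch below works that out from the V4-Pro and GPT-5.5 rates quoted on this page; the helper names `blended_rate` and `price_ratio` and the sample output shares are illustrative assumptions, not part of any vendor SDK.

```python
# USD per 1M tokens, as quoted on this page.
V4_PRO = {"input": 0.435, "output": 0.87}
GPT_55 = {"input": 5.00, "output": 30.00}

def blended_rate(rates: dict, output_share: float) -> float:
    """Average USD per 1M tokens when `output_share` of all tokens are output."""
    return (1 - output_share) * rates["input"] + output_share * rates["output"]

def price_ratio(output_share: float) -> float:
    """How many times more GPT-5.5 costs than V4-Pro for the same token mix."""
    return blended_rate(GPT_55, output_share) / blended_rate(V4_PRO, output_share)

# Hypothetical traffic mixes, from prompt-only to completion-only.
for share in (0.0, 0.2, 0.5, 1.0):
    print(f"output share {share:.0%}: {price_ratio(share):.1f}x")
```

The ratio runs from about 11.5× for prompt-only traffic to about 34.5× for completion-only traffic, which is where the ~11× and ~34× figures in the pricing cards come from. Output-heavy workloads see the larger saving.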

The honest counterweights: DeepSeek is a Chinese company hosting in China. For US and EU enterprises, data residency and procurement review are real obstacles. Open weights mitigate this — V4-Pro can be self-hosted by anyone with the GPUs — but the hosted API is the cheapest path and not everyone can use it.

Capabilities

V4-Pro is strongest at math, logic, and chain-of-thought reasoning, which has been DeepSeek's calling card since R1 in early 2025. Coding is competitive with GPT-5.4 on most public benchmarks, especially on algorithmic and competition-style problems. The model emits long reasoning traces by default — useful for inspection, expensive in output tokens if you forget to bound them.
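
Because reasoning-trace tokens are billed at the output rate, a hard cap on completion tokens translates directly into a worst-case spend per request. The following is a minimal sketch of that arithmetic, assuming the $0.87/1M output rate from this page. `max_output_cost` and `cap_for_budget` are hypothetical helper names, and how the cap is actually passed to the API (for example through a `max_tokens`-style request parameter) is an assumption to verify against DeepSeek's documentation.

```python
OUTPUT_RATE_USD_PER_M = 0.87  # V4-Pro output rate quoted on this page

def max_output_cost(max_output_tokens: int) -> float:
    """Worst-case output spend in USD for one request with the given token cap."""
    return max_output_tokens * OUTPUT_RATE_USD_PER_M / 1_000_000

def cap_for_budget(budget_usd: float) -> int:
    """Largest completion-token cap whose worst-case cost fits within the budget."""
    return int(budget_usd * 1_000_000 / OUTPUT_RATE_USD_PER_M)

print(max_output_cost(32_000))  # worst case for a hypothetical 32K-token cap
print(cap_for_budget(0.01))     # cap that keeps output spend under one cent
```

Even an uncapped-looking 32K-token trace costs under three cents at this rate, so the bound matters mostly at high request volume.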

Where it trails the western frontier: nuanced writing voice, tool-use polish, and instruction-following on edge cases. The developer experience around the API (rate limits, observability, SLAs) is also less mature than at OpenAI or Anthropic.

Typical use cases

  • Math, logic, and quantitative reasoning at scale
  • Code generation and competitive-programming-style tasks
  • Cost-sensitive RAG and document QA at 1M context
  • Open-weight deployments behind a corporate firewall
  • Experimentation, prototyping, and personal projects

Sibling and rival comparison

Model               Input / 1M   Output / 1M   Context
DeepSeek V4-Pro     $0.435       $0.87         1M
DeepSeek V4-Flash   $0.14        $0.28         1M
Gemini 3.5 Flash    $1.50        $9            1M
GPT-5.5             $5           $30           1M

V4-Flash undercuts even V4-Pro for routine work. After I/O 2026, Gemini 3.5 Flash is no longer in the same budget tier as V4-Pro; the closest non-Chinese rivals on price are now Gemini 3.1 Flash-Lite at $0.25/$1.50 and GPT-5.4 mini at $0.25/$2 (input/output per 1M tokens). Against GPT-5.5, V4-Pro costs roughly 11× less on input and 34× less on output at the same 1M context.
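
To make the table concrete, the sketch below prices one hypothetical long-context job (800K prompt tokens, 8K completion tokens) against each row. The rates are the ones listed in the table; the workload size and the `job_cost` helper are assumptions chosen for illustration. It also assumes flat per-token rates for every model, which this page states only for V4-Pro.

```python
# (input, output) in USD per 1M tokens, copied from the comparison table.
TABLE = {
    "DeepSeek V4-Pro": (0.435, 0.87),
    "DeepSeek V4-Flash": (0.14, 0.28),
    "Gemini 3.5 Flash": (1.50, 9.00),
    "GPT-5.5": (5.00, 30.00),
}

def job_cost(rates: tuple, input_tokens: int, output_tokens: int) -> float:
    """USD cost of one job at flat per-token rates (assumed, no tiering)."""
    in_rate, out_rate = rates
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# Hypothetical document-QA job that nearly fills the 1M window.
costs = {name: job_cost(r, 800_000, 8_000) for name, r in TABLE.items()}

# Print cheapest first.
for name, usd in sorted(costs.items(), key=lambda kv: kv[1]):
    print(f"{name:<18} ${usd:.4f}")
```

Under these assumptions the job costs about $0.11 on V4-Flash, $0.35 on V4-Pro, $1.27 on Gemini 3.5 Flash, and $4.24 on GPT-5.5.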
