
Gemini 3.1 Pro

by Google DeepMind · frontier multimodal · included in AI Pro ($19.99/mo)

Frontier multimodal at a discount. Cheapest top-tier model for requests under 200K tokens; higher rates apply above that.

Input $2 / 1M
Output $12 / 1M
Context 1M tokens
Above 200K $4 / $18
Google AI Studio · Updated May 20, 2026
§ API pricing

Per-token rates, with a tier break at 200K.

Input ≤200K
$2/1M tokens
Standard tier
  • Cheapest frontier-class input
  • 40% of Opus 4.7's input rate
  • Vision and audio billed as input tokens
Output ≤200K
$12/1M tokens
Standard tier
  • Cheapest frontier-class output
  • 40% of GPT-5.5's output rate
  • Includes thinking tokens
Above 200K
$4/$18 per 1M
Long-context tier
  • Kicks in once your prompt crosses 200K
  • Applies to the full request, not the overage
  • Still cheaper than GPT-5.5 at length
Context
1M tokens
Window
  • Native multimodal: text, image, audio, video
  • 2.5 Pro variant pushes to 2M
  • Strong long-context recall in Google evals

Why Gemini 3.1 Pro is the value play

For short and medium prompts, Gemini 3.1 Pro is the cheapest way to access frontier-class quality. At $2 input and $12 output per 1M tokens, it undercuts GPT-5.5 by roughly 60% and Opus 4.7 by about half. That price-to-quality ratio is why it sits inside the AI Pro consumer plan at $19.99/mo and why it shows up so often in cost-sensitive production stacks.

The catch is the tier break. Once your prompt crosses 200K tokens, billing jumps to $4 input and $18 output per 1M for the entire call, not just the tokens past the threshold. That still beats GPT-5.5's flat $5/$30, but the discount shrinks from 60% to 20% on input and 40% on output, so the gap narrows if you do a lot of long-context work. Keep prompts under 200K when you can: a request just over the line pays double the input rate on every token.
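To make the tier break concrete, here is a minimal Python sketch of a per-request cost estimate. It uses only the rates listed on this page. The function name `estimate_cost` and the constants are illustrative, and the sketch assumes the tier is selected by prompt size alone, with the higher rates then applied to every token in the request.

```python
# Rates in USD per 1M tokens, taken from the pricing listed on this page.
TIER_THRESHOLD = 200_000      # prompt tokens; at or below this, standard rates apply

STANDARD = (2.00, 12.00)      # (input, output) for prompts <= 200K
LONG_CONTEXT = (4.00, 18.00)  # (input, output) for prompts > 200K


def estimate_cost(prompt_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of a single request.

    Assumption: the tier is chosen by prompt size, and the long-context
    rates apply to the whole request rather than only the overage.
    """
    if prompt_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if prompt_tokens > TIER_THRESHOLD:
        in_rate, out_rate = LONG_CONTEXT
    else:
        in_rate, out_rate = STANDARD
    return (prompt_tokens * in_rate + output_tokens * out_rate) / 1_000_000


if __name__ == "__main__":
    # Two nearly identical requests on either side of the threshold.
    for prompt in (200_000, 200_001):
        print(prompt, f"${estimate_cost(prompt, 1_000):.4f}")
```

Under these assumptions, a 200,000-token prompt with 1,000 output tokens comes to about $0.41, while adding a single prompt token pushes the same request to about $0.82. Actual invoices may differ, for example if cached or batched requests are priced separately.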

Capabilities

Gemini 3.1 Pro is natively multimodal in a way the GPT and Claude families are not: it accepts audio and video input directly, alongside text and images. That makes it the default choice for transcription pipelines, video analysis, and any workflow that wants one model to read everything in a folder. Reasoning is competitive with GPT-5.4 on most benchmarks and improving fast.

The weakness is consistency. Gemini still shows more call-to-call variance than Opus or GPT-5.5, and its tool-use experience is less mature. For mission-critical agent loops, most teams pair Gemini with a strict validator that checks each output before acting on it.
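One way to picture the validator pattern is a retry loop that refuses to pass a reply downstream until it parses and matches an expected shape. The Python sketch below is an illustration only: `call_model` stands in for whatever client function your stack uses, and the `tool` / `arguments` schema is a hypothetical example, not a Gemini API format.

```python
import json
from typing import Callable

# Hypothetical tool-call shape used for illustration only.
REQUIRED_KEYS = {"tool", "arguments"}


def validate_tool_call(raw: str) -> dict:
    """Parse a model reply and check it against the expected shape."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    if not isinstance(data["arguments"], dict):
        raise ValueError("'arguments' must be a JSON object")
    return data


def call_with_validation(
    call_model: Callable[[str], str],  # placeholder for your own client call
    prompt: str,
    max_attempts: int = 3,
) -> dict:
    """Call the model until a reply passes validation or attempts run out."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return validate_tool_call(call_model(prompt))
        except ValueError as exc:
            last_error = exc  # remember why the reply was rejected, then retry
    raise RuntimeError(f"no valid reply after {max_attempts} attempts") from last_error
```

Retrying on a rejected reply trades extra tokens for predictability, so it is worth capping `max_attempts` and logging the rejected replies to see how often the validator actually fires.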

Typical use cases

  • Cost-sensitive production chat and content generation
  • Video and audio analysis (native multimodal input)
  • Long-document workflows up to 1M tokens
  • Workspace-adjacent automations (Gmail, Docs, Drive)
  • NotebookLM-style research synthesis

Sibling and rival comparison

Model                     Input / 1M   Output / 1M   Context
Gemini 3.1 Pro (≤200K)    $2           $12           1M
Gemini 3.1 Pro (>200K)    $4           $18           1M
Gemini 3.5 Flash          $1.50        $9            1M
GPT-5.5                   $5           $30           1M

Versus GPT-5.5 (the closest cross-family flagship), Gemini 3.1 Pro is cheaper at every length: 60% lower on both input and output at or below 200K, and 20% lower on input and 40% lower on output above it. Versus its own Flash sibling, the gap closed after I/O 2026: Pro is only about 33% more expensive than 3.5 Flash, and Flash actually beats Pro on coding. Reach for Pro when you need the hardest reasoning or long-horizon planning, not for code.
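The percentages in this comparison follow directly from the table. As a quick check, the Python sketch below recomputes them from the listed rates; the `RATES` dictionary and the `percent_difference` helper are illustrative names, and the figures are copied from the table above rather than fetched from any API.

```python
# (input, output) in USD per 1M tokens, copied from the comparison table.
RATES = {
    "Gemini 3.1 Pro (<=200K)": (2.00, 12.00),
    "Gemini 3.1 Pro (>200K)": (4.00, 18.00),
    "Gemini 3.5 Flash": (1.50, 9.00),
    "GPT-5.5": (5.00, 30.00),
}


def percent_difference(model: str, baseline: str) -> tuple[float, float]:
    """Return how much more (+) or less (-) `model` costs than `baseline`.

    The result is a pair of percentages: (input, output).
    """
    m_in, m_out = RATES[model]
    b_in, b_out = RATES[baseline]
    return (100 * (m_in - b_in) / b_in, 100 * (m_out - b_out) / b_out)


if __name__ == "__main__":
    pairs = [
        ("Gemini 3.1 Pro (<=200K)", "GPT-5.5"),
        ("Gemini 3.1 Pro (>200K)", "GPT-5.5"),
        ("Gemini 3.1 Pro (<=200K)", "Gemini 3.5 Flash"),
    ]
    for model, baseline in pairs:
        d_in, d_out = percent_difference(model, baseline)
        print(f"{model} vs {baseline}: input {d_in:+.0f}%, output {d_out:+.0f}%")
```

These are list-price ratios only. A real workload's savings depend on its mix of input and output tokens and on how often prompts cross the 200K threshold.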
