
Gemini 3.1 Pro

by Google DeepMind · frontier multimodal · in AI Pro $19.99

Frontier multimodal at a discount. The cheapest top-tier model for requests under 200K tokens; a higher rate tier applies above that.

Input $2 / 1M

Output $12 / 1M

Context 1M tokens

Above 200K $4 / $18

Source: Google AI Studio · Updated July 9, 2026

§ API pricing

Per-token rates, with a tier break at 200K.

Input ≤200K

$2/1M tokens

Standard tier

Cheapest frontier-class input
40% of Opus 4.8's input rate
Vision and audio billed as input tokens

Output ≤200K

$12/1M tokens

Standard tier

Cheapest frontier-class output
40% of GPT-5.5 output
Includes thinking tokens

Above 200K

$4/$18 per 1M

Long-context tier

Kicks in once your prompt crosses 200K
Applies to the full request, not the overage
Still cheaper than GPT-5.5 at length

Context

1M tokens

Window

Native multimodal: text, image, audio, video
Legacy Gemini 2.5 Pro: 2M window at $1.25/$10
Strong long-context recall in Google evals

Why Gemini 3.1 Pro is the value play

For short and medium prompts, Gemini 3.1 Pro is the cheapest way to access frontier-class quality. At $2 input and $12 output per 1M tokens, it undercuts GPT-5.5 by roughly 60% and Opus 4.8 by about half. That price-to-quality ratio is why it sits inside the AI Pro consumer plan at $19.99/mo and why it shows up so often in cost-sensitive production stacks.

The catch is the tier break. Once your prompt crosses 200K tokens, billing jumps to $4 input and $18 output per 1M for the entire call, not just the tokens past the threshold. That still beats GPT-5.5's flat $5/$30, but the gap narrows if you do a lot of long-context work. Keep prompts under 200K when you can: doing so halves the input rate and cuts the output rate by a third.
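To make the tier break concrete, here is a minimal Python sketch of a per-request cost estimate using the rates listed on this page. The function name `estimate_cost_usd` and the constants are illustrative, not part of any Google SDK. It assumes the threshold is measured on prompt (input) tokens only and that exactly 200K still bills at the standard tier, which is how the "≤200K" labels read; check the official billing docs before relying on either assumption.

```python
from decimal import Decimal

# Rates from this page, in USD per 1M tokens: (input, output).
STANDARD_RATES = (Decimal("2"), Decimal("12"))
LONG_CONTEXT_RATES = (Decimal("4"), Decimal("18"))

# Assumption: the tier is chosen by prompt size alone, and a prompt of
# exactly 200,000 tokens still counts as standard tier.
TIER_BREAK_TOKENS = 200_000
PER_MILLION = Decimal(1_000_000)


def estimate_cost_usd(prompt_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost of one request (hypothetical helper).

    Output tokens include thinking tokens, per the pricing card.
    """
    if prompt_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")

    # The long-context rate applies to the whole request, not the overage.
    if prompt_tokens > TIER_BREAK_TOKENS:
        input_rate, output_rate = LONG_CONTEXT_RATES
    else:
        input_rate, output_rate = STANDARD_RATES

    return (prompt_tokens * input_rate + output_tokens * output_rate) / PER_MILLION


if __name__ == "__main__":
    print(estimate_cost_usd(200_000, 1_000))  # standard tier
    print(estimate_cost_usd(200_001, 1_000))  # long-context tier
```

Under these assumptions, a 200,000-token prompt with 1,000 output tokens comes to $0.412, while adding a single prompt token pushes the same call to about $0.818, because every token in the request is re-rated.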

One generation note, since both remain on the price list: the older Gemini 2.5 Pro is still sold at $1.25/$10 per 1M with a 2M-token window — cheaper than 3.1 Pro and with twice the context, but a generation behind on reasoning and multimodal quality. Pick it only when raw window size matters more than model quality; otherwise the current generation is the better spend.

Capabilities

Gemini 3.1 Pro is natively multimodal in a way the GPT and Claude families are not — it handles audio and video input as first-class citizens, alongside text and images. That makes it the default choice for transcription pipelines, video analysis, and any workflow that wants one model to read everything in a folder. Reasoning is competitive with GPT-5.4 on most benchmarks and improving fast.

The weakness is consistency. Gemini still shows more variance call-to-call than Opus or GPT-5.5, and the UX around tool use is less mature. For mission-critical agent loops, most teams pair Gemini with a stricter validator.

Typical use cases

Cost-sensitive production chat and content generation
Video and audio analysis (native multimodal input)
Long-document workflows up to 1M tokens
Workspace-adjacent automations (Gmail, Docs, Drive)
NotebookLM-style research synthesis

Sibling and rival comparison

Model	Input / 1M	Output / 1M	Context
Gemini 3.1 Pro (≤200K)	$2	$12	1M
Gemini 3.1 Pro (>200K)	$4	$18	1M
Gemini 3.5 Flash	$1.50	$9	1M
GPT-5.5	$5	$30	1M

Versus GPT-5.5 (the closest cross-family flagship), Gemini 3.1 Pro is meaningfully cheaper at every length. Versus its own Flash sibling, the gap closed after I/O 2026: Pro is only about 33% more expensive than 3.5 Flash, and Flash actually beats Pro on coding — so reach for Pro when you need the hardest reasoning or long-horizon planning, not for code.
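The percentages quoted above can be reproduced from the table with a few lines of Python. This is only a sketch of the arithmetic: the `PRICES` dictionary copies the table's figures, and the helper names are made up for illustration.

```python
# Prices from the comparison table, in USD per 1M tokens: (input, output).
PRICES = {
    "Gemini 3.1 Pro (<=200K)": (2.00, 12.00),
    "Gemini 3.1 Pro (>200K)": (4.00, 18.00),
    "Gemini 3.5 Flash": (1.50, 9.00),
    "GPT-5.5": (5.00, 30.00),
}


def percent_difference(price: float, baseline: float) -> float:
    """Signed difference of price relative to baseline, in percent."""
    return (price - baseline) / baseline * 100


def compare(model: str, baseline: str) -> tuple[float, float]:
    """Return (input %, output %) of model relative to baseline."""
    model_in, model_out = PRICES[model]
    base_in, base_out = PRICES[baseline]
    return (
        round(percent_difference(model_in, base_in), 1),
        round(percent_difference(model_out, base_out), 1),
    )


if __name__ == "__main__":
    for model, baseline in [
        ("Gemini 3.1 Pro (<=200K)", "GPT-5.5"),
        ("Gemini 3.1 Pro (>200K)", "GPT-5.5"),
        ("Gemini 3.1 Pro (<=200K)", "Gemini 3.5 Flash"),
    ]:
        print(model, "vs", baseline, compare(model, baseline))
```

Negative values mean the first model is cheaper. With the table's numbers, standard-tier Pro is 60% below GPT-5.5 on both input and output, long-context Pro is 20% below on input and 40% below on output, and Pro is about 33% above 3.5 Flash on both.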
