Gemini 3.1 Flash-Lite


by Google · budget tier · Gemini 3 family (superseded)

Google's earlier cheap-volume tier. Now superseded by Gemini 3.5 Flash-Lite (July 21, 2026), which brings significantly better quality at ~350 tokens/second.

Input $0.25 / 1M

Output $1.50 / 1M

Context 1M tokens

Family Gemini 3

Google AI Studio · Updated July 23, 2026

Gemini 3.1 Flash-Lite has been superseded by Gemini 3.5 Flash-Lite, released July 21, 2026 with significantly better quality at ~350 tokens/second. This page is kept for reference.

§ API pricing

Per-token rates.

Input

$0.25/1M tokens

Prompt tokens

A sixth of 3.5 Flash's rate
Matches GPT-5.4 mini
Multimodal inputs billed as tokens

Output

$1.50/1M tokens

Completion tokens

A sixth of 3.5 Flash's $9
Under GPT-5.4 mini's $2
Cheapest current-gen Gemini

Context

1M tokens

Window

Full-size window at budget price
About 3.7× GPT-5.4 mini's 272K
Whole codebases in one call

Legacy

$0.10 / $0.40 · Gemini 2.5 Flash-Lite

Older generation

2.5 Flash-Lite still sold, cheaper
Weaker model, same 1M window
Pick it only on pure price

Why Flash-Lite exists

Flash-Lite is where Google sends the volume work. After the 3.5 Flash reprice ($1.50/$9 — no longer a budget product), the Lite tier became the actual cheap Gemini: $0.25 input and $1.50 output per 1M tokens, with the full 1M context window and multimodal input intact. That window is the differentiator — every rival in this price class except DeepSeek cuts context to save cost; Flash-Lite doesn't.

One generation note, since the price list still carries both: the older Gemini 2.5 Flash-Lite remains available at $0.10/$0.40. That is 40% of 3.1 Flash-Lite's input price and about 27% of its output price, for a noticeably weaker model. If your task is trivial and purely cost-driven, the 2.5 generation is the cheaper button; for anything where quality registers, 3.1 Flash-Lite is worth the difference.
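To put numbers on that gap, here is a minimal Python sketch of the arithmetic using the per-1M rates listed on this page. The workload (1M requests at 2,000 input and 300 output tokens each) and the function name are illustrative assumptions, and the sketch ignores any caching, batch, or free-tier discounts.

```python
# USD per 1M tokens as (input, output), copied from the rates on this page.
# The keys are display names, not API model identifiers.
RATES = {
    "Gemini 3.1 Flash-Lite": (0.25, 1.50),
    "Gemini 2.5 Flash-Lite": (0.10, 0.40),
}


def workload_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a workload at flat per-token rates."""
    in_rate, out_rate = RATES[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 1M requests, 2,000 input + 300 output tokens each.
    requests = 1_000_000
    in_total = requests * 2_000
    out_total = requests * 300
    for model in RATES:
        print(f"{model}: ${workload_cost(model, in_total, out_total):,.2f}")
```

Under these assumptions the workload comes to $950 on 3.1 Flash-Lite and $320 on 2.5 Flash-Lite, so the question is whether the quality difference is worth roughly $630 per million requests of this shape.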

Capabilities

Flash-Lite handles text, images, audio, and video input — the full Gemini multimodal surface — at a price where rivals are text-first. It's the right default for high-volume summarization, media tagging, and document pipelines on Google infrastructure.

The honest weakness: reasoning depth. It's a volume model, not a thinking model — escalate to 3.5 Flash or 3.1 Pro when tasks need multi-step logic.

Typical use cases

High-volume multimodal tagging: images, audio, video
Summarization pipelines over long documents (1M window)
Free-tier and high-traffic chat products
Classification and routing on Google Cloud stacks
Bulk transcription cleanup and formatting

Sibling and rival comparison

Model	Input / 1M	Output / 1M	Context
Gemini 3.1 Flash-Lite	$0.25	$1.50	1M
Gemini 2.5 Flash-Lite (legacy)	$0.10	$0.40	1M
Gemini 3.5 Flash	$1.50	$9	1M
GPT-5.4 mini	$0.25	$2	272K
DeepSeek V4-Flash	$0.14	$0.28	1M

Against GPT-5.4 mini it's the same input price with cheaper output ($1.50 vs. $2) and nearly 4× the window. DeepSeek V4-Flash is cheaper still if Chinese-hosted infrastructure isn't a blocker. And if every cent counts, the legacy 2.5 Flash-Lite at $0.10/$0.40 is still the lowest Gemini sticker; just know you're buying the previous generation.
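The comparison can be sketched as a small ranking helper: filter out models whose window cannot hold the call, then sort the rest by per-call cost. This is a rough sketch built only from the table's figures. It assumes "1M" and "272K" mean 1,000,000 and 272,000 tokens, that prompt plus output must fit in the window, and that rates are flat; the function and type names are hypothetical.

```python
from typing import NamedTuple


class Model(NamedTuple):
    name: str
    input_rate: float   # USD per 1M input tokens
    output_rate: float  # USD per 1M output tokens
    context: int        # window in tokens (assumed: 1M = 1,000,000)


# Figures copied from the comparison table on this page.
MODELS = [
    Model("Gemini 3.1 Flash-Lite", 0.25, 1.50, 1_000_000),
    Model("Gemini 2.5 Flash-Lite (legacy)", 0.10, 0.40, 1_000_000),
    Model("Gemini 3.5 Flash", 1.50, 9.00, 1_000_000),
    Model("GPT-5.4 mini", 0.25, 2.00, 272_000),
    Model("DeepSeek V4-Flash", 0.14, 0.28, 1_000_000),
]


def rank_by_cost(input_tokens: int, output_tokens: int) -> list[tuple[str, float]]:
    """Return (name, USD cost) pairs, cheapest first, for models whose window fits the call."""
    fits = [m for m in MODELS if input_tokens + output_tokens <= m.context]
    costs = [
        (m.name, (input_tokens * m.input_rate + output_tokens * m.output_rate) / 1_000_000)
        for m in fits
    ]
    return sorted(costs, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Hypothetical long-document call: 400K-token prompt, 2K-token summary.
    for name, cost in rank_by_cost(400_000, 2_000):
        print(f"{name}: ${cost:.4f}")
```

For the 400K-token example, GPT-5.4 mini drops out because the prompt exceeds its 272K window, and the remaining models rank 2.5 Flash-Lite, DeepSeek V4-Flash, 3.1 Flash-Lite, then 3.5 Flash. Price alone says nothing about output quality, which the ranking deliberately leaves out.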
