
Gemini 3.1 Flash-Lite

by Google · budget tier · Gemini 3 family

Google's cheap-volume tier: multimodal, 1M context, a sixth of 3.5 Flash's output price.

Input $0.25 / 1M
Output $1.50 / 1M
Context 1M tokens
Family Gemini 3
Google AI Studio · Updated June 10, 2026
§ API pricing

Per-token rates.

Input
$0.25/1M tokens
Prompt tokens
  • A sixth of 3.5 Flash's rate
  • Matches GPT-5.4 mini
  • Multimodal inputs billed as tokens
Output
$1.50/1M tokens
Completion tokens
  • A sixth of 3.5 Flash's $9
  • Under GPT-5.4 mini's $2
  • Cheapest current-gen Gemini
Context
1M tokens
Window
  • Full-size window at budget price
  • About 3.7× GPT-5.4 mini's 272K
  • Whole codebases in one call
Legacy
$0.10 / $0.40 Gemini 2.5 Flash-Lite
Older generation
  • 2.5 Flash-Lite still sold, cheaper
  • Weaker model, same 1M window
  • Pick it only on pure price
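
The per-token rates above translate into a bill with simple arithmetic. The following is a minimal sketch of that calculation, not an official billing formula: the `Rates` class, the `estimate_cost` function, and the example workload are hypothetical names and numbers chosen for illustration, and the sketch ignores any discounts, caching, or tiered pricing a provider might apply.

```python
from dataclasses import dataclass

TOKENS_PER_MILLION = 1_000_000


@dataclass(frozen=True)
class Rates:
    """Per-1M-token prices in USD (hypothetical helper type)."""
    input_per_million: float
    output_per_million: float


# Rates copied from the pricing cards above.
FLASH_LITE_3_1 = Rates(input_per_million=0.25, output_per_million=1.50)


def estimate_cost(rates: Rates, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost for the given token counts."""
    input_cost = input_tokens / TOKENS_PER_MILLION * rates.input_per_million
    output_cost = output_tokens / TOKENS_PER_MILLION * rates.output_per_million
    return input_cost + output_cost


# Assumed workload: 2M prompt tokens in, 500K completion tokens out.
cost = estimate_cost(FLASH_LITE_3_1, 2_000_000, 500_000)
print(f"${cost:.2f}")  # prints "$1.25"
```

Output tokens cost six times as much as input tokens at these rates, so workloads with long completions are dominated by the output side of the sum.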

Why Flash-Lite exists

Flash-Lite is where Google sends the volume work. After the 3.5 Flash reprice ($1.50/$9 — no longer a budget product), the Lite tier became the actual cheap Gemini: $0.25 input and $1.50 output per 1M tokens, with the full 1M context window and multimodal input intact. That window is the differentiator — every rival in this price class except DeepSeek cuts context to save cost; Flash-Lite doesn't.

One generation note, since the price list still carries both: the older Gemini 2.5 Flash-Lite remains available at $0.10/$0.40. That is 40% of 3.1 Flash-Lite's input price and about 27% of its output price, for a noticeably weaker model. If your task is trivial and purely cost-driven, the 2.5 generation is the cheaper button; for anything where quality registers, the current generation is worth the difference.

Capabilities

Flash-Lite handles text, images, audio, and video input — the full Gemini multimodal surface — at a price where rivals are text-first. It's the right default for high-volume summarization, media tagging, and document pipelines on Google infrastructure.

The honest weakness: reasoning depth. It's a volume model, not a thinking model — escalate to 3.5 Flash or 3.1 Pro when tasks need multi-step logic.
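
One way to act on that escalation advice is a small router in front of the model call. The sketch below is an assumption-heavy illustration: the `Task` fields, the `VOLUME_KINDS` set, and the step threshold are all hypothetical, and the model names are display labels from this page, not verified API identifiers.

```python
from dataclasses import dataclass

# Display names from this page, used as plain labels (not API model IDs).
VOLUME_MODEL = "Gemini 3.1 Flash-Lite"
ESCALATION_MODEL = "Gemini 3.5 Flash"

# Hypothetical task kinds treated as volume work.
VOLUME_KINDS = {"summarize", "tag", "classify", "format"}


@dataclass(frozen=True)
class Task:
    kind: str             # e.g. "summarize", "tag", "plan"
    reasoning_steps: int  # caller's rough estimate of dependent steps


def pick_model(task: Task, max_steps_for_lite: int = 2) -> str:
    """Keep shallow volume work on the cheap tier; escalate everything else."""
    if task.kind in VOLUME_KINDS and task.reasoning_steps <= max_steps_for_lite:
        return VOLUME_MODEL
    return ESCALATION_MODEL


print(pick_model(Task(kind="tag", reasoning_steps=1)))
print(pick_model(Task(kind="plan", reasoning_steps=6)))
```

The threshold is a tuning knob rather than a measured cutoff; in practice it would be set by checking Flash-Lite's output quality on a sample of real tasks.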

Typical use cases

  • High-volume multimodal tagging: images, audio, video
  • Summarization pipelines over long documents (1M window)
  • Free-tier and high-traffic chat products
  • Classification and routing on Google Cloud stacks
  • Bulk transcription cleanup and formatting

Sibling and rival comparison

| Model | Input / 1M | Output / 1M | Context |
|---|---|---|---|
| Gemini 3.1 Flash-Lite | $0.25 | $1.50 | 1M |
| Gemini 2.5 Flash-Lite (legacy) | $0.10 | $0.40 | 1M |
| Gemini 3.5 Flash | $1.50 | $9 | 1M |
| GPT-5.4 mini | $0.25 | $2 | 272K |
| DeepSeek V4-Flash | $0.14 | $0.28 | 1M |

Against GPT-5.4 mini it's the same input price with cheaper output and about 3.7× the window. DeepSeek V4-Flash is cheaper still if Chinese-hosted infrastructure isn't a blocker. And if every cent counts, the legacy 2.5 Flash-Lite at $0.10/$0.40 is still the lowest Gemini sticker; just know you're buying the previous generation.
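
To see how the comparison plays out for a concrete job, the sketch below ranks the five models by cost for a single long prompt and drops any model whose window cannot hold it. It rests on simplifying assumptions: "1M" is treated as 1,000,000 tokens, "272K" as 272,000, the window is checked against prompt tokens only, and the `Model` tuple, `rank_models` function, and example workload are hypothetical.

```python
from typing import NamedTuple

TOKENS_PER_MILLION = 1_000_000


class Model(NamedTuple):
    name: str
    input_per_million: float   # USD per 1M prompt tokens
    output_per_million: float  # USD per 1M completion tokens
    context_tokens: int        # assumed window size in tokens


# Prices and windows copied from the comparison table above.
MODELS = [
    Model("Gemini 3.1 Flash-Lite", 0.25, 1.50, 1_000_000),
    Model("Gemini 2.5 Flash-Lite (legacy)", 0.10, 0.40, 1_000_000),
    Model("Gemini 3.5 Flash", 1.50, 9.00, 1_000_000),
    Model("GPT-5.4 mini", 0.25, 2.00, 272_000),
    Model("DeepSeek V4-Flash", 0.14, 0.28, 1_000_000),
]


def request_cost(model: Model, prompt_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request at the model's listed rates."""
    return (prompt_tokens * model.input_per_million
            + output_tokens * model.output_per_million) / TOKENS_PER_MILLION


def rank_models(prompt_tokens: int, output_tokens: int) -> list[tuple[str, float]]:
    """Return (name, cost) pairs, cheapest first, for models whose window fits."""
    fitting = [m for m in MODELS if prompt_tokens <= m.context_tokens]
    priced = [(m.name, request_cost(m, prompt_tokens, output_tokens)) for m in fitting]
    return sorted(priced, key=lambda pair: pair[1])


# Assumed workload: one 400K-token document summarized into 4K tokens.
for name, cost in rank_models(400_000, 4_000):
    print(f"{name}: ${cost:.4f}")
```

For this assumed workload GPT-5.4 mini drops out because the prompt exceeds its window, and the remaining order follows sticker price: the legacy 2.5 Flash-Lite first, then DeepSeek V4-Flash, then 3.1 Flash-Lite, then 3.5 Flash. Cost alone says nothing about output quality, which is the reason to prefer the current generation.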
