Google's cheap-volume tier: multimodal, 1M context, a sixth of 3.5 Flash's output price.
Flash-Lite is where Google sends the volume work. After the 3.5 Flash reprice to $1.50/$9 took that model out of budget territory, Gemini 3.1 Flash-Lite became the actual cheap Gemini: $0.25 input and $1.50 output per 1M tokens, with the full 1M context window and multimodal input intact. That window is the differentiator: in this price class, every rival except DeepSeek cuts context to save cost; Flash-Lite doesn't.
One generation note, since the price list still carries both: the older Gemini 2.5 Flash-Lite remains available at $0.10/$0.40. That is 40% of 3.1 Flash-Lite's input price and about 27% of its output price, for a noticeably weaker model. If your task is trivial and purely cost-driven, the 2.5 generation is the cheaper option; for anything where quality registers, the current generation is worth the difference.
Flash-Lite handles text, images, audio, and video input — the full Gemini multimodal surface — at a price where rivals are text-first. It's the right default for high-volume summarization, media tagging, and document pipelines on Google infrastructure.
The honest weakness: reasoning depth. It's a volume model, not a thinking model — escalate to 3.5 Flash or 3.1 Pro when tasks need multi-step logic.
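To put a number on escalation, here is a minimal sketch of the blended per-request cost when a share of traffic goes to 3.5 Flash instead of Flash-Lite. The prices are the ones quoted in this article; the token counts, the escalation rates, and the assumption that escalated requests are routed directly (not tried on Flash-Lite first) are illustrative, not measured.

```python
# Prices in USD per 1M tokens, as quoted in this article.
LITE = {"input": 0.25, "output": 1.50}    # Gemini 3.1 Flash-Lite
FLASH = {"input": 1.50, "output": 9.00}   # Gemini 3.5 Flash


def request_cost(price, input_tokens, output_tokens):
    """Cost in USD of one request at the given per-1M-token prices."""
    return (input_tokens * price["input"] + output_tokens * price["output"]) / 1_000_000


def blended_cost(escalation_rate, input_tokens, output_tokens):
    """Average cost per request when a share of traffic goes to 3.5 Flash.

    Assumes escalated requests are billed once, on 3.5 Flash only.
    """
    lite = request_cost(LITE, input_tokens, output_tokens)
    flash = request_cost(FLASH, input_tokens, output_tokens)
    return (1 - escalation_rate) * lite + escalation_rate * flash


if __name__ == "__main__":
    # Hypothetical workload: 4,000 input tokens and 500 output tokens per request.
    for rate in (0.0, 0.1, 1.0):
        print(f"escalate {rate:.0%}: ${blended_cost(rate, 4000, 500):.6f} per request")
```

Because 3.5 Flash costs six times as much on both input and output, the blended cost is the Flash-Lite cost times (1 + 5 × escalation rate) whatever the token mix: escalating 10% of requests raises the average bill by 50%.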
| Model | Input / 1M | Output / 1M | Context |
|---|---|---|---|
| Gemini 3.1 Flash-Lite | $0.25 | $1.50 | 1M |
| Gemini 2.5 Flash-Lite (legacy) | $0.10 | $0.40 | 1M |
| Gemini 3.5 Flash | $1.50 | $9 | 1M |
| GPT-5.4 mini | $0.25 | $2 | 272K |
| DeepSeek V4-Flash | $0.14 | $0.28 | 1M |
Against GPT-5.4 mini it's the same input price with cheaper output and roughly 3.7× the window (1M vs. 272K). DeepSeek V4-Flash is cheaper still if Chinese-hosted infrastructure isn't a blocker. And if every cent counts, the legacy 2.5 Flash-Lite at $0.10/$0.40 is still the lowest Gemini sticker; just know you're buying the previous generation.
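The ranking depends on your input-to-output mix, so it helps to run your own volumes through the table. A small sketch, assuming a hypothetical monthly workload of 1,000M input tokens and 100M output tokens; the prices are copied from the table above, and the `workload_cost` helper is only illustrative.

```python
# USD per 1M tokens as (input, output), copied from the table above.
PRICES = {
    "Gemini 3.1 Flash-Lite": (0.25, 1.50),
    "Gemini 2.5 Flash-Lite (legacy)": (0.10, 0.40),
    "Gemini 3.5 Flash": (1.50, 9.00),
    "GPT-5.4 mini": (0.25, 2.00),
    "DeepSeek V4-Flash": (0.14, 0.28),
}


def workload_cost(model, input_millions, output_millions):
    """Total USD for a workload measured in millions of tokens."""
    input_price, output_price = PRICES[model]
    return input_millions * input_price + output_millions * output_price


if __name__ == "__main__":
    # Hypothetical monthly volume: 1,000M input tokens, 100M output tokens.
    ranked = sorted(PRICES, key=lambda m: workload_cost(m, 1000, 100))
    for model in ranked:
        print(f"{model:32s} ${workload_cost(model, 1000, 100):,.2f}")
```

At this input-heavy mix the legacy 2.5 Flash-Lite comes out cheapest, with DeepSeek V4-Flash second and 3.1 Flash-Lite third. On sticker prices alone, DeepSeek V4-Flash overtakes the legacy model once output volume exceeds a third of input volume.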