Mistral's volume workhorse — the cheapest European-hosted general model on our ledger, with an open-weights sibling.
Small 3.1 is Mistral's answer for the 90% of API traffic that doesn't need a flagship: $0.20 input and $0.60 output per 1M tokens, undercutting GPT-5.4 mini ($0.25 and $2) on both rates. For teams with EU data-residency requirements, it's often the only model in this price class that ticks the compliance box without a US or Chinese provider in the loop.
The other thing no rival here offers: an open-weights sibling. If your volume grows to where per-token pricing hurts, you can move the same family onto your own GPUs — the API becomes a prototyping stage rather than a permanent bill.
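To put a rough number on "where per-token pricing hurts", here is a minimal break-even sketch. The API rates come from this page; the self-hosting figure and the output share are hypothetical placeholders, not quotes, and the model ignores throughput limits, engineering time, and idle capacity.

```python
# API rates for Mistral Small 3.1, USD per 1M tokens (from this page).
API_INPUT_RATE = 0.20
API_OUTPUT_RATE = 0.60


def api_monthly_cost(input_m: float, output_m: float) -> float:
    """API bill in USD for a month; volumes are in millions of tokens."""
    return input_m * API_INPUT_RATE + output_m * API_OUTPUT_RATE


def break_even_volume(self_host_monthly_usd: float, output_share: float) -> float:
    """Total monthly volume (millions of tokens) at which the API bill
    equals a flat self-hosting cost.

    output_share is the fraction of all tokens that are output tokens.
    """
    if not 0.0 <= output_share <= 1.0:
        raise ValueError("output_share must be between 0 and 1")
    # Blended price per 1M tokens for this input/output mix.
    blended = (1 - output_share) * API_INPUT_RATE + output_share * API_OUTPUT_RATE
    return self_host_monthly_usd / blended


if __name__ == "__main__":
    # ASSUMPTION: $3,000/month all-in for your own GPUs is a made-up figure.
    # ASSUMPTION: 20% of tokens are output.
    volume = break_even_volume(self_host_monthly_usd=3000, output_share=0.2)
    print(f"Break-even at about {volume:,.0f}M tokens per month")
```

With those placeholder inputs the break-even lands at roughly 10.7 billion tokens a month; swap in your own hardware cost and traffic mix before drawing any conclusion.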
Small 3.1 is a competent generalist with vision support: classification, extraction, summarization, routine drafting, and solid multilingual coverage across European languages. Tool calling works, and so do simple agent loops.
The honest weakness: the 128K window is the smallest in its class, and hard reasoning is out of scope — that's Large 3 territory, or a different provider entirely.
| Model | Input / 1M | Output / 1M | Context |
|---|---|---|---|
| Mistral Small 3.1 | $0.20 | $0.60 | 128K |
| Mistral Large 3 | $2 | $6 | 256K |
| GPT-5.4 mini | $0.25 | $2 | 272K |
| Gemini 3.1 Flash-Lite | $0.25 | $1.50 | 1M |
| DeepSeek V4-Flash | $0.14 | $0.28 | 1M |
On pure price, only DeepSeek V4-Flash beats it, and that means Chinese-hosted infrastructure. Against GPT-5.4 mini and Gemini 3.1 Flash-Lite, Small 3.1 trades context window for lower rates (most visibly on output) and EU jurisdiction. If sovereignty matters, this is the budget pick; if window size matters, look at the 1M rivals.
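The trade-off in the table can be checked against your own traffic with a few lines of arithmetic. The sketch below copies the rates and windows from the table; the workload (500M input and 100M output tokens a month) is a hypothetical example, and "128K", "256K", and "272K" are read as round thousands of tokens, which is an assumption.

```python
# (input USD / 1M, output USD / 1M, context window in tokens), from the table.
MODELS = {
    "Mistral Small 3.1": (0.20, 0.60, 128_000),
    "Mistral Large 3": (2.00, 6.00, 256_000),
    "GPT-5.4 mini": (0.25, 2.00, 272_000),
    "Gemini 3.1 Flash-Lite": (0.25, 1.50, 1_000_000),
    "DeepSeek V4-Flash": (0.14, 0.28, 1_000_000),
}


def monthly_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Monthly bill in USD for the given token counts."""
    in_rate, out_rate, _ = MODELS[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


def fits_context(model: str, prompt_tokens: int) -> bool:
    """True if a prompt of this size fits in the model's window."""
    return prompt_tokens <= MODELS[model][2]


# Hypothetical workload: 500M input and 100M output tokens per month.
INPUT_TOKENS, OUTPUT_TOKENS = 500_000_000, 100_000_000

# Cheapest first.
ranked = sorted(MODELS, key=lambda m: monthly_cost(m, INPUT_TOKENS, OUTPUT_TOKENS))

for name in ranked:
    cost = monthly_cost(name, INPUT_TOKENS, OUTPUT_TOKENS)
    note = "" if fits_context(name, 200_000) else " (a 200K-token prompt will not fit)"
    print(f"{name}: ${cost:,.2f}{note}")
```

On that workload Small 3.1 comes to $160 a month, against $98 for DeepSeek V4-Flash, $275 for Gemini 3.1 Flash-Lite, and $325 for GPT-5.4 mini. The gap widens as the output share grows, since output is where the rates differ most.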