Mistral's code specialist: fill-in-the-middle completion at a fraction of frontier rates, popular as an IDE backend.
Codestral is a specialist: a model trained for code that supports fill-in-the-middle — completing code where your cursor is, with file context on both sides — which chat-tuned frontier models handle awkwardly. At $0.30 input and $0.90 output per 1M tokens, it's priced for keystroke-frequency calls: an IDE firing dozens of completions per minute stays affordable.
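To make fill-in-the-middle concrete, here is a minimal sketch of how an editor plugin might turn a buffer and a cursor position into a FIM request body. It assumes the endpoint takes the text before the cursor as `prompt` and the text after it as `suffix`. The URL, the `codestral-latest` alias, and the helper names `split_at_cursor` and `build_fim_request` are illustrative assumptions, so check them against Mistral's current API reference. The sketch only builds the payload and sends nothing.

```python
import json

# Assumed endpoint and model alias; confirm both in Mistral's API docs.
FIM_URL = "https://api.mistral.ai/v1/fim/completions"
MODEL = "codestral-latest"


def split_at_cursor(buffer: str, cursor: int) -> tuple[str, str]:
    """Split the editor buffer into the text before and after the cursor."""
    cursor = max(0, min(cursor, len(buffer)))  # clamp out-of-range positions
    return buffer[:cursor], buffer[cursor:]


def build_fim_request(buffer: str, cursor: int, max_tokens: int = 64) -> dict:
    """Build a FIM request body (hypothetical helper, not an SDK function)."""
    prefix, suffix = split_at_cursor(buffer, cursor)
    return {
        "model": MODEL,
        "prompt": prefix,    # context on the left of the cursor
        "suffix": suffix,    # context on the right of the cursor
        "max_tokens": max_tokens,
        "temperature": 0.0,  # deterministic output suits inline completion
    }


# The cursor sits right after "return ", with more code below it.
source = "def area(radius):\n    return \n\nprint(area(2.0))\n"
cursor = source.index("return ") + len("return ")
payload = build_fim_request(source, cursor)
print(json.dumps(payload, indent=2))
```

The model is asked only for the missing span, which keeps outputs short and is part of why per-keystroke calls stay cheap.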
That's a different job than agentic coding. Claude and GPT models plan, refactor, and run tools; Codestral completes and generates. Plenty of teams run both — a frontier model for the hard work, Codestral as the always-on completion layer.
Its strengths are strong code generation and completion across mainstream languages, a 256K-token context window that fits real repository context, and latency suited to interactive use. The open-weights heritage matters here too: self-hostable variants exist for teams that can't send code to any external API.
The honest weakness: it's not a reasoner. Architecture decisions, subtle multi-file refactors, and debugging-by-deduction belong to Opus-class models or agent loops built on bigger brains.
| Model | Input / 1M | Output / 1M | Context |
|---|---|---|---|
| Codestral | $0.30 | $0.90 | 256K |
| Grok Code Fast 1 | $0.20 | $1.50 | 256K |
| Claude Sonnet 4.6 | $3 | $15 | 200K (1M beta) |
| GPT-5.4 | $2.50 | $15 | 272K |
| Mistral Large 3 | $2 | $6 | 256K |
The direct rival is Grok Code Fast 1: cheaper input, pricier output, no FIM pedigree. The frontier coding models (Claude Sonnet 4.6, GPT-5.4) cost roughly 8–17× more per token (about 8–10× on input and about 17× on output) and earn it on hard tasks. The split that works: Codestral for completion volume, a frontier model for thinking.
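The price multiples can be checked directly from the table. The sketch below computes each model's input and output price relative to Codestral, then estimates an hourly bill for a completion workload. The workload figures (2,000 input tokens, 50 output tokens, 1,800 calls per hour) are hypothetical placeholders rather than measurements, so substitute your own.

```python
# USD per 1M tokens (input, output), copied from the table above.
PRICES = {
    "Codestral": (0.30, 0.90),
    "Grok Code Fast 1": (0.20, 1.50),
    "Claude Sonnet 4.6": (3.00, 15.00),
    "GPT-5.4": (2.50, 15.00),
    "Mistral Large 3": (2.00, 6.00),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request at the listed per-1M-token rates."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


def price_ratios(model: str, baseline: str = "Codestral") -> tuple[float, float]:
    """Return (input ratio, output ratio) of a model's prices to the baseline's."""
    (m_in, m_out), (b_in, b_out) = PRICES[model], PRICES[baseline]
    return m_in / b_in, m_out / b_out


# Hypothetical workload: one developer at about 30 completions per minute.
INPUT_TOKENS, OUTPUT_TOKENS, CALLS_PER_HOUR = 2_000, 50, 1_800

for model in PRICES:
    in_ratio, out_ratio = price_ratios(model)
    hourly = request_cost(model, INPUT_TOKENS, OUTPUT_TOKENS) * CALLS_PER_HOUR
    print(f"{model:<18} input x{in_ratio:4.1f}  output x{out_ratio:4.1f}  ${hourly:6.2f}/hour")
```

Completion traffic is dominated by input tokens, because each call sends a lot of context and gets back a short span. For this kind of workload the input-price ratio matters more than the output-price ratio.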