Mistral Small 3.2

Mistral · released 2025-06-20 · Apache-2.0 license

A fast, cheap 24B with vision and great tool-calling. GQA (8 KV heads) keeps the KV cache small, so long context fits comfortably.

Key specs

Quant	VRAM	Disk	RAM	Context
Q4_K_M	15 GB	14 GB	24 GB	131K
Q8	26 GB	25 GB	40 GB	131K
FP16	49 GB	48 GB	64 GB	131K

Provider	Input	Output	Free tier
Mistral	$0.1	$0.3	Yes
Together AI	$0.2	$0.6	No

Strengths: Very fast & cheap to run; Strong function-calling; Native vision

Weaknesses: Weaker at the hardest reasoning; Smaller knowledge base