Qwen3 32B

Qwen · released 2025-04-27 · apache-2.0 license

A sweet-spot 32B you can run on a single 24GB card at Q4. Strong at coding and math; keep prompts under ~64k for best quality.

Key specs

Quant	VRAM	Disk	RAM	Context
Q8	35 GB	34 GB	48 GB	131K
FP16	66 GB	64 GB	80 GB	131K
Q4_K_M	20.5 GB	19 GB	32 GB	41K

Provider	Input	Output	Free tier
SiliconFlow	$0.1	$0.3	No

Strengths: Strong reasoning & math for a 32B; Toggleable thinking mode; Runs on a single 24GB card at Q4

Weaknesses: Context quality thins past ~64k; No vision