Qwen3 30B A3B

Qwen · Qwen3 family · released 2025-04-27 · apache-2.0 license

A tiny-active (3.3B) MoE that reasons well above its weight. Fits a 24GB card at Q4; toggle thinking mode for hard math/code.

Key specs

Type	Local open-weight
Parameters	30.53B total · MoE, 3.3B active
Architecture	qwen3_moe
Context window	41K tokens
Knowledge cutoff	—
Modalities	text
Recommended backends	—
Minimum viable rig	—

Benchmark scores

GPQA Diamond	65.8%
SWE-bench Verified	—
AIME	70.9%
MMLU-Pro	78.5%
BFCL v3 (tool use)	69.1%
Composite score	7.26
Community rating	2.5★ (2 reviews, 2 net votes)

VRAM & disk per quantization

Quant	VRAM	Disk	RAM	Context
Q4_K_M	19.2 GB	17.7 GB	32 GB	41K

Strengths & weaknesses

Strengths: Only 3.3B active params — very efficient for its quality; Strong reasoning/coding in thinking mode for the active size; Hybrid thinking / non-thinking switch

Weaknesses: Weaker complex multi-step agentic tool-use vs frontier; Static YaRN can degrade short-context if force-enabled; No vision