Qwen3 30B A3B

Qwen · Qwen3 family · released 2025-04-27 · apache-2.0 license

A tiny-active (3.3B) MoE that reasons well above its weight. Fits a 24GB card at Q4; toggle thinking mode for hard math/code.

Key specs

TypeLocal open-weight
Parameters30.53B total · MoE, 3.3B active
Architectureqwen3_moe
Context window41K tokens
Knowledge cutoff
Modalitiestext
Recommended backends
Minimum viable rig

Benchmark scores

GPQA Diamond65.8%
SWE-bench Verified
AIME70.9%
MMLU-Pro78.5%
BFCL v3 (tool use)69.1%
Composite score7.26
Community rating2.5★ (2 reviews, 2 net votes)

VRAM & disk per quantization

QuantVRAMDiskRAMContext
Q4_K_M19.2 GB17.7 GB32 GB41K

Strengths & weaknesses

Strengths: Only 3.3B active params — very efficient for its quality; Strong reasoning/coding in thinking mode for the active size; Hybrid thinking / non-thinking switch

Weaknesses: Weaker complex multi-step agentic tool-use vs frontier; Static YaRN can degrade short-context if force-enabled; No vision