LLaDA2.1 Flash

inclusionAI · released 2026-02-09 · apache-2.0 license

LLaDA2.1 Flash by inclusionAI is a 102.89B-parameter open-weight LLM for local inference, released under the apache-2.0 license. This page lists benchmarks, VRAM requirements per quantization, GPU compatibility, pricing, and community reviews on slopsome.com.

Key specs

Type	Local open-weight
Parameters	102.89B total · MoE, active parameters —
Architecture	llada2_moe
Context window	33K tokens
Knowledge cutoff	—
Modalities	text
Recommended backends	—
Minimum viable rig	—

Benchmark scores

GPQA Diamond	—
SWE-bench Verified	—
AIME	—
MMLU-Pro	—
BFCL v3 (tool use)	—
Composite score	—
Community rating	No reviews yet

VRAM & disk per quantization

Quant	VRAM	Disk	RAM	Context
Q4_K_M	61.2 GB	59.7 GB	—	33K
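
The Q4_K_M row can be sanity-checked with some quick arithmetic. The sketch below derives the average bits stored per weight from the disk size and parameter count, the gap between the VRAM and disk figures, and a rough lower bound on how many GPUs are needed to hold the model. It assumes "GB" means 10^9 bytes, which the page does not state. The helper names and the 24 GB and 80 GB card sizes are illustrative, not taken from the listing, and the GPU count ignores any per-device overhead from splitting the model.

```python
import math

# Figures copied from the spec card above.
TOTAL_PARAMS = 102.89e9   # total parameters (MoE, all experts)
Q4_K_M_DISK_GB = 59.7     # file size on disk
Q4_K_M_VRAM_GB = 61.2     # listed VRAM requirement

BYTES_PER_GB = 1e9        # assumption: decimal gigabytes


def effective_bits_per_weight(disk_gb: float, params: float) -> float:
    """Average number of bits stored per parameter, implied by the file size."""
    return disk_gb * BYTES_PER_GB * 8 / params


def vram_overhead_gb(vram_gb: float, disk_gb: float) -> float:
    """VRAM listed beyond the weights themselves (cache, buffers, and so on)."""
    return vram_gb - disk_gb


def min_gpus(vram_gb: float, per_gpu_gb: float) -> int:
    """Lower bound on GPU count; ignores per-device overhead when sharding."""
    return math.ceil(vram_gb / per_gpu_gb)


if __name__ == "__main__":
    bpw = effective_bits_per_weight(Q4_K_M_DISK_GB, TOTAL_PARAMS)
    overhead = vram_overhead_gb(Q4_K_M_VRAM_GB, Q4_K_M_DISK_GB)
    print(f"effective bits per weight: {bpw:.2f}")
    print(f"VRAM beyond weights: {overhead:.1f} GB")
    for card_gb in (24, 80):  # hypothetical card sizes
        print(f"{card_gb} GB cards needed: at least {min_gpus(Q4_K_M_VRAM_GB, card_gb)}")
```

Under these assumptions the disk figure works out to a little over 4.6 bits per weight, and the listed VRAM requirement sits about 1.5 GB above the weight file.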