DeepSeek R1

deepseek-ai · DeepSeek-R1 family · released 2025-01-20 · MIT license

Open-weight frontier reasoning model. Very large (671B total parameters, 37B active per token); running it locally realistically requires multiple GPUs, heavy CPU offload, or a low-bit GGUF quantization.

Key specs

Type	Local open-weight
Parameters	684.53B total in the published checkpoint (commonly cited as 671B) · MoE, 37B active
Architecture	deepseek_v3
Context window	164K tokens
Knowledge cutoff	2024-10-01
Modalities	text
Recommended backends	—
Minimum viable rig	Multi-GPU or heavy offload (FP8 weights in the ~700 GB class)

Benchmark scores

GPQA Diamond	71.5%
SWE-bench Verified	49.2%
AIME 2024	79.8%
MMLU-Pro	84%
BFCL v3 (tool use)	—
Composite score	6.9
Community rating	5.0★ (1 review, 0 net votes)

VRAM & disk per quantization

Quant	VRAM	Disk	RAM	Context
Q4_K_M	398.5 GB	397 GB	448 GB	164K
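
As a rough cross-check on the figures above, weight size scales with parameter count times bits per weight. The sketch below is a back-of-envelope estimate, not a sizing tool: the function names are hypothetical, 1 GB is taken as 1e9 bytes, and KV cache, activations, and runtime overhead are ignored (which is one reason the VRAM and RAM columns exceed the disk size).

```python
def weight_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight size in GB (1 GB = 1e9 bytes) for a given precision."""
    # params_billion * 1e9 weights * bits / 8 bytes, divided by 1e9 bytes per GB
    return params_billion * bits_per_weight / 8


def implied_bits_per_weight(size_gb: float, params_billion: float) -> float:
    """Average bits per weight implied by an on-disk size."""
    return size_gb * 8 / params_billion


PARAMS_B = 684.53  # total parameters from the spec table, in billions

# FP8 stores one byte per weight, so the size in GB roughly equals
# the parameter count in billions (the "~700 GB class" above).
fp8_gb = weight_size_gb(PARAMS_B, 8)

# The 397 GB Q4_K_M file implies a bit over 4.6 bits per weight on average.
q4_bpw = implied_bits_per_weight(397, PARAMS_B)

print(f"FP8 weights: ~{fp8_gb:.0f} GB")
print(f"Q4_K_M average: ~{q4_bpw:.2f} bits/weight")
```

The same two functions can be used to sanity-check other quantizations, assuming their average bits per weight is known.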

API pricing (per 1M tokens)

Provider	Input	Output	Free tier
OpenRouter	$0.50	$2.18	Yes
DeepSeek	$0.55	$2.19	No
Together AI	$3.00	$7.00	No
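
To turn the per-million-token prices above into a per-request figure, a minimal sketch follows. The dictionary and function names are hypothetical, the prices are copied from the table and may be out of date, and the example token counts are made up. It also assumes reasoning tokens are billed as output, which is typical but should be checked with each provider.

```python
# (input, output) prices in USD per 1M tokens, copied from the table above
PRICES_USD_PER_M = {
    "OpenRouter": (0.50, 2.18),
    "DeepSeek": (0.55, 2.19),
    "Together AI": (3.00, 7.00),
}


def request_cost_usd(provider: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated cost of one request at the listed per-1M-token prices."""
    input_price, output_price = PRICES_USD_PER_M[provider]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical request: a 2,000-token prompt and an 8,000-token response.
# Reasoning models tend to produce long outputs, so output price dominates.
for name in PRICES_USD_PER_M:
    print(f"{name}: ${request_cost_usd(name, 2_000, 8_000):.4f}")
```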

Strengths & weaknesses

Strengths: State-of-the-art open-weight reasoning (AIME 2024 79.8, MATH-500 97.3); Only 37B of 671B parameters active per token, plus MLA KV-cache compression, so it is cheap to serve for its class; Permissive MIT license that also allows distillation

Weaknesses: Prompt-sensitive, and can emit empty think blocks; Weaker factuality (SimpleQA 30.1); Very large, so it needs multi-GPU or heavy offload
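
The empty-think-block weakness can be caught with a simple check on the raw response. This is a minimal sketch that assumes the reasoning is wrapped in `<think>...</think>` tags in the returned text; the function name is hypothetical, and some serving stacks strip or relocate the reasoning, in which case the check would need adapting.

```python
import re

# Non-greedy match so only the first think block is inspected;
# DOTALL lets the block span multiple lines.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def has_empty_think_block(text: str) -> bool:
    """Return True if the first <think> block contains only whitespace."""
    match = THINK_RE.search(text)
    return match is not None and match.group(1).strip() == ""


print(has_empty_think_block("<think>\n\n</think>\nThe answer is 4."))
print(has_empty_think_block("<think>2 + 2 = 4</think>\nThe answer is 4."))
```

A caller might retry the request or adjust the prompt when the check returns True.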