DeepSeek R1
deepseek-ai · DeepSeek-R1 family · released 2025-01-20 · MIT license
Open-weight frontier reasoning model. At 671B total parameters (37B active per token) it is very large: running it locally realistically requires a multi-GPU rig, heavy CPU offload, or a low-bit GGUF quantization.
Key specs
| Spec | Value |
|---|---|
| Type | Local open-weight |
| Parameters | 684.53B total in the published checkpoint (671B main model plus the multi-token-prediction module) · MoE, 37B active |
| Architecture | deepseek_v3 |
| Context window | 164K tokens |
| Knowledge cutoff | 2024-10-01 |
| Modalities | text |
| Recommended backends | — |
| Minimum viable rig | Multi-GPU / heavy offload (FP8 weights ~700GB class) |
Benchmark scores
| Benchmark | Score |
|---|---|
| GPQA Diamond | 71.5% |
| SWE-bench Verified | 49.2% |
| AIME 2024 | 79.8% |
| MMLU-Pro | 84% |
| BFCL v3 (tool use) | — |
| Composite score | 6.9 |
| Community rating | 5.0★ (1 review, 0 net votes) |
VRAM & disk per quantization
| Quant | VRAM | Disk | RAM | Context |
|---|---|---|---|---|
| Q4_K_M | 398.5 GB | 397 GB | 448 GB | 164K |
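The sizes above follow roughly from parameter count times bits per weight. The sketch below is a back-of-the-envelope check, not a sizing tool: it assumes weight size ≈ parameters × bits ÷ 8 and ignores KV cache, activations, and runtime overhead. The helper names are illustrative, and the bits-per-weight figure is back-derived from the 397 GB disk number in the table rather than taken from any quantization spec.

```python
def estimate_weight_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate weight size in decimal GB: params * bits / 8."""
    return params_billions * bits_per_weight / 8


def implied_bits_per_weight(size_gb: float, params_billions: float) -> float:
    """Invert the estimate: average bits stored per parameter for a given file size."""
    return size_gb * 8 / params_billions


TOTAL_PARAMS_B = 684.53  # total parameters from the Key specs table

# FP8 stores one byte per weight, so size in GB tracks the parameter count in billions.
fp8_gb = estimate_weight_gb(TOTAL_PARAMS_B, 8.0)

# Average bits per weight implied by the Q4_K_M disk size in the table (397 GB).
q4_bpw = implied_bits_per_weight(397.0, TOTAL_PARAMS_B)

print(f"FP8 weights: ~{fp8_gb:.0f} GB")
print(f"Q4_K_M implied bits per weight: ~{q4_bpw:.2f}")
```

Under these assumptions the FP8 estimate lands in the "~700GB class" quoted in the Key specs, and the Q4_K_M file works out to a little over 4.5 bits per weight on average.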
API pricing (per 1M tokens)
| Provider | Input | Output | Free tier |
|---|---|---|---|
| OpenRouter | $0.50 | $2.18 | Yes |
| DeepSeek | $0.55 | $2.19 | No |
| Together AI | $3.00 | $7.00 | No |
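To compare providers for a concrete workload, the per-1M-token prices can be turned into a per-request cost. This is a minimal sketch: the prices are copied from the table above, while the `request_cost` helper and the 2,000-input / 8,000-output token workload are hypothetical. It also assumes reasoning tokens are billed as output tokens, which is worth confirming with each provider.

```python
# USD per 1M tokens as (input, output), copied from the pricing table above.
PRICES = {
    "OpenRouter": (0.50, 2.18),
    "DeepSeek": (0.55, 2.19),
    "Together AI": (3.00, 7.00),
}


def request_cost(provider: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request at the listed per-1M-token prices."""
    input_price, output_price = PRICES[provider]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical workload: short prompt, long reasoning-heavy answer.
for name in PRICES:
    print(f"{name}: ${request_cost(name, 2_000, 8_000):.4f}")
```

Because reasoning models tend to produce long outputs, the output price usually dominates the bill for this kind of workload.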
Strengths & weaknesses
Strengths: state-of-the-art open reasoning (AIME 2024 79.8, MATH-500 97.3); only 37B of 671B parameters are active per token, and MLA compresses the KV cache, so it is cheap to serve for its class; permissive MIT license that also allows distillation.
Weaknesses: prompt-sensitive; can emit empty think blocks; weaker factuality (SimpleQA 30.1); very large, so it needs multi-GPU or heavy offload.
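The "cheap to serve" strength and the "very large" weakness both follow from the mixture-of-experts layout: compute per token scales with the active parameters, while memory scales with the total. The sketch below uses the common rule of thumb of about 2 FLOPs per active parameter per generated token. That rule is an assumption for illustration, not a figure from this card, and it ignores attention cost.

```python
TOTAL_PARAMS_B = 671.0   # total parameters, in billions
ACTIVE_PARAMS_B = 37.0   # parameters active per token, in billions

# Share of the weights that participate in each forward pass.
active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B

# Rule-of-thumb forward compute: ~2 FLOPs per active parameter per token.
# Parameters are in billions, so the results are in GFLOPs per token.
moe_gflops_per_token = 2 * ACTIVE_PARAMS_B
dense_gflops_per_token = 2 * TOTAL_PARAMS_B  # hypothetical dense model of equal size

print(f"Active fraction: {active_fraction:.1%}")
print(f"Approx. compute per token: {moe_gflops_per_token:.0f} GFLOPs "
      f"vs {dense_gflops_per_token:.0f} GFLOPs if dense")
```

All 671B parameters still have to be resident in memory (or offloaded), since any expert can be selected for any token; only the per-token compute shrinks.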