Gemma 2 27B
Google · Gemma 2 family · released 2024-06-27 · Gemma license
A strong, efficient dense model from 2024 that fits on a single GPU when quantized, though its 8K context is small by today's standards.
Key specs
| Spec | Value |
|---|---|
| Type | Local open-weight |
| Parameters | 27.2B total |
| Architecture | Dense decoder-only with interleaved local-global attention (4k sliding window), GQA, logit soft-capping |
| Context window | 8K tokens |
| Knowledge cutoff | — |
| Modalities | text |
| Recommended backends | — |
| Minimum viable rig | ~16 GB VRAM at 4-bit (~54 GB for BF16 weights) |
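
The architecture row names two mechanisms that are easy to show in a few lines: sliding-window (local) attention and logit soft-capping. The sketch below is illustrative only and is not the actual Gemma 2 implementation. The function names are hypothetical, the window convention (the current token counts toward the window) is an assumption, and the cap value of 50.0 in the usage note is an assumed example rather than a figure from this page.

```python
import math
from typing import List, Optional


def soft_cap(logit: float, cap: float) -> float:
    """Squash a logit smoothly into [-cap, cap] using cap * tanh(logit / cap)."""
    return cap * math.tanh(logit / cap)


def can_attend(query_pos: int, key_pos: int, window: Optional[int]) -> bool:
    """Causal visibility check. window=None models a global-attention layer;
    an integer models a local layer with a sliding window of that size.
    Assumption: the window includes the query position itself."""
    if key_pos > query_pos:
        return False  # causal: never look at future tokens
    if window is None:
        return True   # global layer: all earlier tokens are visible
    return query_pos - key_pos < window


def visible_keys(query_pos: int, window: Optional[int]) -> List[int]:
    """List the key positions a query at query_pos can attend to."""
    return [k for k in range(query_pos + 1) if can_attend(query_pos, k, window)]
```

With a toy window of 3, `visible_keys(5, 3)` keeps only the three most recent positions, while `visible_keys(5, None)` keeps all six. In a model that interleaves the two layer types, local layers would use the 4K window and global layers the full 8K context. `soft_cap(x, 50.0)` leaves small logits nearly unchanged and bounds large ones at the cap.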
Benchmark scores
| Benchmark | Score |
|---|---|
| GPQA Diamond | — |
| SWE-bench Verified | — |
| AIME | — |
| MMLU-Pro | — |
| BFCL v3 (tool use) | — |
| Composite score | — |
| Community rating | No reviews yet |
VRAM & disk per quantization
| Quant | VRAM | Disk | RAM | Context |
|---|---|---|---|---|
| Q4_K_M | 17 GB | 16 GB | 28 GB | 8K |
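
The weight-memory figures on this page follow from simple arithmetic: parameter count times bits per weight, divided by 8. The sketch below reproduces that estimate. The helper name is hypothetical, and the ~4.85 bits per weight used for Q4_K_M in the usage note is an assumed average, not a value from this page. The result covers weights only, so the KV cache and runtime overhead come on top.

```python
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes).

    params_billion * 1e9 weights * bits_per_weight / 8 bytes, divided by 1e9,
    simplifies to params_billion * bits_per_weight / 8.
    """
    return params_billion * bits_per_weight / 8


if __name__ == "__main__":
    for label, bits in [("BF16", 16.0), ("4-bit", 4.0), ("Q4_K_M (assumed ~4.85 bpw)", 4.85)]:
        print(f"{label}: {weight_gb(27.2, bits):.1f} GB")
```

For 27.2B parameters this gives about 54.4 GB at BF16 and about 13.6 GB at a flat 4 bits per weight. The assumed Q4_K_M average lands near the 16 GB disk figure in the table above.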
Strengths & weaknesses
Strengths: Strong quality for its size class (beat Llama 3 70B on LMSYS Arena Elo at release); runs comfortably on a single GPU when quantized; broad tooling support
Weaknesses: Short 8K native context; trained primarily on English, so output in other languages is weaker; no reasoning mode or vision (Gemma 3+ features)