Gemma 3 27B

Google · released 2025-03-12 · Gemma license

A capable open multimodal 27B with great multilingual quality. The 16 KV-heads mean a heavier KV cache, so watch long-context VRAM.

Key specs

TypeLocal open-weight
Parameters27B total
Architecturedense
Context window131K tokens
Knowledge cutoff2024-08-01
Modalitiestext, image
Recommended backendsllama.cpp, Ollama, MLX, vLLM
Minimum viable rigRTX 3090 / 4090 (24GB) at Q4_K_M

Benchmark scores

GPQA Diamond52%
SWE-bench Verified30%
AIME40%
MMLU-Pro67%
BFCL v3 (tool use)55%
Composite score4.93
Community rating4.0★ (1 reviews, 0 net votes)

VRAM & disk per quantization

QuantVRAMDiskRAMContext
Q4_K_M17 GB16 GB28 GB131K
Q829 GB28 GB40 GB131K
FP1655 GB54 GB72 GB131K

API pricing (per 1M tokens)

ProviderInputOutputFree tier
Together AI$0.2$0.4No

Strengths & weaknesses

Strengths: Excellent multilingual coverage; Native vision; Strong instruction following

Weaknesses: Weaker at hard math; Large KV cache (16 KV heads)