Phi-4

Microsoft · released 2024-12-11 · MIT license

A 14B-parameter model that runs on a 12 GB card at Q4_K_M and excels at math and reasoning. Its short 16K context window is the main limitation.

Key specs

Type: Local open-weight
Parameters: 14.66B total
Architecture: phi3
Context window: 16K tokens
Knowledge cutoff: 2024-06-01
Modalities: text
Recommended backends: llama.cpp, Ollama, vLLM
Minimum viable rig: RTX 3060 12GB at Q4_K_M

Benchmark scores

GPQA Diamond: 56%
SWE-bench Verified: 28%
AIME: 75%
MMLU-Pro: 71%
BFCL v3 (tool use): 50%
Composite score: 5.16
Community rating: No reviews yet

VRAM & disk per quantization

Quant    VRAM    Disk     RAM     Context
Q8       16 GB   15 GB    24 GB   16K
FP16     29 GB   28 GB    40 GB   16K
Q4_K_M   10 GB   8.5 GB   16 GB   16K
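
As a rough illustration of how the table above might be used, the sketch below picks a quantization for a given amount of VRAM and checks the FP16 row with back-of-envelope arithmetic. The function names are hypothetical and not part of any backend. Two assumptions are made: that a higher listed VRAM requirement means higher precision, and that FP16 stores 16 bits per weight with no overhead counted.

```python
from typing import Optional

# Figures copied from the table above (GB); context is 16K for every row.
QUANTS = {
    "Q4_K_M": {"vram_gb": 10, "disk_gb": 8.5, "ram_gb": 16},
    "Q8": {"vram_gb": 16, "disk_gb": 15, "ram_gb": 24},
    "FP16": {"vram_gb": 29, "disk_gb": 28, "ram_gb": 40},
}


def largest_quant_that_fits(vram_gb: float) -> Optional[str]:
    """Return the quant with the largest listed VRAM need that still fits, or None.

    Assumption: a larger VRAM requirement corresponds to higher precision.
    """
    fitting = [
        (spec["vram_gb"], name)
        for name, spec in QUANTS.items()
        if spec["vram_gb"] <= vram_gb
    ]
    return max(fitting)[1] if fitting else None


def weight_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Raw weight size in decimal GB: parameters x bits per weight / 8."""
    return params_billion * bits_per_weight / 8


print(largest_quant_that_fits(12))          # Q4_K_M
print(largest_quant_that_fits(24))          # Q8
print(largest_quant_that_fits(8))           # None
print(round(weight_size_gb(14.66, 16), 2))  # 29.32
```

The 12 GB result agrees with the minimum viable rig listed under Key specs, and the 29.32 GB estimate for 14.66B parameters at 16 bits lands close to the FP16 row. The table's figures should be treated as authoritative; the estimate ignores KV cache, activations, and file-format overhead.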

Strengths & weaknesses

Strengths: punches above its weight on math; small VRAM footprint (10 GB at Q4_K_M)

Weaknesses: only 16K native context; weaker multilingual performance