Meta Llama 3.1 70B Instruct Quantized.W4a16

RedHatAI · released 2024-07-31 · llama3.1 license

Meta Llama 3.1 70B Instruct Quantized.W4a16 by RedHatAI — 70.55B open-weight local LLM. License: llama3.1. Benchmarks, VRAM requirements per quantization, GPU compatibility, pricing and community reviews on slopsome.com.

Key specs

Type	Local open-weight
Parameters	70.55B total
Architecture	llama
Context window	131K tokens
Knowledge cutoff	—
Modalities	text
Recommended backends	—
Minimum viable rig	—

Benchmark scores

GPQA Diamond	—
SWE-bench Verified	—
AIME	—
MMLU-Pro	—
BFCL v3 (tool use)	—
Composite score	—
Community rating	No reviews yet

VRAM & disk per quantization

Quant	VRAM	Disk	RAM	Context
Q4_K_M	42.4 GB	40.9 GB	—	131K