Meta Llama — reported benchmarks

Meta · last checked Jul 3, 2026, 12:02 AM

Reported benchmarks · Llama 4 Maverick

captured Jul 3, 2026, 12:02 AM

Benchmark	Score
MMLU Pro	80.5%
GPQA Diamond	69.8%
LiveCodeBench	43.4% pass@1 · averaged over multiple generations
HumanEval	86.4%
Multilingual MMLU	84.6%
GSM8K	95.2%
MATH-500	85.3%
SWE-bench Verified	74.2% pass@1

Vendor-reported via automated web search — not independently verified. See the cited matrix on /models.
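
Two rows in the table above are pass@1 scores, and one is noted as averaged over multiple generations. As a rough illustration of what that can mean, the sketch below assumes pass@1 is computed as the fraction of passing generations per problem, averaged across problems. The function name, data layout, and sample outcomes are hypothetical; the vendor's actual evaluation harness is not described on this page.

```python
from statistics import mean


def pass_at_1(results: list[list[bool]]) -> float:
    """Average pass@1 over problems.

    results[i][j] is True when generation j for problem i passed all tests.
    Assumption: each problem's pass@1 is its fraction of passing generations.
    """
    if not results or any(len(gens) == 0 for gens in results):
        raise ValueError("every problem needs at least one generation")
    per_problem = [sum(gens) / len(gens) for gens in results]
    return mean(per_problem)


# Hypothetical outcomes: 3 problems, 4 generations each (not real benchmark data).
outcomes = [
    [True, True, False, True],     # 3 of 4 generations pass
    [False, False, False, False],  # 0 of 4 pass
    [True, True, True, True],      # 4 of 4 pass
]

score = pass_at_1(outcomes)
print(f"{100 * score:.1f}% pass@1")
```

Averaging over several generations per problem reduces the sampling noise of a single run, which is why a score reported this way is not directly comparable to a single-sample pass@1.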

Vendor claim Jul 3, 2026, 12:02 AM

Meta reported benchmarks updated

Llama 4 Maverick: 8 benchmark claims (via web search)

Vendor claim Jun 25, 2026, 4:22 PM

Meta reported benchmarks updated

Llama 3.1 405B: 8 benchmark claims (via web search)