Benchmark

xAI — reported benchmarks

xAI · last checked Jul 3, 2026, 6:02 AM

Reported benchmarks · Grok 4.3

captured Jul 3, 2026, 6:02 AM
Benchmark · Score · Notes
GPQA Diamond · 90.1% · Graduate-level science reasoning; from Artificial Analysis and multiple sources
Tau-Bench (τ²-Bench) · 97.7% · Tool-use and agentic benchmark
GDPval-AA · 1500 Elo · Agentic task performance; xAI-reported improvement of 321 points from Grok 4.20
Artificial Analysis Intelligence Index · 38 index · High reasoning mode on v4.1; composite of 9 evaluations
SciCode · 47.3% · Code generation and problem-solving

Vendor-reported via automated web search — not independently verified. See the cited matrix on /models.
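
As a rough reading aid for the GDPval-AA row, the reported 321-point improvement can be turned into an implied earlier score and, under a conventional Elo reading, an expected head-to-head preference rate. This is only a sketch: the function names are hypothetical, and the 400-point logistic scale is an assumption borrowed from standard Elo, not something the source states about how GDPval-AA is computed.

```python
# Sketch only: interprets the reported GDPval-AA numbers under the
# ASSUMPTION that the scale behaves like standard Elo (400-point logistic).
# The source does not confirm this; treat the win rate as illustrative.


def implied_previous_score(current: float, improvement: float) -> float:
    """Score the earlier model would have had, given the reported gain."""
    return current - improvement


def elo_expected_score(rating_gap: float, scale: float = 400.0) -> float:
    """Standard Elo expected score for the higher-rated side of a gap."""
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / scale))


current_elo = 1500.0   # Grok 4.3, as listed above
reported_gain = 321.0  # xAI-reported improvement from Grok 4.20

previous_elo = implied_previous_score(current_elo, reported_gain)
win_rate = elo_expected_score(reported_gain)

print(f"Implied Grok 4.20 score: {previous_elo:.0f} Elo")
print(f"Expected preference rate under standard Elo: {win_rate:.1%}")
```

If the benchmark uses a different rating scale, only the `scale` argument needs to change; the subtraction for the implied earlier score holds either way.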
