xAI — reported benchmarks
xAI · last checked Jul 3, 2026, 6:02 AM
Reported benchmarks · Grok 4.3
captured Jul 3, 2026, 6:02 AM

| Benchmark | Score | Notes |
|---|---|---|
| GPQA Diamond | 90.1% | Graduate-level science reasoning; from Artificial Analysis and multiple sources |
| Tau-Bench (τ²-Bench) | 97.7% | Tool-use and agentic benchmark |
| GDPval-AA | 1500 Elo | Agentic task performance; xAI-reported improvement of 321 points from Grok 4.20 |
| Artificial Analysis Intelligence Index | 38 index | High reasoning mode on v4.1; composite of 9 evaluations |
| SciCode | 47.3% | Code generation and problem-solving |

Vendor-reported via automated web search — not independently verified. See the cited matrix on /models.
In the news · xAI
- NVIDIA's chips. xAI's lease. Apollo's paper. And the wealth-channel fund holding $621 million of it: Apollo's ADS Dissected
- SpaceX offers half-price Starlink in Tennessee as xAI faces a Clean Air Act lawsuit
- ICYMI: xAI debuts Grok Voice Agent Builder for Enterprises
- Rehabilitation work of dams launched in Xai-Xai
- SpaceX's Cursor Bet Tests AI Model Neutrality Post-Acquisition
- Elon Musk's xAI Unveils No-Code Tool to Build AI Call Centers Capable of Cloning Human Voices
- SpaceX offers half-price Starlink in Memphis amid backlash over xAI data centre
- xAI has released 'Voice Agent Builder,' a tool that allows users to create an AI call center with a cloned human voice without coding.
Importance-filtered press coverage (Google News) mentioning xAI. Headlines link to the original; verify before acting.
Change history
- Vendor claim · Jul 3, 2026, 6:02 AM · xAI reported benchmarks updated · Grok 4.3: 5 benchmark claims (via web search)
- Vendor claim · Jun 25, 2026, 6:53 PM · xAI reported benchmarks updated · Grok 4: 3 benchmark claims (via web search)