Grok 4 approaches problems directly and is willing to engage with edgy or controversial topics that more cautious models tend to sidestep. It is built around deep reasoning and handles complex multi-step logic and technical problems thoroughly. The trade-off is that its blunter personality and occasional overconfidence can make it feel less polished than more conservative counterparts.

| Benchmark | Score | Metric | Recorded |
|---|---|---|---|
| AIME 2025 | 91.7 | accuracy | 5d ago |
| LCR | 68.0 | pass@1_accuracy | 5d ago |
| Aider Polyglot | 79.6 | accuracy | 5d ago |