o4-mini thinks before it speaks, literally. It uses extended internal reasoning to work through problems step by step, which makes it notably strong at math, logic, and code, where careful deliberation pays off. As a compact reasoning model, it trades some breadth and depth of general knowledge for focused analytical horsepower.

| Benchmark | Score (%) | Type | Recorded |
|---|---|---|---|
| SWE-Bench | 68.1 | accuracy | 5d ago |
| AIME 2024 | 93.4 | accuracy | 5d ago |
| AIME 2025 | 92.7 | accuracy | 5d ago |
| Aider Polyglot | 72.0 | accuracy | 5d ago |
| TAU2 | 49.2 | accuracy | 5d ago |