AIME 2024

American Invitational Mathematics Examination 2024

math | Score: 0-100 (% correct) | 12 models scored

About

The 15 competition problems from AIME 2024, used as a difficult math reasoning benchmark for frontier models.

Methodology

15 problems requiring significant mathematical insight and multi-step reasoning. Answers are integers from 000 to 999. Problems span algebra, geometry, number theory, and combinatorics.
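As a rough illustration of how a 0-100 score could be derived from this answer format, the sketch below grades by exact match on the integer answer. This is an assumption, since the page does not describe its grading harness. The names `normalize_answer` and `score_percent` are hypothetical, and the answer values in the usage example are placeholders, not real AIME 2024 answers.

```python
from typing import List, Optional


def normalize_answer(raw: str) -> Optional[int]:
    """Parse a model's final answer into an integer in [0, 999].

    Leading zeros are accepted ("073" -> 73). Anything that is not a
    plain ASCII digit string, or falls outside 0..999, yields None.
    """
    text = raw.strip()
    if not (text.isascii() and text.isdigit()):
        return None
    value = int(text)
    return value if 0 <= value <= 999 else None


def score_percent(predictions: List[str], reference: List[int]) -> float:
    """Return the percentage of problems answered correctly (0-100).

    Assumes exact-match grading: a prediction counts only if it parses
    to the same integer as the reference answer.
    """
    if len(predictions) != len(reference):
        raise ValueError("predictions and reference must have equal length")
    if not reference:
        raise ValueError("reference must not be empty")
    correct = sum(
        1
        for pred, truth in zip(predictions, reference)
        if normalize_answer(pred) == truth
    )
    return 100.0 * correct / len(reference)


if __name__ == "__main__":
    # Placeholder data for 15 problems; not the real AIME 2024 answers.
    reference = list(range(100, 115))
    predictions = [str(a) for a in reference]
    predictions[0] = "not a number"  # one unparseable answer counts as wrong
    print(round(score_percent(predictions, reference), 1))
```

With a single pass over 15 problems, a score computed this way moves in steps of one fifteenth of 100, so a finer-grained figure would need some other aggregation (for example, averaging over repeated runs) that this sketch does not model.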


Model Leaderboard

Shows open-weight models only. Commercial API models (GPT-4o, Claude, Gemini) are not submitted to the Open LLM Leaderboard — their scores come from provider-reported benchmarks.

#	Model	Score
1	o4-mini	93.4%
2	Grok 3	93.3%
3	Gemini 2.5 Pro	92.0%
4	o3	91.6%
5	Gemini 2.5 Flash	88.0%
6	o3 Mini