Scientific Reasoning Leaderboard

Models ranked by GPQA Diamond, based on independent benchmark evaluations.

Each row shows the model's benchmark score alongside its pricing and output speed, so you can evaluate quality-to-cost tradeoffs at a glance.

Top 20 models ranked by GPQA Diamond

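| Rank | Model | GPQA Diamond |
|------|-------|--------------|
| 🥇 | Google | 0.9 |
| 🥈 | OpenAI | 0.9 |
| 🥉 | OpenAI | 0.9 |
| 4 | MiniMax | 0.9 |
| 5 | Anthropic | 0.9 |
| 6 | OpenAI | 0.9 |
| 7 | Alibaba | 0.9 |
| 8 | Google | 0.9 |
| 9 | Google | 0.9 |
| 10 | Anthropic | 0.9 |
| 11 | OpenAI | 0.9 |
| 12 | OpenAI | 0.9 |
| 13 | Anthropic | 0.9 |
| 14 | Kimi | 0.9 |
| 15 | xAI | 0.9 |
| 16 | OpenAI | 0.9 |
| 17 | Google | 0.9 |
| 18 | DeepSeek | 0.9 |
| 19 | OpenAI | 0.9 |
| 20 | xAI | 0.9 |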