Benchmarks
MMLU-Pro
37.1%
GPQA Diamond
24.0%
HLE
5.1%
LiveCodeBench
3.9%
SciCode
3.6%
TerminalBench Hard
0.0%
MATH-500
Not evaluated
AIME
Not evaluated
AIME 2025
0.0%
IFBench
19.7%
Long Context Recall
0.0%
Tau2
0.0%
Quality Index
3.8
472nd of 537
Top 88%
Coding Index
1.2
434th of 447
Top 98%
Math Index
0.0
266th of 269