Performance
143 tokens/sec (faster than 70% of models)
1.13 seconds (faster than 49% of models)
15.16 seconds (faster than 32% of models)
Market Median: 94 tok/s (51% faster)
Median TTFT: 1.11s (1% slower)
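The "faster" and "slower" percentages against the market median can be approximated as a relative difference from the baseline. The sketch below is only an illustration: the formula is an assumption (the page does not say how it computes these figures), the helper name `relative_difference` is hypothetical, and the 1.13 s value is assumed to be this model's time to first token. With the rounded numbers shown on the page, the results come out near, but not exactly at, the displayed 51% and 1%, presumably because the page works from unrounded measurements or rounds differently.

```python
def relative_difference(value: float, baseline: float) -> float:
    """Return how far `value` is from `baseline`, as a percentage of `baseline`.

    Assumed formula; the source page does not document its own calculation.
    """
    return (value - baseline) / baseline * 100.0


# Throughput: 143 tok/s for this model vs. a 94 tok/s market median.
throughput_delta = relative_difference(143, 94)

# Latency: 1.13 s (assumed to be this model's TTFT) vs. a 1.11 s median TTFT.
# A positive value here means a longer wait, i.e. slower.
ttft_delta = relative_difference(1.13, 1.11)

print(f"throughput vs. median: {throughput_delta:+.1f}%")
print(f"TTFT vs. median: {ttft_delta:+.1f}%")
```

For throughput a positive result means faster, while for latency a positive result means slower, so the sign has to be read alongside the metric's direction.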
Speed Comparison
GPT-5 nano (medium)           142 tok/s   -0%
Google: Gemini 2.5 Pro        142 tok/s   -0%
Qwen: Qwen3 VL 8B Instruct    143 tok/s   +0%
Benchmarks
MMLU-Pro                69.6%
GPQA Diamond            41.6%
HLE                     3.3%
LiveCodeBench           29.5%
SciCode                 17.8%
TerminalBench Hard      2.3%
MATH-500                84.7%
AIME                    20.3%
AIME 2025               Not evaluated
IFBench                 31.8%
Long Context Recall     0.0%
Tau2                    0.0%