Performance
143 tokens/sec (faster than 70% of models)
1.13 seconds (faster than 49% of models)
15.16 seconds (faster than 32% of models)
Market Median: 94 tok/s (52% faster)
Median TTFT: 1.10s (2% slower)
Speed Comparison
Qwen: Qwen3 VL 8B Instruct    143 tok/s    +0%
GPT-5 nano (medium)           142 tok/s    -0%
Grok 4.3 (medium)             143 tok/s    +0%
Benchmarks
MMLU-Pro: 69.6%
GPQA Diamond: 41.6%
HLE: 3.3%
LiveCodeBench: 29.5%
SciCode: 17.8%
TerminalBench Hard: 2.3%
MATH-500: 84.7%
AIME: 20.3%
AIME 2025: Not evaluated
IFBench: 31.8%
Long Context Recall: 0.0%
Tau2: 0.0%