Performance
40 tokens/sec (faster than 9% of models)
0.73 seconds (faster than 69% of models)
0.73 seconds (faster than 78% of models)
Market Median: 95 tok/s (this model is 58% slower)
Median TTFT: 1.11s (this model is 34% faster)
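
The two percentages above appear to be relative differences against the market figures. The following is a minimal sketch of that arithmetic, assuming each percentage is computed as (model - median) / median and that the 0.73-second figure is the time to first token (TTFT) being compared; the function name `relative_change` is hypothetical and not taken from the source page.

```python
def relative_change(value: float, baseline: float) -> float:
    """Signed percentage difference of `value` relative to `baseline`."""
    return (value - baseline) / baseline * 100.0


# Throughput: 40 tok/s for this model vs. a 95 tok/s market median.
# A negative result means lower throughput, i.e. slower.
speed_delta = relative_change(40, 95)

# TTFT (assumed): 0.73 s for this model vs. a 1.11 s median.
# A negative result means less waiting time, i.e. faster.
ttft_delta = relative_change(0.73, 1.11)

print(f"Throughput vs. median: {speed_delta:.0f}%")  # -58%
print(f"TTFT vs. median: {ttft_delta:.0f}%")         # -34%
```

Under these assumptions the results reproduce the listed "58% slower" and "34% faster" figures. Note that the sign means opposite things for the two metrics: higher is better for throughput, lower is better for latency.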
Speed Comparison
Microsoft: Phi 4, 40 tok/s (-0%)
Qwen3.5 4B (Non-reasoning), 40 tok/s (-1%)
Claude 4.1 Opus (Reasoning), 41 tok/s (+2%)
Benchmarks
MMLU-Pro: 76.2%
GPQA Diamond: 59.4%
HLE: 3.6%
LiveCodeBench: 44.8%
SciCode: 33.1%
TerminalBench Hard: 18.9%
MATH-500: Not evaluated
AIME: Not evaluated
AIME 2025: 36.7%
IFBench: 38.1%
Long Context Recall: 30.0%
Tau2: 24.9%