Performance
117 tokens/sec (faster than 61% of models)
4.63 seconds (faster than 17% of models)
4.63 seconds (faster than 48% of models)
Market Median: 94 tok/s (this model is 25% faster)
Median TTFT: 1.10s (this model is 319% slower)
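The "faster" and "slower" percentages above read as relative differences against the market median. A minimal sketch of that arithmetic follows, assuming the usual (value - baseline) / baseline formula; the page does not state its formula, and the function name `percent_difference` is hypothetical. With the rounded figures shown here it gives about 24.5% and 321%, close to the displayed 25% and 319%, which were presumably computed from unrounded measurements.

```python
def percent_difference(value: float, baseline: float) -> float:
    """Relative difference of value against baseline, in percent.

    Assumed formula: (value - baseline) / baseline * 100.
    """
    return (value - baseline) / baseline * 100.0


# Throughput: 117 tok/s against the 94 tok/s market median (higher is faster).
throughput_delta = percent_difference(117.0, 94.0)

# Time to first token: 4.63 s against the 1.10 s median (higher is slower).
ttft_delta = percent_difference(4.63, 1.10)

print(f"throughput: {throughput_delta:.1f}% above median")
print(f"TTFT: {ttft_delta:.1f}% above median")
```

For throughput a positive result means faster than the median; for TTFT a positive result means slower, since a longer wait is worse.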
Speed Comparison
Mistral: Mistral Medium 3.5, 117 tok/s (+0%)
GLM-4.7 (Non-reasoning), 117 tok/s (+0%)
Z.ai: GLM 4.7, 116 tok/s (-1%)
Benchmarks
MMLU-Pro: Not evaluated
GPQA Diamond: 63.6%
HLE: 6.0%
LiveCodeBench: Not evaluated
SciCode: 28.4%
TerminalBench Hard: 10.6%
MATH-500: Not evaluated
AIME: Not evaluated
AIME 2025: Not evaluated
IFBench: 43.1%
Long Context Recall: 25.7%
Tau2: 79.5%