Performance
42 tokens/sec (faster than 10% of models)
0.57 seconds (faster than 80% of models)
0.57 seconds (faster than 86% of models)
Market Median: 94 tok/s (55% slower)
Median TTFT: 1.10s (48% faster)
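The two percentages above are consistent with measuring each gap relative to the market median. The page does not state its formula, so this is an assumption, as is reading the 0.57 seconds figure as this model's TTFT. A minimal Python sketch, with percent_vs_median as a hypothetical helper name:

```python
def percent_vs_median(value: float, median: float) -> float:
    """Absolute gap between value and median, as a percentage of the median.

    Assumed formula; the page does not document how it derives its percentages.
    """
    return abs(value - median) / median * 100


# Figures copied from the Performance section.
speed_gap = percent_vs_median(42, 94)      # output speed, tok/s
ttft_gap = percent_vs_median(0.57, 1.10)   # assumed TTFT, seconds

print(f"{speed_gap:.0f}% slower than the median output speed")
print(f"{ttft_gap:.0f}% faster than the median TTFT")
```

Rounded to whole numbers, this reproduces the 55% and 48% shown on the page.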
Speed Comparison
Hermes 4 - Llama-3.1 405B (Reasoning): 42 tok/s (+0%)
Kimi K2.6: 42 tok/s (+0%)
MiMo-V2-Pro: 43 tok/s (+1%)
Benchmarks
MMLU-Pro: 67.8%
GPQA Diamond: 53.2%
HLE: 3.4%
LiveCodeBench: 34.8%
SciCode: 28.8%
TerminalBench Hard: 16.7%
MATH-500: Not evaluated
AIME: Not evaluated
AIME 2025: 34.3%
IFBench: 31.2%
Long Context Recall: 24.0%
Tau2: 23.4%