Performance
Throughput: 188 tokens/sec (faster than 87% of models)
Time to first token (TTFT): 0.18 seconds (faster than 99% of models)
10.82 seconds (faster than 38% of models)
Market Median: 92 tok/s (105% faster)
Median TTFT: 1.13s (84% faster)
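The "faster" percentages can be approximately reproduced from the rounded figures shown here. The sketch below is an assumption about how such percentages are commonly derived, not the page's documented method; the function names are hypothetical. Throughput is treated as higher-is-better and latency as lower-is-better. With the rounded inputs it gives roughly 104% and 84%, so the displayed 105% presumably comes from unrounded measurements.

```python
def percent_faster_throughput(value: float, baseline: float) -> float:
    """Relative throughput gain over a baseline, in percent (higher is faster)."""
    return (value / baseline - 1.0) * 100.0


def percent_faster_latency(value: float, baseline: float) -> float:
    """Relative latency reduction versus a baseline, in percent (lower is faster)."""
    return (1.0 - value / baseline) * 100.0


# Rounded figures taken from the page: 188 tok/s vs. a 92 tok/s market median,
# and 0.18 s TTFT vs. a 1.13 s median TTFT.
throughput_gain = percent_faster_throughput(188, 92)
ttft_gain = percent_faster_latency(0.18, 1.13)

print(f"Throughput: {throughput_gain:.1f}% faster than the median")
print(f"TTFT: {ttft_gain:.1f}% faster than the median")
```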
Speed Comparison
OpenAI: GPT-5.1-Codex, 187 tok/s (-1%)
Gemini 3 Flash Preview (Non-reasoning), 191 tok/s (+1%)
NVIDIA Nemotron 3 Super 120B A12B (Reasoning), 185 tok/s (-2%)
Benchmarks
MMLU-Pro: Not evaluated
GPQA Diamond: 75.7%
HLE: 9.9%
LiveCodeBench: Not evaluated
SciCode: 38.2%
TerminalBench Hard: 31.1%
MATH-500: Not evaluated
AIME: Not evaluated
AIME 2025: Not evaluated
IFBench: 57.6%
Long Context Recall: 32.3%
Tau2: 37.4%