Pricing
Input: $0.11 per 1M tokens
Output: $0.42 per 1M tokens
Blended: $0.19 per 1M tokens
Cheaper than 71% of models. Median price is $0.54/1M tokens.
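The blended figure is consistent with a weighted average of the input and output prices. The sketch below assumes a 3:1 input-to-output token weighting; the page does not state the weighting it uses, and `blended_price` is a hypothetical helper name used only for illustration.

```python
def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted average price per 1M tokens.

    The 3:1 default weighting is an assumption, not something the page states.
    """
    total_weight = input_weight + output_weight
    return (input_price * input_weight + output_price * output_weight) / total_weight


# Prices per 1M tokens, taken from the Pricing figures above.
price = blended_price(0.11, 0.42)
print(f"${price:.4f} per 1M tokens")
```

Under that assumption the result is about $0.1875 per 1M tokens, which is in line with the displayed $0.19.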
Cost Calculator
Tokens per day: 1M (slider range 100K to 100M)
Daily: $0.19
Monthly: $5.64
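The calculator's arithmetic can be approximated as tokens per day, divided by one million, times the blended price. This is a rough sketch: the 30-day month and the function name `usage_cost` are assumptions, and the page may compute from an unrounded price.

```python
def usage_cost(tokens_per_day: float, price_per_million: float,
               days_per_month: int = 30) -> tuple[float, float]:
    """Return (daily, monthly) cost in dollars.

    days_per_month=30 is an assumption; the page does not say how it
    defines a month.
    """
    daily = tokens_per_day / 1_000_000 * price_per_million
    return daily, daily * days_per_month


# 1M tokens per day at the displayed (rounded) blended price of $0.19.
daily, monthly = usage_cost(1_000_000, 0.19)
print(f"Daily: ${daily:.2f}, Monthly: ${monthly:.2f}")
```

With the rounded $0.19 price this gives $5.70 for a 30-day month, slightly above the displayed $5.64, which suggests the page works from an unrounded blended price.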
vs. Similar Models
Apertus 8B Instruct (Q: 0.0): $0.13 (-34%)
Gemma 3 4B Instruct (Q: +0.1): $0.05 (-73%)
Llama 3.2 Instruct 1B (Q: +0.1): $0.05 (-73%)
Gemma 3n E4B Instruct (Q: +0.2): $0.03 (-87%)
Performance
222 tokens/sec: faster than 91% of models
0.91 seconds: faster than 61% of models
0.91 seconds: faster than 73% of models
Market Median: 94 tok/s (137% faster)
Median TTFT: 1.10s (18% faster)
Throughput/Dollar: 1180 tok/s per $/1M
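The comparison figures above can be approximated from the displayed numbers. The formulas below are assumptions about how the page derives them (relative difference against the median, and throughput divided by blended price), and the function names are hypothetical. Because the displayed inputs are rounded, the results land near the page's values rather than exactly on them.

```python
def percent_faster_throughput(model_tps: float, median_tps: float) -> float:
    """Percent by which the model's tokens/sec exceeds the median."""
    return (model_tps / median_tps - 1.0) * 100.0


def percent_faster_latency(model_seconds: float, median_seconds: float) -> float:
    """Percent by which the model's latency is below the median."""
    return (1.0 - model_seconds / median_seconds) * 100.0


def throughput_per_dollar(tokens_per_sec: float, price_per_million: float) -> float:
    """Tokens/sec per dollar of blended price per 1M tokens."""
    return tokens_per_sec / price_per_million


# Inputs are the rounded values displayed on the page.
print(f"{percent_faster_throughput(222, 94):.1f}% faster than median throughput")
print(f"{percent_faster_latency(0.91, 1.10):.1f}% faster than median TTFT")
print(f"{throughput_per_dollar(222, 0.19):.0f} tok/s per $/1M")
```

These evaluate to roughly 136%, 17%, and 1168, against the displayed 137%, 18%, and 1180.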
Speed Comparison
OpenAI: o3 Mini: 221 tok/s (-0%)
NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning): 220 tok/s (-1%)
Gemini 2.5 Flash (Reasoning): 224 tok/s (+1%)
Benchmarks
MMLU-Pro: 23.1%
GPQA Diamond: 23.1%
HLE: 5.2%
LiveCodeBench: 7.3%
SciCode: 4.1%
TerminalBench Hard: 0.0%
MATH-500: 52.1%
AIME: 1.7%
AIME 2025: 10.3%
IFBench: 21.9%
Long Context Recall: 0.0%
Tau2: 14.6%