Pricing
Input: $2.00 per 1M tokens
Output: $5.00 per 1M tokens
Blended: $2.75 per 1M tokens
Cheaper than 23% of models. Median price is $0.54/1M tokens.
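The blended figure is consistent with a 3:1 input-to-output token weighting: (3 × $2.00 + 1 × $5.00) / 4 = $2.75. The page does not state the ratio, so the weighting below is an assumption. A minimal Python sketch of that calculation (the function name and parameters are illustrative):

```python
def blended_price(input_price: float, output_price: float,
                  input_parts: float = 3.0, output_parts: float = 1.0) -> float:
    """Weighted average price per 1M tokens.

    The default 3:1 input:output weighting is an assumption; it happens
    to reproduce the $2.75 blended price listed above.
    """
    total_parts = input_parts + output_parts
    return (input_price * input_parts + output_price * output_parts) / total_parts


print(f"${blended_price(2.00, 5.00):.2f} per 1M tokens")
```

A workload with a different input/output mix can pass its own ratio, for example `blended_price(2.00, 5.00, input_parts=1, output_parts=1)` for an even split.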
Cost Calculator
Tokens per day: 1M (range 100K to 100M)
Daily: $2.75
Monthly: $82.50
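The calculator's numbers can be reproduced from the blended price. The monthly figure equals 30 times the daily one ($82.50 / $2.75 = 30), so a 30-day month is assumed here; the page does not say so explicitly. A rough sketch, with hypothetical names:

```python
BLENDED_PRICE_PER_M = 2.75  # USD per 1M tokens, from the Pricing section
DAYS_PER_MONTH = 30         # assumption inferred from 82.50 / 2.75 = 30


def estimate_cost(tokens_per_day: int,
                  price_per_m: float = BLENDED_PRICE_PER_M) -> tuple[float, float]:
    """Return (daily, monthly) cost in USD for a given daily token volume."""
    daily = tokens_per_day / 1_000_000 * price_per_m
    return daily, daily * DAYS_PER_MONTH


# Evaluate the two ends of the calculator's range and its default value.
for tokens in (100_000, 1_000_000, 100_000_000):
    daily, monthly = estimate_cost(tokens)
    print(f"{tokens:>11,} tokens/day: ${daily:,.2f}/day, ${monthly:,.2f}/month")
```

Because the estimate uses the blended price, it is only as accurate as the assumed input/output mix; a workload dominated by output tokens would cost closer to $5.00 per 1M.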
vs. Similar Models
Qwen3 VL 32B (Reasoning), Q: 0.0, $2.63 (-5%)
Nova 2.0 Lite (low), Q: -0.1, $0.85 (-69%)
Perplexity: Sonar Reasoning Pro, Q: -0.1, $3.50 (+27%)
Qwen3 Coder 480B A35B Instruct, Q: +0.1, $0.68 (-75%)
Performance
39 tokens/sec: faster than 7% of models
0.59 seconds (TTFT): faster than 78% of models
52.24 seconds: faster than 6% of models
Market Median: 94 tok/s (this model is 59% slower)
Median TTFT: 1.11s (this model is 47% faster)
Throughput/Dollar: 14 tok/s per $/1M
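The comparison figures follow from the raw numbers on this page. The sketch below assumes the 0.59-second figure is this model's time to first token (TTFT) and that throughput per dollar divides output speed by the blended price; both are inferences from the listed values, not stated definitions.

```python
def percent_diff(value: float, reference: float) -> float:
    """Signed difference of value relative to reference, in percent."""
    return (value - reference) / reference * 100


output_speed = 39.0   # tok/s, this model
median_speed = 94.0   # tok/s, market median
ttft = 0.59           # seconds, assumed to be this model's TTFT
median_ttft = 1.11    # seconds, median TTFT
blended_price = 2.75  # USD per 1M tokens

print(round(percent_diff(output_speed, median_speed)))  # -59 (59% slower)
print(round(percent_diff(ttft, median_ttft)))           # -47 (47% lower latency)
print(round(output_speed / blended_price))              # 14 tok/s per $/1M
```

A negative speed difference means slower output, while a negative TTFT difference means a shorter wait, which is why the page labels the first "slower" and the second "faster".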
Speed Comparison
Hermes 4 - Llama-3.1 405B (Non-reasoning), 39 tok/s (+0%)
OpenAI: GPT-4 Turbo, 38 tok/s (-1%)
Qwen3.6 Max Preview, 38 tok/s (-1%)
Benchmarks
MMLU-Pro: 81.5%
GPQA Diamond: 73.9%
HLE: 9.6%
LiveCodeBench: 75.0%
SciCode: 39.2%
TerminalBench Hard: 12.9%
MATH-500: Not evaluated
AIME: Not evaluated
AIME 2025: 82.0%
IFBench: 43.0%
Long Context Recall: 51.3%
Tau2: 52.0%