Related Models
Hermes 4 - Llama-3.1 405B (Reasoning), 2025-08-27
Hermes 4 - Llama-3.1 405B (Non-reasoning), 2025-08-27
Hermes 4 - Llama-3.1 70B (Non-reasoning), 2025-08-27
Nous: Hermes 4 405B, 2025-08-26
Nous: Hermes 4 70B, 2025-08-26
Nous: Hermes 3 70B Instruct, 2024-08-18
Nous: Hermes 3 405B Instruct, 2024-08-16
Nous: Hermes 3 405B Instruct (free), 2024-08-16
Pricing
Input: $0.13 per 1M tokens
Output: $0.40 per 1M tokens
Blended: $0.20 per 1M tokens
Cheaper than 70% of models. Median price is $0.54/1M tokens.
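The page does not say how the blended figure is derived. As a rough sketch, assuming a 3:1 input-to-output token weighting (a common convention, not confirmed by the source), the listed input and output prices give a value that rounds to the displayed $0.20. The function name and weights below are illustrative.

```python
def blended_price(input_price, output_price, input_weight=3.0, output_weight=1.0):
    """Weighted average of input and output prices (USD per 1M tokens).

    The 3:1 default weighting is an assumption, not something the page states.
    """
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


# Listed prices: $0.13 input, $0.40 output per 1M tokens.
price = blended_price(0.13, 0.40)
print(f"Blended: about ${price:.2f} per 1M tokens")
```

A different weighting (for example 1:1) would give a different blended price, so treat the ratio as a parameter to tune to your own traffic mix.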
Cost Calculator
Tokens per day: 1M (adjustable from 100K to 100M)
Daily: $0.20
Monthly: $5.94
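The calculator's arithmetic can be approximated as follows. This is a minimal sketch that assumes cost scales linearly with tokens at the blended price and that a month is 30 days; neither assumption is stated on the page. With the rounded $0.20 price it yields $6.00 per month, slightly above the displayed $5.94, which suggests the page works from an unrounded blended price.

```python
def daily_cost(tokens_per_day, price_per_million):
    """Cost in USD for one day of traffic at a per-1M-token price."""
    return tokens_per_day / 1_000_000 * price_per_million


def monthly_cost(tokens_per_day, price_per_million, days_per_month=30):
    """Cost in USD for a month; the 30-day month is an assumption."""
    return daily_cost(tokens_per_day, price_per_million) * days_per_month


# 1M tokens per day at the rounded blended price of $0.20 per 1M tokens.
daily = daily_cost(1_000_000, 0.20)
monthly = monthly_cost(1_000_000, 0.20)
print(f"Daily: ${daily:.2f}, Monthly: ${monthly:.2f}")
```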
vs. Similar Models
Qwen: Qwen3 VL 30B A3B Instruct (Q: 0.0): $0.23, +15%
Meta: Llama 4 Scout (Q: 0.0): $0.15, -24%
Qwen3 14B (Reasoning) (Q: +0.1): $0.73, +269%
Claude 3.5 Sonnet (Oct '24) (Q: -0.1): $6.00, +2930%
Performance
92 tokens/sec (faster than 49% of models)
0.61 seconds to first token (faster than 76% of models)
22.28 seconds (faster than 22% of models)
Market Median: 94 tok/s (this model is 1% slower)
Median TTFT: 1.10s (this model is 44% faster)
Throughput/Dollar: 466 tok/s per $/1M
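The comparison figures above can be roughly reproduced from the displayed numbers. The sketch below assumes the percentages are relative differences against the median and that throughput per dollar is output speed divided by the blended price; the page does not confirm either definition. Because the inputs here are the rounded values shown, the results land near, not exactly on, the page's 1%, 44%, and 466.

```python
def percent_difference(value, baseline):
    """Relative difference of value against baseline, in percent."""
    return (value - baseline) / baseline * 100


def throughput_per_dollar(tokens_per_sec, price_per_million):
    """Tokens/sec per dollar of blended price (USD per 1M tokens)."""
    return tokens_per_sec / price_per_million


speed_gap = percent_difference(92, 94)      # negative: slower than the median
ttft_gap = percent_difference(0.61, 1.10)   # negative: lower latency than the median
value = throughput_per_dollar(92, 0.20)     # uses the rounded $0.20 blended price

print(f"Speed vs. median: {speed_gap:.1f}%")
print(f"TTFT vs. median: {ttft_gap:.1f}%")
print(f"Throughput per dollar: about {value:.0f} tok/s per $/1M")
```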
Speed Comparison
MiMo-V2-Flash (Non-reasoning): 92 tok/s, -0%
Reka Flash 3: 93 tok/s, +1%
OpenAI: GPT-5.1: 94 tok/s, +1%
Benchmarks
MMLU-Pro: 81.1%
GPQA Diamond: 69.9%
HLE: 7.9%
LiveCodeBench: 65.3%
SciCode: 34.1%
TerminalBench Hard: 4.5%
MATH-500: Not evaluated
AIME: Not evaluated
AIME 2025: 68.7%
IFBench: 31.3%
Long Context Recall: 6.7%
Tau2: 22.5%