Related Models
Hermes 4 - Llama-3.1 70B (Reasoning), 2025-08-27
Hermes 4 - Llama-3.1 405B (Reasoning), 2025-08-27
Hermes 4 - Llama-3.1 405B (Non-reasoning), 2025-08-27
Nous: Hermes 4 405B, 2025-08-26
Nous: Hermes 4 70B, 2025-08-26
Nous: Hermes 3 70B Instruct, 2024-08-18
Nous: Hermes 3 405B Instruct, 2024-08-16
Nous: Hermes 3 405B Instruct (free), 2024-08-16
Pricing
Input: $0.13 per 1M tokens
Output: $0.40 per 1M tokens
Blended: $0.20 per 1M tokens
Cheaper than 70% of models. Median price is $0.54/1M tokens.
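The page does not say how the blended price is derived. A minimal sketch, assuming a 3:1 input-to-output token weighting (an assumption, not stated in the source), gives a value that rounds to the listed $0.20. The function name and weights are hypothetical.

```python
def blended_price(input_price, output_price, input_weight=3.0, output_weight=1.0):
    """Weighted average price per 1M tokens.

    The 3:1 default weighting is an assumption; the page does not
    document the ratio it uses.
    """
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


# Listed prices: $0.13 input, $0.40 output (per 1M tokens).
price = blended_price(0.13, 0.40)
print(f"${price:.4f} per 1M tokens")
```

A workload with a different input/output mix would shift the blended figure, so it may be worth passing your own weights rather than relying on the default.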
Cost Calculator
Tokens per day: 1M (adjustable from 100K to 100M)
Daily: $0.20
Monthly: $5.94
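The calculator figures can be approximated with a short sketch. It assumes a 30-day month and an unrounded blended price of about $0.198 per 1M tokens; both are inferred from the $5.94 monthly figure rather than stated on the page, and the function name is hypothetical.

```python
def usage_cost(tokens_per_day, price_per_million, days_per_month=30):
    """Return (daily_cost, monthly_cost) in dollars.

    days_per_month=30 is an assumption; the page does not state
    the month length it uses.
    """
    daily = tokens_per_day / 1_000_000 * price_per_million
    monthly = daily * days_per_month
    return daily, monthly


# 1M tokens per day at an assumed blended price of $0.198 per 1M tokens.
daily, monthly = usage_cost(1_000_000, 0.198)
print(f"Daily: ${daily:.2f}, Monthly: ${monthly:.2f}")
```

Using the rounded $0.20 price instead would give $6.00 per month, which is why the unrounded price is assumed here.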
vs. Similar Models
Google: Gemini 2.5 Flash Lite, Q: 0.0, $0.17, -12%
Mistral Small 3, Q: 0.0, $0.10, -47%
Nova Lite, Q: 0.0, $0.10, -47%
OpenAI: GPT-4o-mini, Q: 0.0, $0.26, +33%
Performance
90 tokens/sec, faster than 48% of models
0.62 seconds, faster than 75% of models
0.62 seconds, faster than 83% of models
Market Median: 94 tok/s (3% slower)
Median TTFT: 1.10s (44% faster)
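The comparisons against the median can be sketched as a relative difference. The helper below is hypothetical; it treats a positive result as "better than median" and flips the sign for metrics where lower is better, such as time to first token. With the rounded figures shown, the TTFT case gives about 44%, while the throughput case gives about 4% rather than the listed 3%, presumably because the page works from unrounded values.

```python
def percent_vs_median(value, median, lower_is_better=False):
    """Relative difference from the median, in percent.

    Positive means better than the median, negative means worse.
    """
    if lower_is_better:
        return (median - value) / median * 100
    return (value - median) / median * 100


# Throughput: 90 tok/s against a 94 tok/s median (higher is better).
speed_delta = percent_vs_median(90, 94)
# Time to first token: 0.62 s against a 1.10 s median (lower is better).
ttft_delta = percent_vs_median(0.62, 1.10, lower_is_better=True)
print(f"speed: {speed_delta:.1f}%, TTFT: {ttft_delta:.1f}%")
```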
Throughput/Dollar: 457 tok/s per $/1M
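Going by its unit, the throughput-per-dollar figure appears to be output speed divided by the blended price per 1M tokens. A minimal sketch under that assumption follows; the function name is hypothetical, and the $0.1975 price is the assumed unrounded blended value. Rounded inputs land near, but not exactly on, the listed 457.

```python
def throughput_per_dollar(tokens_per_sec, blended_price_per_million):
    """Tokens per second per dollar of blended price (per 1M tokens)."""
    if blended_price_per_million <= 0:
        raise ValueError("price must be positive")
    return tokens_per_sec / blended_price_per_million


# 90 tok/s at an assumed unrounded blended price of $0.1975 per 1M tokens.
ratio = throughput_per_dollar(90, 0.1975)
print(f"{ratio:.0f} tok/s per $/1M")
```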
Speed Comparison
Qwen3 32B (Non-reasoning): 91 tok/s, +0%
Qwen3.5 27B (Non-reasoning): 90 tok/s, -0%
Grok 4 Fast (Reasoning): 90 tok/s, -1%
Benchmarks
MMLU-Pro: 66.4%
GPQA Diamond: 49.1%
HLE: 3.6%
LiveCodeBench: 26.9%
SciCode: 27.7%
TerminalBench Hard: 0.0%
MATH-500: Not evaluated
AIME: Not evaluated
AIME 2025: 11.3%
IFBench: 29.0%
Long Context Recall: 2.0%
Tau2: 21.6%