Hermes 4 - Llama-3.1 405B (Non-reasoning)
Nous Research · Released 2025-08-27
Open Source
Related Models
Hermes 4 - Llama-3.1 70B (Reasoning), 2025-08-27
Hermes 4 - Llama-3.1 405B (Reasoning), 2025-08-27
Hermes 4 - Llama-3.1 70B (Non-reasoning), 2025-08-27
Nous: Hermes 4 405B, 2025-08-26
Nous: Hermes 4 70B, 2025-08-26
Nous: Hermes 3 70B Instruct, 2024-08-18
Nous: Hermes 3 405B Instruct, 2024-08-16
Nous: Hermes 3 405B Instruct (free), 2024-08-16
Pricing
Input: $1.00 per 1M tokens
Output: $3.00 per 1M tokens
Blended: $1.50 per 1M tokens
Cheaper than 29% of models. Median price is $0.54/1M tokens.
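The blended figure is consistent with a weighted average of the input and output prices at a 3:1 input-to-output token ratio. The page does not state that ratio, so treat it as an assumption; the function name `blended_price` and its parameters are illustrative only. A minimal sketch:

```python
def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted average price per 1M tokens.

    The default 3:1 input:output weighting is an assumption that happens
    to reproduce the listed blended price; it is not stated on the page.
    """
    total_weight = input_weight + output_weight
    return (input_price * input_weight + output_price * output_weight) / total_weight


# Listed prices: $1.00 input, $3.00 output (per 1M tokens).
print(f"${blended_price(1.00, 3.00):.2f} per 1M tokens")  # $1.50 per 1M tokens
```

A workload with a different input/output mix would pay a different effective rate, so it may be worth passing your own weights rather than relying on the default.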
Cost Calculator
Tokens per day: 1M (range: 100K to 100M)
Daily: $1.50
Monthly: $45.00
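The calculator values above can be reproduced by multiplying daily token volume by the blended price and, for the monthly figure, by a 30-day month. The 30-day convention is inferred from the listed numbers rather than documented, and `usage_cost` is a hypothetical helper name. A rough sketch:

```python
def usage_cost(tokens_per_day: int, price_per_million: float,
               days_per_month: int = 30) -> tuple[float, float]:
    """Return (daily_cost, monthly_cost) in dollars.

    Assumes a flat blended price per 1M tokens and a 30-day month;
    both are inferred from the listed figures, not documented.
    """
    daily = tokens_per_day / 1_000_000 * price_per_million
    return daily, daily * days_per_month


daily, monthly = usage_cost(1_000_000, 1.50)
print(f"Daily: ${daily:.2f}, Monthly: ${monthly:.2f}")  # Daily: $1.50, Monthly: $45.00
```

At the upper end of the listed range (100M tokens per day), the same formula gives $150.00 per day.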
vs. Similar Models
Gemini 2.0 Flash-Lite (Feb '25): $0.13 (-91%), Q: 0.0
NVIDIA Nemotron Nano 9B V2 (Reasoning): $0.07 (-95%), Q: 0.0
Qwen3.5 2B (Non-reasoning): $0.04 (-97%), Q: 0.0
Gemma 4 E4B (Non-reasoning): $0.54 (-64%), Q: +0.1
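The percentages in this comparison match the relative difference between each model's price and this model's $1.50 blended price. That reading is an inference from the numbers, and `price_delta_pct` is a hypothetical name used only for illustration:

```python
def price_delta_pct(other_price: float, reference_price: float) -> float:
    """Relative price difference of another model versus the reference, in percent."""
    return (other_price - reference_price) / reference_price * 100


REFERENCE = 1.50  # blended price of this model, per 1M tokens

# Prices as listed in the comparison above.
listed = {
    "Gemini 2.0 Flash-Lite (Feb '25)": 0.13,
    "NVIDIA Nemotron Nano 9B V2 (Reasoning)": 0.07,
    "Qwen3.5 2B (Non-reasoning)": 0.04,
    "Gemma 4 E4B (Non-reasoning)": 0.54,
}

for name, price in listed.items():
    print(f"{name}: {price_delta_pct(price, REFERENCE):+.0f}%")
```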
Performance
40 tokens/sec (faster than 9% of models)
0.81 seconds (faster than 65% of models)
0.81 seconds (faster than 76% of models)
Market Median: 94 tok/s (58% slower)
Median TTFT: 1.10s (27% faster)
Throughput/Dollar: 27 tok/s per $/1M
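The throughput-per-dollar figure appears to be output speed divided by the blended price per 1M tokens. Using the blended price as the divisor is an assumption that matches the listed value after rounding; `throughput_per_dollar` is a hypothetical helper name:

```python
def throughput_per_dollar(tokens_per_sec: float, price_per_million: float) -> float:
    """Tokens per second obtained per dollar of per-1M-token price."""
    if price_per_million <= 0:
        raise ValueError("price_per_million must be positive")
    return tokens_per_sec / price_per_million


# 40 tok/s at an assumed divisor of $1.50 (the blended price).
print(f"{throughput_per_dollar(40, 1.50):.0f} tok/s per $/1M")  # 27 tok/s per $/1M
```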
Speed Comparison
Devstral 2: 40 tok/s (+0%)
Qwen3.5 4B (Non-reasoning): 40 tok/s (-0%)
Devstral Small (Jul '25): 40 tok/s (+0%)
Benchmarks
MMLU-Pro: 72.9%
GPQA Diamond: 53.6%
HLE: 4.2%
LiveCodeBench: 54.6%
SciCode: 34.6%
TerminalBench Hard: 9.8%
MATH-500: Not evaluated
AIME: Not evaluated
AIME 2025: 15.3%
IFBench: 34.8%
Long Context Recall: 20.0%
Tau2: 26.6%