Hermes 4 - Llama-3.1 70B (Reasoning)

Nous Research · Released 2025-08-27
Open Source

Pricing

Input: $0.13 per 1M tokens
Output: $0.40 per 1M tokens
Blended: $0.20 per 1M tokens

Cheaper than 70% of models. Median price is $0.54/1M tokens.
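
The blended figure is consistent with a weighted average of the input and output prices. The sketch below assumes a 3:1 input-to-output token ratio; that weighting is an assumption, since the page does not state how the blend is computed, and `blended_price` is a hypothetical helper name.

```python
def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted average price per 1M tokens.

    The 3:1 default weighting is an assumption, not taken from the page.
    """
    total_weight = input_weight + output_weight
    return (input_price * input_weight + output_price * output_weight) / total_weight


# Prices per 1M tokens as listed above.
price = blended_price(0.13, 0.40)
print(f"${price:.2f} per 1M tokens")
```

Under this assumption the unrounded result is about $0.1975, which displays as $0.20.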

Cost Calculator

Tokens per day: 1M (range: 100K to 100M)
Daily: $0.20
Monthly: $5.94
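
The calculator's arithmetic can be sketched as tokens per day, divided by one million, times the blended price. The monthly figure of $5.94 corresponds to $0.198 per day over 30 days, so this sketch assumes an unrounded blended price of $0.198 and a 30-day month. Both values are inferred from the displayed numbers rather than stated on the page, and `usage_cost` is a hypothetical helper name.

```python
def usage_cost(tokens_per_day: float, price_per_million: float,
               days_per_month: int = 30) -> tuple[float, float]:
    """Return (daily, monthly) cost in dollars.

    days_per_month=30 is an assumption inferred from the displayed figures.
    """
    daily = tokens_per_day / 1_000_000 * price_per_million
    return daily, daily * days_per_month


# 1M tokens per day at an inferred blended price of $0.198 per 1M tokens.
daily, monthly = usage_cost(1_000_000, 0.198)
print(f"Daily: ${daily:.2f}, Monthly: ${monthly:.2f}")
```

Using the rounded $0.20 instead would give $6.00 per month, which is why the unrounded price is assumed here.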

vs. Similar Models

Qwen: Qwen3 VL 30B A3B Instruct · Q: 0.0 · $0.23 · +15%
Meta: Llama 4 Scout · Q: 0.0 · $0.15 · -24%
Qwen3 14B (Reasoning) · Q: +0.1 · $0.73 · +269%
Claude 3.5 Sonnet (Oct '24) · Q: -0.1 · $6.00 · +2930%
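
The percentages read as each model's price relative to this model's blended price. A rough sketch, assuming a baseline of $0.198 per 1M tokens (inferred from the displayed figures, not stated on the page) and a hypothetical helper `price_delta_percent`:

```python
def price_delta_percent(other_price: float, baseline_price: float) -> float:
    """Percent by which other_price differs from baseline_price."""
    return (other_price - baseline_price) / baseline_price * 100


BASELINE = 0.198  # assumed unrounded blended price for this model, in $/1M tokens

# Displayed prices from the comparison list.
similar_models = {
    "Qwen: Qwen3 VL 30B A3B Instruct": 0.23,
    "Meta: Llama 4 Scout": 0.15,
    "Qwen3 14B (Reasoning)": 0.73,
    "Claude 3.5 Sonnet (Oct '24)": 6.00,
}

for name, model_price in similar_models.items():
    print(f"{name}: {price_delta_percent(model_price, BASELINE):+.0f}%")
```

With these inputs the sketch matches three of the four listed values. The first comes out at +16% rather than +15%, plausibly because the displayed prices are themselves rounded.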

Performance

92 tokens/sec (faster than 49% of models)
0.61 seconds to first token (faster than 76% of models)
22.28 seconds (faster than 22% of models)

Market Median: 94 tok/s (1% slower)
Median TTFT: 1.10s (44% faster)
Throughput/Dollar: 466 tok/s per $/1M
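
The unit "tok/s per $/1M" suggests that Throughput/Dollar is output speed divided by the blended price per 1M tokens. A minimal sketch under that reading, using an assumed unrounded blended price of $0.1975 (a 3:1 weighting of the listed input and output prices, which the page does not confirm); `throughput_per_dollar` is a hypothetical helper name.

```python
def throughput_per_dollar(tokens_per_second: float, price_per_million: float) -> float:
    """Output speed divided by price per 1M tokens (tok/s per $/1M)."""
    if price_per_million <= 0:
        raise ValueError("price_per_million must be positive")
    return tokens_per_second / price_per_million


# 92 tok/s at an assumed unrounded blended price of $0.1975 per 1M tokens.
ratio = throughput_per_dollar(92, 0.1975)
print(f"{ratio:.0f} tok/s per $/1M")
```

Dividing by the rounded $0.20 instead gives 460, so the listed 466 is only reproduced when an unrounded price is used.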

Speed Comparison

MiMo-V2-Flash (Non-reasoning) · 92 tok/s · -0%
Reka Flash 3 · 93 tok/s · +1%
OpenAI: GPT-5.1 · 94 tok/s · +1%

Benchmarks

MMLU-Pro: 81.1%
GPQA Diamond: 69.9%
HLE: 7.9%
LiveCodeBench: 65.3%
SciCode: 34.1%
TerminalBench Hard: 4.5%
MATH-500: Not evaluated
AIME: Not evaluated
AIME 2025: 68.7%
IFBench: 31.3%
Long Context Recall: 6.7%
Tau2: 22.5%