Meta: Llama 4 Scout

Meta · Released 2025-04-05
Open Source · 10.0M ctx · Multimodal

About

Llama 4 Scout 17B Instruct (16E) is a mixture-of-experts (MoE) language model developed by Meta, activating 17 billion parameters out of a total of 109B. It supports native multimodal input...

Pricing

Input: $0.10 per 1M tokens
Output: $0.30 per 1M tokens
Blended: $0.15 per 1M tokens

Cheaper than 75% of models. Median price is $0.54/1M tokens.
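
The page does not say how the blended price is derived. One weighting that reproduces the listed $0.15 is three input tokens for every output token. The sketch below assumes that 3:1 ratio, and the function name `blended_price` is hypothetical.

```python
def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted average price per 1M tokens.

    The 3:1 input:output default is an assumption, not something the page states.
    """
    total_weight = input_weight + output_weight
    return (input_price * input_weight + output_price * output_weight) / total_weight


# Listed prices per 1M tokens: $0.10 input, $0.30 output.
print(f"${blended_price(0.10, 0.30):.2f} per 1M tokens")
```

A workload with a different input/output mix can pass its own weights, which moves the result between the $0.10 and $0.30 endpoints.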

Cost Calculator

Tokens per day: 1M (range: 100K to 100M)
Daily: $0.15
Monthly: $4.50
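
The calculator figures can be reproduced with simple arithmetic. This is a minimal sketch that assumes the calculator uses the $0.15 blended price and a 30-day month (inferred from $4.50 / $0.15 = 30); `token_cost` is a hypothetical helper, not part of any listed API.

```python
def token_cost(tokens_per_day: int, price_per_million: float, days: int = 30):
    """Return (daily, monthly) cost in dollars.

    days=30 is an assumption inferred from the listed daily and monthly totals.
    """
    daily = tokens_per_day / 1_000_000 * price_per_million
    return daily, daily * days


# 1M tokens per day at the blended price of $0.15 per 1M tokens.
daily, monthly = token_cost(1_000_000, 0.15)
print(f"Daily ${daily:.2f}, monthly ${monthly:.2f}")
```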

vs. Similar Models

Model                                   Q      Price   vs. this model
Qwen: Qwen3 VL 30B A3B Instruct         0.0    $0.23   +52%
Hermes 4 - Llama-3.1 70B (Reasoning)    0.0    $0.20   +32%
Qwen3 14B (Reasoning)                   +0.1   $0.73   +387%
Claude 3.5 Sonnet (Oct '24)             -0.1   $6.00   +3900%

Performance

109 tokens/sec (faster than 58% of models)
0.61 seconds (faster than 76% of models)
0.61 seconds (faster than 83% of models)

Market median: 94 tok/s (this model is 16% faster)
Median TTFT: 1.11s (this model is 45% faster)
Throughput/Dollar: 730 tok/s per $/1M
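
The relative figures above can be checked against the listed numbers. The formulas below are inferred from those numbers rather than documented by the page, and the function names are hypothetical. Throughput per dollar is assumed to be tokens/sec divided by the blended price; the rounded inputs give about 727, so the listed 730 presumably comes from unrounded values.

```python
def percent_faster_throughput(model_tps: float, median_tps: float) -> float:
    """How much higher the model's throughput is than the median, in percent."""
    return (model_tps / median_tps - 1) * 100


def percent_faster_latency(model_seconds: float, median_seconds: float) -> float:
    """How much lower the model's latency is than the median, in percent."""
    return (1 - model_seconds / median_seconds) * 100


def throughput_per_dollar(tps: float, price_per_million: float) -> float:
    """Tokens/sec per dollar-per-1M-tokens (assumed definition)."""
    return tps / price_per_million


speed_gain = percent_faster_throughput(109, 94)    # listed as 16% faster
ttft_gain = percent_faster_latency(0.61, 1.11)     # listed as 45% faster
value = throughput_per_dollar(109, 0.15)           # listed as 730

print(f"{speed_gain:.0f}% faster throughput, {ttft_gain:.0f}% faster TTFT")
print(f"{value:.0f} tok/s per $/1M")
```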

Speed Comparison

Model                              Speed       vs. this model
Llama 3.2 Instruct 11B (Vision)    110 tok/s   +0%
GPT-5.4 (low)                      109 tok/s   -0%
Kwaipilot: KAT-Coder-Pro V2        109 tok/s   -0%

Context Window

10.0M tokens (larger than 100% of models)

Max Output: 16K tokens (about 0.16% of the context window, displayed as 0%)

Benchmarks

MMLU-Pro: 75.2%
GPQA Diamond: 58.7%
HLE: 4.3%
LiveCodeBench: 29.9%
SciCode: 17.0%
TerminalBench Hard: 1.5%
MATH-500: 84.4%
AIME: 28.3%
AIME 2025: 14.0%
IFBench: 39.5%
Long Context Recall: 25.8%
Tau2: 15.5%