Meta: Llama 4 Maverick

Meta · Released 2025-04-05

Open Source · 1.0M ctx · Multimodal

About

Llama 4 Maverick 17B Instruct (128E) is a high-capacity multimodal language model from Meta, built on a mixture-of-experts (MoE) architecture with 128 experts and 17 billion active parameters per forward...

Quality Index

18.4

271st of 507

Top 53%

Coding Index

15.6

241st of 417

Top 58%

Math Index

19.3

211th of 269

Top 78%

Price/1M

$0.26

218th cheapest

53% below median

Top 35%

Speed

112 tok/s

Top 37%

TTFT

0.64s

Context Window

1.0M

15th largest

Top 9%

Market Position

Meta: Llama 4 Maverick vs. Market Average

Pricing

Input

$0.15

per 1M tokens

Output

$0.60

per 1M tokens

Blended

$0.26

per 1M tokens

Cheaper than 65% of models. Median price is $0.56/1M tokens.
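
The page does not say how the blended price is derived. The listed figures are consistent with a 3:1 input-to-output token weighting, so the sketch below assumes that weighting; the function name and weights are illustrative, not taken from the source.

```python
# Prices in USD per 1M tokens, as listed in the Pricing section.
INPUT_PRICE = 0.15
OUTPUT_PRICE = 0.60


def blended_price(input_price, output_price, input_weight=3, output_weight=1):
    """Weighted average of input and output prices.

    The 3:1 default weighting is an assumption; the source does not state it.
    """
    total_weight = input_weight + output_weight
    return (input_price * input_weight + output_price * output_weight) / total_weight


blended = blended_price(INPUT_PRICE, OUTPUT_PRICE)
print(f"Blended: ${blended:.4f} per 1M tokens")
```

Under this assumption the unrounded result is about $0.2625, which the page displays as $0.26.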

Cost Calculator

Tokens per day: 1M (range 100K to 100M)

Daily

$0.26

Monthly

$7.87
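
The calculator figures can be approximated with simple arithmetic. This is a rough sketch, assuming a 30-day month and an unrounded blended price of $0.2625 per 1M tokens (both assumptions, since the page shows only the rounded $0.26); the function names are hypothetical.

```python
# Assumed unrounded blended price in USD per 1M tokens (the page displays $0.26).
BLENDED_PRICE = 0.2625


def daily_cost(tokens_per_day, price_per_million=BLENDED_PRICE):
    """Cost in USD for one day of usage at the given token volume."""
    return tokens_per_day / 1_000_000 * price_per_million


def monthly_cost(tokens_per_day, price_per_million=BLENDED_PRICE, days=30):
    """Cost in USD over `days` days; 30 days per month is an assumption."""
    return daily_cost(tokens_per_day, price_per_million) * days


for volume in (100_000, 1_000_000, 100_000_000):
    print(f"{volume:>11,} tokens/day: "
          f"${daily_cost(volume):.2f} daily, ${monthly_cost(volume):.2f} monthly")
```

At 1M tokens per day this gives roughly $0.26 daily and just under $7.88 monthly, in line with the displayed values up to rounding.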

vs. Similar Models

Gemini 2.0 Flash (Feb '25) · Q: +0.1

$0.26 · -0%

Magistral Small 1.2 · Q: -0.2

$0.75 · +186%

GPT-4o (Aug '24) · Q: +0.2

$4.38 · +1567%

Hermes 4 - Llama-3.1 405B (Reasoning) · Q: +0.2

$1.50 · +471%
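
The percentage next to each price appears to be the difference relative to this model's blended price. A minimal sketch, assuming the baseline is the unrounded $0.2625 rather than the displayed $0.26 (an assumption; the function name is hypothetical):

```python
# Assumed unrounded blended price of Llama 4 Maverick, USD per 1M tokens.
BASELINE_PRICE = 0.2625


def price_delta_percent(other_price, baseline_price=BASELINE_PRICE):
    """Percentage by which `other_price` exceeds (or undercuts) the baseline."""
    return (other_price - baseline_price) / baseline_price * 100


# Blended prices as listed in the comparison above.
listed = {
    "Magistral Small 1.2": 0.75,
    "Hermes 4 - Llama-3.1 405B (Reasoning)": 1.50,
}

for name, price in listed.items():
    print(f"{name}: {price_delta_percent(price):+.0f}%")
```

Because the page rounds prices to two decimals, recomputing from a displayed price such as $4.38 can land a point or two away from the percentage shown.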

Performance

112

tokens/sec

Faster than 63% of models

0.64

seconds

Faster than 71% of models

0.64

seconds

Faster than 80% of models

Market Median

86 tok/s

31% faster

Median TTFT

1.07s

40% faster

Throughput/Dollar

428

tok/s per $/1M
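
The relative figures in this section can be roughly reproduced from the displayed values. This is a sketch under stated assumptions: the formulas are inferred from the labels and units, not documented by the source, and because the inputs are rounded the results can differ from the page by a point or so.

```python
# Displayed values from the Performance section.
SPEED_TOK_S = 112.0         # output speed, tokens per second
MEDIAN_SPEED_TOK_S = 86.0   # market median speed
TTFT_S = 0.64               # time to first token, seconds
MEDIAN_TTFT_S = 1.07        # market median TTFT
BLENDED_PRICE = 0.26        # USD per 1M tokens (rounded, as displayed)


def percent_faster_speed(speed, median):
    """How much higher the throughput is than the median, in percent."""
    return (speed / median - 1) * 100


def percent_faster_ttft(ttft, median):
    """How much lower the time to first token is than the median, in percent."""
    return (1 - ttft / median) * 100


def throughput_per_dollar(speed, price_per_million):
    """Tokens per second per USD of blended price (tok/s per $/1M)."""
    return speed / price_per_million


print(f"Speed vs. median: {percent_faster_speed(SPEED_TOK_S, MEDIAN_SPEED_TOK_S):.1f}% faster")
print(f"TTFT vs. median: {percent_faster_ttft(TTFT_S, MEDIAN_TTFT_S):.1f}% faster")
print(f"Throughput/Dollar: {throughput_per_dollar(SPEED_TOK_S, BLENDED_PRICE):.0f}")
```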

Speed Comparison

Magistral Small 1.2

113 tok/s · +1%

OpenAI: GPT-4.1 Nano

113 tok/s · +1%

Qwen: Qwen3 Coder 30B A3B Instruct

113 tok/s · +1%

Context Window

1.0M

tokens

Larger than 91% of models

Max Output

16K

tokens

2% of context

Benchmarks

MMLU-Pro

80.9%

GPQA Diamond

67.1%

HLE

4.8%

LiveCodeBench

39.7%

SciCode

33.1%

TerminalBench Hard

6.8%

MATH-500

88.9%

AIME

39.0%

AIME 2025

19.3%

IFBench

43.0%

Long Context Recall

46.0%

Tau2

17.8%



Quick Compare

Similar Models

Llama 3.3 Nemotron Super 49B v1 (Reasoning)

NVIDIA

Q: 18.5 · N/A/1M

Coding: -6.2

Gemini 2.0 Flash (Feb '25)

Google

Q: 18.5 · $0.26/1M

Magistral Small 1.2

Mistral

Q: 18.2 · $0.75/1M

Pricier: 186%

Sarvam 105B (high)

Sarvam

Q: 18.2 · N/A/1M

Coding: -5.8

Qwen3 4B 2507 (Reasoning)

Alibaba

Q: 18.2 · N/A/1M

Coding: -6.1

GPT-4o (Aug '24)

OpenAI

Q: 18.6 · $4.38/1M · 128K ctx

Pricier: 1567% · Context Window: 8x smaller

