Meta: Llama 4 Maverick

Meta · Released 2025-04-05

Open Source · 1.0M ctx · Multimodal

About

Llama 4 Maverick 17B Instruct (128E) is a high-capacity multimodal language model from Meta, built on a mixture-of-experts (MoE) architecture with 128 experts and 17 billion active parameters per forward...

Quality Index: 14.3 (258th of 537, Top 49%)

Coding Index: 16.3 (264th of 447, Top 59%)

Math Index: 19.3 (211th of 269, Top 78%)

Price/1M: $0.26 (240th cheapest, 52% below median, Top 36%)

Speed: 100 tok/s (Top 47%)

TTFT: 0.67s

Context Window: 1.0M (17th largest, Top 10%)

Market Position

Chart: Meta: Llama 4 Maverick vs. Market Average

Pricing

Input: $0.15 per 1M tokens

Output: $0.60 per 1M tokens

Blended: $0.26 per 1M tokens

Cheaper than 64% of models. Median price is $0.54/1M tokens.
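
The blended price is consistent with a 3:1 input-to-output weighting of the listed input and output prices, although the page does not state which ratio it uses. A minimal Python sketch under that assumption (the function name `blended_price` and the default weights are illustrative, not taken from the source):

```python
def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted average of input and output prices, in USD per 1M tokens.

    The 3:1 default weighting is an assumption; the page does not document it.
    """
    total_weight = input_weight + output_weight
    return (input_price * input_weight + output_price * output_weight) / total_weight


# Listed prices: $0.15 input, $0.60 output (per 1M tokens).
blended = blended_price(0.15, 0.60)
print(f"${blended:.4f} per 1M tokens")
```

With these inputs the result is about $0.2625, which rounds to the $0.26 shown. A workload that generates more output than this ratio assumes will cost more per token than the blended figure suggests.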

Cost Calculator

Tokens per day: 1M (range: 100K to 100M)

Daily: $0.26

Monthly: $7.87
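
The calculator's figures can be approximated by scaling the blended price by daily token volume. The sketch below assumes a 30-day month and an unrounded blended price of $0.2625 (a 3:1 mix of the listed $0.15 input and $0.60 output prices); neither assumption is stated on the page, and `estimate_cost` is a hypothetical helper name.

```python
def estimate_cost(tokens_per_day: float, price_per_million: float,
                  days_per_month: int = 30) -> tuple[float, float]:
    """Return (daily, monthly) cost in USD for a given daily token volume."""
    daily = tokens_per_day / 1_000_000 * price_per_million
    return daily, daily * days_per_month


# Assumption: unrounded blended price, 3:1 input:output mix of $0.15 and $0.60.
BLENDED_PRICE = 0.2625

daily, monthly = estimate_cost(1_000_000, BLENDED_PRICE)
print(f"Daily: ${daily:.2f}, Monthly: ${monthly:.3f}")
```

At 1M tokens per day this gives roughly $0.26 daily and about $7.875 monthly, close to the $7.87 displayed. The last-cent difference depends on how the page rounds.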

vs. Similar Models

Qwen: Qwen3 VL 235B A22B Instruct · Q: 0.0 · $0.37 (+41%)

GPT-5 mini (minimal) · Q: 0.0 · $0.69 (+162%)

gpt-oss-20B (low) · Q: 0.0 · $0.10 (-64%)

Nova 2.0 Pro Preview (Non-reasoning) · Q: +0.1 · $3.44 (+1210%)

Performance

100 tokens/sec (faster than 53% of models)

0.67 seconds (faster than 73% of models)

0.67 seconds (faster than 81% of models)

Market Median: 94 tok/s (this model is 6% faster)

Median TTFT: 1.11s (this model is 40% faster)
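
The two "faster" percentages appear to use different baselines: throughput is compared as a gain over the median, while TTFT is compared as a reduction from the median. A short Python sketch under that assumption (the page does not give its formulas, and both function names are illustrative):

```python
def pct_faster_throughput(model_tps: float, median_tps: float) -> float:
    """Percent by which the model's tokens/sec exceeds the median."""
    return (model_tps - median_tps) / median_tps * 100


def pct_lower_latency(model_seconds: float, median_seconds: float) -> float:
    """Percent by which the model's latency falls below the median."""
    return (median_seconds - model_seconds) / median_seconds * 100


speed_gain = pct_faster_throughput(100, 94)   # 100 tok/s vs. 94 tok/s median
ttft_gain = pct_lower_latency(0.67, 1.11)     # 0.67s vs. 1.11s median TTFT
print(f"Speed: {speed_gain:.0f}% faster, TTFT: {ttft_gain:.0f}% faster")
# prints "Speed: 6% faster, TTFT: 40% faster"
```

Both results match the displayed figures after rounding, which supports the assumed formulas but does not confirm them.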

Throughput/Dollar: 383 tok/s per $/1M

Speed Comparison

GPT-5 mini (minimal) · 101 tok/s (+1%)

OpenAI: GPT-5.3-Codex · 100 tok/s (-1%)

GPT-5.1 (Non-reasoning) · 101 tok/s (+1%)

Context Window

1.0M tokens (larger than 90% of models)

Max Output: 16K tokens (2% of context)

Benchmarks

MMLU-Pro: 80.9%

GPQA Diamond: 67.1%

HLE: 4.8%

LiveCodeBench: 39.7%

SciCode: 33.1%

TerminalBench Hard: 6.8%

MATH-500: 88.9%

AIME: 39.0%

AIME 2025: 19.3%

IFBench: 43.0%

Long Context Recall: 46.0%

Tau2: 17.8%


Open Source

Quick Compare

Similar Models

Qwen: Qwen3 VL 235B A22B Instruct (Alibaba)

Q: 14.3 · $0.37/1M · 262K ctx

Slower: 50% · Pricier: 41%

gpt-oss-20B (low) (OpenAI)

Q: 14.3 · $0.10/1M

Faster: 163% · Cheaper: 64%

GPT-5 mini (minimal) (OpenAI)

Q: 14.3 · $0.69/1M

Pricier: 162% · Coding: +5.6

Nova 2.0 Pro Preview (Non-reasoning) (Amazon)

Q: 14.4 · $3.44/1M

Faster: 61% · Pricier: 1210%

MiniMax M1 40k (MiniMax)

Q: 14.4 · price N/A

NVIDIA Nemotron 3 Nano 30B A3B (Reasoning) (NVIDIA)

Q: 14.2 · $0.10/1M

Slower: 32% · Cheaper: 63%
