Google: Gemini 2.5 Flash Lite

Google · Released 2025-07-22
1.0M ctx · MoE · Multimodal

About

Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. It offers improved throughput, faster token generation, and better performance...

Pricing

Input: $0.10 per 1M tokens
Output: $0.40 per 1M tokens
Blended: $0.17 per 1M tokens

Cheaper than 72% of models. Median price is $0.54/1M tokens.
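
The page does not say how the blended price is derived. As a minimal sketch, assuming a 3:1 input-to-output token weighting (an assumption, not something stated in the listing), a weighted average of the listed input and output prices gives about $0.175 per 1M tokens, which is consistent with the displayed $0.17:

```python
# Assumption: "Blended" weights input and output tokens 3:1.
# The ratio is hypothetical; the listing does not define the blend.
INPUT_PRICE = 0.10   # USD per 1M input tokens (from the listing)
OUTPUT_PRICE = 0.40  # USD per 1M output tokens (from the listing)


def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Return the weighted average price per 1M tokens."""
    total_weight = input_weight + output_weight
    return (input_price * input_weight + output_price * output_weight) / total_weight


blended = blended_price(INPUT_PRICE, OUTPUT_PRICE)
print(f"Blended: ${blended:.3f} per 1M tokens")
```

Changing the weights shows how sensitive the blended figure is to the workload mix: an output-heavy workload moves it toward $0.40.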

Cost Calculator

Tokens per day: 1M (range 100K to 100M)
Daily: $0.17
Monthly: $5.25
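
The calculator figures are consistent with a simple linear cost model. The sketch below assumes a 30-day month and an unrounded blended price of $0.175 per 1M tokens; both are assumptions, since the page shows only the rounded $0.17 (which would give $5.10 per month rather than $5.25):

```python
# Assumptions (not stated on the page): unrounded blended price and a 30-day month.
BLENDED_PRICE_PER_1M = 0.175  # USD per 1M tokens, hypothetical unrounded value
DAYS_PER_MONTH = 30


def daily_cost(tokens_per_day: int, price_per_1m: float = BLENDED_PRICE_PER_1M) -> float:
    """Cost in USD for one day of usage at a flat per-token price."""
    return tokens_per_day / 1_000_000 * price_per_1m


def monthly_cost(tokens_per_day: int, price_per_1m: float = BLENDED_PRICE_PER_1M,
                 days: int = DAYS_PER_MONTH) -> float:
    """Cost in USD for `days` days of identical daily usage."""
    return daily_cost(tokens_per_day, price_per_1m) * days


daily = daily_cost(1_000_000)
monthly = monthly_cost(1_000_000)
print(f"Daily: ${daily:.3f}  Monthly: ${monthly:.2f}")
```

The same functions cover the calculator's range: `monthly_cost(100_000)` and `monthly_cost(100_000_000)` give the costs at the two ends of the slider.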

vs. Similar Models

Hermes 4 - Llama-3.1 70B (Non-reasoning) · Q: 0.0 · $0.20 · +13%
Mistral Small 3 · Q: 0.0 · $0.10 · -41%
Nova Lite · Q: 0.0 · $0.10 · -40%
OpenAI: GPT-4o-mini · Q: 0.0 · $0.26 · +50%

Performance

227 tokens/sec (faster than 92% of models)
0.32 seconds (faster than 97% of models)
0.32 seconds (faster than 99% of models)

Market Median: 94 tok/s (142% faster)
Median TTFT: 1.10s (71% faster)
Throughput/Dollar: 1296 tok/s per $/1M
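
The relative figures above can be approximated from the displayed values. This is a sketch under stated assumptions: the formulas are the usual ratio definitions rather than anything the page documents, and the throughput-per-dollar divisor is assumed to be an unrounded blended price of $0.175 per 1M tokens. Because the inputs are rounded for display, the results land close to, but not exactly on, the listed 142%, 71%, and 1296:

```python
# Displayed values from the listing (rounded for display).
MODEL_TPS = 227.0      # tokens/sec
MEDIAN_TPS = 94.0      # market median tokens/sec
MODEL_TTFT_S = 0.32    # seconds
MEDIAN_TTFT_S = 1.10   # market median seconds
# Assumption: unrounded blended price; the page shows $0.17.
BLENDED_PRICE_PER_1M = 0.175


def percent_faster_throughput(model_tps: float, median_tps: float) -> float:
    """How much higher the model's throughput is than the median, in percent."""
    return (model_tps / median_tps - 1.0) * 100.0


def percent_lower_latency(model_s: float, median_s: float) -> float:
    """How much shorter the model's latency is than the median, in percent."""
    return (1.0 - model_s / median_s) * 100.0


def throughput_per_dollar(tps: float, price_per_1m: float) -> float:
    """Tokens/sec divided by price per 1M tokens."""
    return tps / price_per_1m


speedup_pct = percent_faster_throughput(MODEL_TPS, MEDIAN_TPS)
ttft_pct = percent_lower_latency(MODEL_TTFT_S, MEDIAN_TTFT_S)
tps_per_dollar = throughput_per_dollar(MODEL_TPS, BLENDED_PRICE_PER_1M)
print(f"{speedup_pct:.1f}% faster, {ttft_pct:.1f}% lower TTFT, {tps_per_dollar:.0f} tok/s per $/1M")
```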

Speed Comparison

Google: Gemini 2.5 Flash · 226 tok/s · -0%
Qwen3 0.6B (Reasoning) · 224 tok/s · -1%
Gemini 2.5 Flash (Reasoning) · 224 tok/s · -1%

Context Window: 1.0M tokens (larger than 90% of models)
Max Output: 66K tokens (6% of context)
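
The output share is the max output divided by the context window. A small sketch using the rounded figures as displayed (the exact token counts are not given on the page, so the constants below are assumptions) yields about 6.6%, which suggests the listed 6% is truncated or computed from unrounded counts:

```python
# Assumption: the displayed "1.0M" and "66K" are taken at face value.
CONTEXT_WINDOW_TOKENS = 1_000_000
MAX_OUTPUT_TOKENS = 66_000


def output_share_percent(max_output: int, context_window: int) -> float:
    """Max output length as a percentage of the context window."""
    return max_output / context_window * 100.0


share = output_share_percent(MAX_OUTPUT_TOKENS, CONTEXT_WINDOW_TOKENS)
print(f"Max output is {share:.1f}% of the context window")
```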

Benchmarks

MMLU-Pro: 72.4%
GPQA Diamond: 47.4%
HLE: 3.7%
LiveCodeBench: 40.0%
SciCode: 17.7%
TerminalBench Hard: 2.3%
MATH-500: 92.6%
AIME: 50.0%
AIME 2025: 35.3%
IFBench: 31.5%
Long Context Recall: 31.3%
Tau2: 19.0%
