Google: Gemini 2.5 Flash-Lite

Google · Released 2025-07-22
1.0M ctx · MoE · Multimodal

About

Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. It offers improved throughput, faster token generation, and better performance...

Pricing

Input: $0.10 per 1M tokens
Output: $0.40 per 1M tokens
Blended: $0.17 per 1M tokens

Cheaper than 72% of models. Median price is $0.54/1M tokens.
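
The page does not say how the blended price is weighted. One reading that fits the listed numbers is a 3:1 weighting of input to output tokens. That ratio is an assumption made here for illustration, not something the listing states. A minimal Python sketch under that assumption:

```python
# Prices are taken from the listing; the 3:1 input:output weighting is an
# ASSUMPTION, since the page does not state how "blended" is computed.
INPUT_PRICE = 0.10   # USD per 1M input tokens
OUTPUT_PRICE = 0.40  # USD per 1M output tokens


def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0,
                  output_weight: float = 1.0) -> float:
    """Return the weighted-average price per 1M tokens."""
    total_weight = input_weight + output_weight
    return (input_price * input_weight
            + output_price * output_weight) / total_weight


blended = blended_price(INPUT_PRICE, OUTPUT_PRICE)
print(f"Blended: ${blended:.3f} per 1M tokens")
```

Under this weighting the result is $0.175, which would appear as $0.17 if the page truncates to two decimals. A workload with a different input-to-output mix can pass its own weights instead.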

Cost Calculator

Tokens per day: 1M (range 100K to 100M)

Daily: $0.17
Monthly: $5.25
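
The calculator figures can be approximated with simple arithmetic. The sketch below assumes an unrounded blended price of $0.175 per 1M tokens and a 30-day month. Both are assumptions chosen because they reproduce the $5.25 monthly figure; the page does not state either. The function name `estimate_cost` is hypothetical.

```python
def estimate_cost(tokens_per_day: int, price_per_million: float,
                  days_per_month: int = 30) -> tuple[float, float]:
    """Return (daily, monthly) cost in USD for a steady daily token volume."""
    daily = tokens_per_day / 1_000_000 * price_per_million
    monthly = daily * days_per_month
    return daily, monthly


# ASSUMPTIONS: 0.175 USD per 1M tokens (unrounded blended price), 30-day month.
daily, monthly = estimate_cost(1_000_000, 0.175)
print(f"Daily: ${daily:.3f}  Monthly: ${monthly:.2f}")
```

At the other end of the slider range, `estimate_cost(100_000_000, 0.175)` gives the estimate for 100M tokens per day under the same assumptions.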

vs. Similar Models

Hermes 4 - Llama-3.1 70B (Non-reasoning), $0.20 (+13%), Q: 0.0
Mistral Small 3, $0.10 (-41%), Q: 0.0
Nova Lite, $0.10 (-40%), Q: 0.0
OpenAI: GPT-4o-mini, $0.26 (+50%), Q: 0.0

Performance

211 tokens/sec (faster than 88% of models)
0.31 seconds (faster than 97% of models)
0.31 seconds (faster than 99% of models)

Market Median: 94 tok/s (123% faster)
Median TTFT: 1.11s (73% faster)
Throughput/Dollar: 1203 tok/s per $/1M
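
The relative figures above can be sanity-checked from the displayed values. The formulas below are one plausible reading, not the page's documented method, and the function names are hypothetical. The price of $0.175 per 1M tokens is an assumed unrounded blended price. Because the page shows rounded inputs, results may differ from its percentages by a point or two.

```python
def percent_faster_throughput(model_tps: float, median_tps: float) -> float:
    """How much higher the model's tokens/sec is than the median, in percent."""
    return (model_tps / median_tps - 1.0) * 100.0


def percent_lower_latency(model_seconds: float, median_seconds: float) -> float:
    """How much lower the model's latency is than the median, in percent."""
    return (1.0 - model_seconds / median_seconds) * 100.0


def throughput_per_dollar(tps: float, price_per_million: float) -> float:
    """Tokens/sec divided by price in USD per 1M tokens."""
    return tps / price_per_million


# Displayed values from the listing; 0.175 is an ASSUMED unrounded price.
print(f"{percent_faster_throughput(211, 94):.0f}% faster than median speed")
print(f"{percent_lower_latency(0.31, 1.11):.0f}% lower than median TTFT")
print(f"{throughput_per_dollar(211, 0.175):.0f} tok/s per $/1M")
```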

Speed Comparison

Google: Gemini 3.5 Flash, 210 tok/s (-0%)
Gemini 3.5 Flash (medium), 211 tok/s (+0%)
Arcee AI: Trinity Large Thinking, 211 tok/s (+0%)

Context Window

Context window: 1.0M tokens (larger than 90% of models)
Max output: 66K tokens (6% of context)

Benchmarks

MMLU-Pro: 72.4%
GPQA Diamond: 47.4%
HLE: 3.7%
LiveCodeBench: 40.0%
SciCode: 17.7%
TerminalBench Hard: 2.3%
MATH-500: 92.6%
AIME: 50.0%
AIME 2025: 35.3%
IFBench: 31.5%
Long Context Recall: 31.3%
Tau2: 19.0%
