
Google: Gemini 2.5 Flash Lite Preview 09-2025

Google·Released 2025-09-25
1.0M ctx · MoE · Multimodal

About

Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. It offers improved throughput, faster token generation, and better performance...

Pricing

Input: $0.10 per 1M tokens
Output: $0.40 per 1M tokens
Blended: $0.17 per 1M tokens

Cheaper than 72% of models. Median price is $0.54/1M tokens.
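
The blended price can be read as a weighted average of the input and output prices. The sketch below assumes a 3:1 input-to-output token mix; the page does not state its weighting, so that ratio is an assumption. It yields $0.175 per 1M tokens, which agrees with the displayed $0.17 only if the page truncates rather than rounds. The function name `blended_price` is hypothetical.

```python
def blended_price(input_price: float, output_price: float,
                  input_parts: float = 3.0, output_parts: float = 1.0) -> float:
    """Weighted average price per 1M tokens for a given input:output token mix.

    The default 3:1 mix is an assumption, not something the page states.
    """
    total_parts = input_parts + output_parts
    if total_parts <= 0:
        raise ValueError("token mix must have a positive total weight")
    return (input_price * input_parts + output_price * output_parts) / total_parts


INPUT_PRICE = 0.10   # USD per 1M input tokens (from the page)
OUTPUT_PRICE = 0.40  # USD per 1M output tokens (from the page)

blended = blended_price(INPUT_PRICE, OUTPUT_PRICE)
print(f"${blended:.3f} per 1M tokens")  # prints $0.175 per 1M tokens
```

A workload with a different mix, for example equal input and output volume, can be priced by passing `input_parts=1, output_parts=1`.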

Cost Calculator

Tokens per day: 1M (range 100K to 100M)

Daily: $0.17
Monthly: $5.25
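
The calculator figures can be approximated with a short projection. This is a sketch under two assumptions that the page does not state: an unrounded blended price of $0.175 per 1M tokens and a 30-day month. Together they reproduce the $5.25 monthly figure. The function name `projected_cost` is hypothetical.

```python
def projected_cost(tokens_per_day: float, price_per_million: float,
                   days_per_month: int = 30) -> tuple[float, float]:
    """Return (daily, monthly) cost in USD for a steady daily token volume."""
    daily = tokens_per_day / 1_000_000 * price_per_million
    return daily, daily * days_per_month


# Assumed unrounded blended price; the page displays it as $0.17.
BLENDED_PRICE = 0.175

daily, monthly = projected_cost(1_000_000, BLENDED_PRICE)
print(f"Daily: ${daily:.3f}, Monthly: ${monthly:.2f}")
```

Because cost scales linearly with volume, the top of the slider range (100M tokens per day) would come to 100 times these figures under the same assumptions.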

vs. Similar Models

Gemma 4 12B (Non-reasoning) · Q: +0.1 · $0.15 · -14%
Qwen3 VL 30B A3B (Reasoning) · Q: +0.2 · $0.34 · +93%
QwQ 32B · Q: +0.3 · $0.74 · +326%
Qwen3 235B A22B (Reasoning) · Q: +0.3 · $2.63 · +1400%

Performance

353 tokens/sec (faster than 98% of models)
0.43 seconds (faster than 93% of models)
0.43 seconds (faster than 95% of models)

Market Median: 95 tok/s (274% faster)
Median TTFT: 1.11s (61% faster)
Throughput/Dollar: 2020 tok/s per $/1M
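
Two of the derived performance figures can be checked from the raw numbers. The sketch below assumes that "61% faster" means the relative reduction in time to first token against the market median, and that Throughput/Dollar divides tokens per second by an unrounded blended price of $0.175 per 1M tokens. Neither definition is stated on the page, and both function names are hypothetical.

```python
def percent_lower_latency(model_seconds: float, median_seconds: float) -> float:
    """Relative latency reduction versus the median, in percent."""
    return (1 - model_seconds / median_seconds) * 100


def throughput_per_dollar(tokens_per_sec: float, price_per_million: float) -> float:
    """Tokens per second for each dollar of price per 1M tokens."""
    return tokens_per_sec / price_per_million


ttft_gain = percent_lower_latency(0.43, 1.11)  # page values: 0.43 s vs. 1.11 s median
value = throughput_per_dollar(353, 0.175)      # 0.175 is an assumed unrounded price

print(f"TTFT: {ttft_gain:.0f}% lower than median")
print(f"Throughput/Dollar: {value:.0f} tok/s per $/1M")
```

Under these assumptions the latency figure comes out at about 61% and the throughput-per-dollar figure at about 2017, which the page appears to display rounded to 2020.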

Speed Comparison

Gemini 2.5 Flash-Lite Preview (Sep '25) (Reasoning) · 347 tok/s · -2%
gpt-oss-120b (low) · 340 tok/s · -4%
LFM2 2.6B · 332 tok/s · -6%

Context Window

1.0M tokens (larger than 90% of models)

Max Output: 66K tokens (6% of context)

Benchmarks

MMLU-Pro: 79.6%
GPQA Diamond: 65.1%
HLE: 4.6%
LiveCodeBench: 64.1%
SciCode: 28.5%
TerminalBench Hard: 7.6%
MATH-500: Not evaluated
AIME: Not evaluated
AIME 2025: 46.7%
IFBench: 41.8%
Long Context Recall: 48.0%
Tau2: 30.4%
