Z.ai: GLM 4.5 Air

Z AI · Released 2025-07-25
Open Source · 131K ctx

About

GLM-4.5-Air is the lightweight variant of our latest flagship model family, also purpose-built for agent-centric applications. Like GLM-4.5, it adopts the Mixture-of-Experts (MoE) architecture but with a more compact parameter...

Pricing

Input: $0.13 per 1M tokens
Output: $0.85 per 1M tokens
Blended: $0.31 per 1M tokens

Cheaper than 61% of models. Median price is $0.54/1M tokens.
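
The page does not state how the blended price is derived. One weighting that reproduces $0.31 from the listed input and output prices is a 3:1 input-to-output token mix; that ratio is an assumption, not something the listing confirms. A minimal sketch under that assumption (the function name `blended_price` is hypothetical):

```python
def blended_price(input_price, output_price, input_parts=3, output_parts=1):
    """Weighted average price per 1M tokens.

    The 3:1 input:output default is an assumed weighting that happens to
    match the listed blended figure; the page does not document it.
    """
    total_parts = input_parts + output_parts
    return (input_price * input_parts + output_price * output_parts) / total_parts


# Listed prices: $0.13 input, $0.85 output (per 1M tokens).
print(f"${blended_price(0.13, 0.85):.2f} per 1M tokens")
```

A workload with a different input/output mix will see a different effective price, so passing your own ratio is more useful than relying on the listed blend.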

Cost Calculator

Tokens per day: 1M (range 100K to 100M)

Daily: $0.31
Monthly: $9.30
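
The calculator figures are consistent with multiplying daily token volume by the blended price and treating a month as 30 days ($0.31 × 30 = $9.30). The 30-day month is inferred from those two numbers rather than stated on the page. A rough sketch, with the hypothetical helper `estimate_cost`:

```python
def estimate_cost(tokens_per_day, price_per_million=0.31, days_per_month=30):
    """Return (daily, monthly) cost in dollars.

    price_per_million defaults to the listed blended price; days_per_month=30
    is an assumption inferred from the page's daily and monthly figures.
    """
    daily = tokens_per_day / 1_000_000 * price_per_million
    monthly = daily * days_per_month
    return daily, monthly


daily, monthly = estimate_cost(1_000_000)
print(f"Daily: ${daily:.2f}, Monthly: ${monthly:.2f}")
```

Because cost scales linearly with volume here, the 100M tokens/day end of the slider would simply be 100 times the 1M figure.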

vs. Similar Models

Grok 4 Fast (Non-reasoning) · Q: 0.0 · $0.28 (-11%)
GPT-5.4 mini (Non-Reasoning) · Q: +0.1 · $1.69 (+445%)
Nova 2.0 Omni (low) · Q: +0.1 · $0.85 (+174%)
OpenAI: GPT-4.1 Mini · Q: -0.2 · $0.70 (+126%)

Performance

75 tokens/sec (faster than 37% of models)
1.49 seconds time to first token (faster than 34% of models)
28.06 seconds (faster than 17% of models)

Market median: 94 tok/s (this model is 20% slower)
Median TTFT: 1.10s (this model is 35% slower)
Throughput/Dollar: 243 tok/s per $/1M
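
The "slower than median" percentages can be reproduced from the listed figures. The sketch below assumes throughput is compared as a shortfall relative to the median and latency as an excess relative to the median; the function names are hypothetical. Throughput per dollar is assumed to be tokens/sec divided by the blended price, which gives roughly 242 from the rounded inputs shown here versus the 243 listed, presumably because the page uses unrounded values.

```python
def percent_slower_throughput(value, median):
    """How far below the median a throughput figure is, in percent."""
    return (median - value) / median * 100


def percent_slower_latency(value, median):
    """How far above the median a latency figure is, in percent."""
    return (value - median) / median * 100


def throughput_per_dollar(tokens_per_sec, price_per_million):
    """Assumed definition: tok/s divided by price in $ per 1M tokens."""
    return tokens_per_sec / price_per_million


print(round(percent_slower_throughput(75, 94)))   # speed vs. 94 tok/s median
print(round(percent_slower_latency(1.49, 1.10)))  # TTFT vs. 1.10 s median
print(round(throughput_per_dollar(75, 0.31)))     # rounded inputs only
```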

Speed Comparison

NVIDIA Nemotron Nano 9B V2 (Reasoning): 75 tok/s (+0%)
MiniMax: MiniMax M3: 75 tok/s (+0%)
DeepSeek: DeepSeek V4 Pro: 75 tok/s (+0%)

Context Window

131K tokens (larger than 27% of models)

Max Output

98K tokens (75% of context)
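
Since the output cap is smaller than the context window, a request's usable completion length depends on how much of the window the prompt already occupies. The sketch below assumes prompt and completion tokens share the same window and treats "131K" and "98K" as 131,000 and 98,000; the exact limits may differ (for example 131,072), and `max_completion_tokens` is a hypothetical helper, not part of any provider API.

```python
CONTEXT_WINDOW = 131_000  # approximation of the listed "131K"
MAX_OUTPUT = 98_000       # approximation of the listed "98K"


def max_completion_tokens(prompt_tokens):
    """Largest completion that fits, assuming prompt and output share the window."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = CONTEXT_WINDOW - prompt_tokens
    # Clamp to the output cap and never return a negative budget.
    return max(0, min(MAX_OUTPUT, remaining))


for prompt in (10_000, 100_000, 200_000):
    print(prompt, max_completion_tokens(prompt))
```

Under these assumptions, prompts up to about 33K tokens leave the full output cap available; beyond that, the remaining window becomes the binding limit.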

Benchmarks

MMLU-Pro: 81.5%
GPQA Diamond: 73.3%
HLE: 6.8%
LiveCodeBench: 68.4%
SciCode: 30.6%
TerminalBench Hard: 20.5%
MATH-500: 96.5%
AIME: 67.3%
AIME 2025: 80.7%
IFBench: 37.6%
Long Context Recall: 43.7%
Tau2: 46.5%