
Z.ai: GLM 4.6

Z AI · Released 2025-09-30
Open Source · 203K context

About

Compared with GLM-4.5, this generation brings several key improvements: Longer context window: The context window has been expanded from 128K to 200K tokens, enabling the model to handle more complex...

Pricing

Input: $0.43 per 1M tokens
Output: $1.74 per 1M tokens
Blended: $0.76 per 1M tokens
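
The page does not say how the blended price is derived. As a rough sketch, a 3:1 input-to-output token weighting (an assumption, not something the listing confirms) reproduces the displayed $0.76 from the input and output prices. The function name and the weights below are illustrative only.

```python
def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted average price per 1M tokens.

    The 3:1 default weighting is an assumption that happens to match
    the figure shown on the page; the page itself does not state it.
    """
    total_weight = input_weight + output_weight
    return (input_price * input_weight + output_price * output_weight) / total_weight


# Prices in USD per 1M tokens, as listed for GLM 4.6.
blended = blended_price(0.43, 1.74)
print(f"${blended:.2f} per 1M tokens")
```

A workload with a different input/output mix would see a different effective rate, so it may be worth passing your own weights rather than relying on the default.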

Cheaper than 44% of models. Median price is $0.54/1M tokens.

Cost Calculator

Tokens per day: 1M (range: 100K to 100M)

Daily: $0.76

Monthly: $22.73
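
The calculator figures can be approximated as follows. This is a sketch under two assumptions that the page does not state: the blended rate is used before rounding ($0.7575 per 1M tokens, from a 3:1 input-to-output weighting), and a month is 30 days. Multiplying the rounded $0.76 by 30 would give $22.80 instead, which suggests the unrounded rate is what the page uses. `Decimal` is used so that half-cent values round predictably.

```python
from decimal import Decimal, ROUND_HALF_UP

# Assumed unrounded blended rate in USD per 1M tokens: (3 * 0.43 + 1.74) / 4.
BLENDED_PER_MILLION = (3 * Decimal("0.43") + Decimal("1.74")) / 4
DAYS_PER_MONTH = 30  # assumption; the page does not state the month length


def to_cents(amount: Decimal) -> Decimal:
    """Round a dollar amount to two decimal places, halves rounding up."""
    return amount.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def estimate_cost(tokens_per_day: int) -> tuple[Decimal, Decimal]:
    """Return (daily, monthly) cost in USD for a given daily token volume."""
    daily = Decimal(tokens_per_day) / Decimal(1_000_000) * BLENDED_PER_MILLION
    monthly = daily * DAYS_PER_MONTH
    return to_cents(daily), to_cents(monthly)


daily, monthly = estimate_cost(1_000_000)
print(f"Daily: ${daily}  Monthly: ${monthly}")
```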

vs. Similar Models

Z.ai: GLM 4.7 Flash, Q: -0.1, $0.14 (-81%)
OpenAI: o1, Q: +0.4, $26.25 (+3365%)
Qwen3.5 35B A3B (Non-reasoning), Q: +0.4, $0.69 (-9%)
Claude 3.7 Sonnet (Non-reasoning), Q: +0.5, $6.00 (+692%)
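
The percentages next to each price appear to be the relative difference from GLM 4.6's blended price. The sketch below assumes the baseline is the unrounded $0.7575 per 1M tokens (a 3:1 input-to-output weighting, which the page does not confirm). Because the other models' prices are shown rounded to cents, the results should only be expected to agree with the page to within about one percentage point.

```python
# Assumed baseline: GLM 4.6 blended price before rounding, USD per 1M tokens.
BASELINE = (3 * 0.43 + 1.74) / 4

# Prices exactly as displayed on the page (already rounded to cents).
listed_prices = {
    "Z.ai: GLM 4.7 Flash": 0.14,
    "OpenAI: o1": 26.25,
    "Qwen3.5 35B A3B (Non-reasoning)": 0.69,
    "Claude 3.7 Sonnet (Non-reasoning)": 6.00,
}


def percent_difference(price: float, baseline: float = BASELINE) -> float:
    """Relative price difference versus the baseline, in percent."""
    return (price - baseline) / baseline * 100


diffs = {name: percent_difference(price) for name, price in listed_prices.items()}
for name, pct in diffs.items():
    print(f"{name}: {pct:+.1f}%")
```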

Performance

44 tokens/sec (faster than 13% of models)

1.80 seconds (faster than 26% of models)

1.80 seconds (faster than 52% of models)

Market Median: 94 tok/s (this model is 53% slower)

Median TTFT: 1.10s (this model is 63% slower)

Throughput/Dollar: 58 tok/s per $/1M
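
The throughput-per-dollar figure is consistent with dividing output speed by the blended price. A minimal sketch, assuming that is the definition the page uses (the function name is illustrative):

```python
def throughput_per_dollar(tokens_per_sec: float, price_per_million: float) -> float:
    """Output speed divided by price, in tok/s per (USD per 1M tokens)."""
    if price_per_million <= 0:
        raise ValueError("price_per_million must be positive")
    return tokens_per_sec / price_per_million


# 44 tok/s and the displayed blended price of $0.76 per 1M tokens.
value = throughput_per_dollar(44, 0.76)
print(f"{value:.0f} tok/s per $/1M")
```

A higher value means more generation speed for each dollar of per-million-token price, so the metric favors models that are both fast and cheap.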

Speed Comparison

Qwen: Qwen3 Max Thinking, 44 tok/s (-0%)
Gemma 4 26B A4B (Non-reasoning), 44 tok/s (-0%)
Claude Opus 4.7 (Non-reasoning, High Effort), 44 tok/s (-1%)

Context Window

203K tokens (larger than 55% of models)

Max Output

131K tokens (65% of context)

Benchmarks

MMLU-Pro: 78.4%
GPQA Diamond: 63.2%
HLE: 5.2%
LiveCodeBench: 56.1%
SciCode: 33.1%
TerminalBench Hard: 28.8%
MATH-500: Not evaluated
AIME: Not evaluated
AIME 2025: 44.3%
IFBench: 36.7%
Long Context Recall: 26.3%
Tau2: 76.9%
