
Z.ai: GLM 4.7 Flash

Z AI · Released 2026-01-19
Open Source · 203K ctx · MIT

About

As a 30B-class SOTA model, GLM-4.7-Flash offers a new option that balances performance and efficiency. It is further optimized for agentic coding use cases, strengthening coding capabilities, long-horizon task planning,...

Pricing

Input: $0.06 per 1M tokens
Output: $0.40 per 1M tokens
Blended: $0.14 per 1M tokens

Cheaper than 77% of models. Median price is $0.54/1M tokens.

Cost Calculator

Tokens per day: 1M (range: 100K to 100M)
Daily: $0.14
Monthly: $4.35
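The page does not say how the blended price or the calculator totals are derived. A minimal sketch, assuming a 3:1 input-to-output token weighting and a 30-day month (both are assumptions, not stated in the listing), gives about $0.145 per 1M tokens and about $4.35 per month at 1M tokens per day, which matches the figures shown. The function names are hypothetical.

```python
# Prices taken from the listing, in USD per 1M tokens.
INPUT_PRICE = 0.06
OUTPUT_PRICE = 0.40


def blended_price(input_price, output_price, input_weight=3, output_weight=1):
    """Weighted average price per 1M tokens.

    The 3:1 input:output weighting is an assumption; the page does not
    state which ratio it uses.
    """
    total_weight = input_weight + output_weight
    return (input_price * input_weight + output_price * output_weight) / total_weight


def usage_cost(tokens_per_day, price_per_million, days=1):
    """Cost in USD for a steady daily token volume over `days` days."""
    return tokens_per_day / 1_000_000 * price_per_million * days


blended = blended_price(INPUT_PRICE, OUTPUT_PRICE)
daily = usage_cost(1_000_000, blended)
monthly = usage_cost(1_000_000, blended, days=30)  # 30-day month is an assumption
print(f"blended ${blended:.3f}/1M, daily ${daily:.3f}, monthly ${monthly:.2f}")
```

Under these assumptions the displayed daily figure of $0.14 would be a shortened form of roughly $0.145, which is why the monthly total is not simply 30 × $0.14.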

vs. Similar Models

Z.ai: GLM 4.6 (Q: +0.1): $0.76 (+422%)
Grok 3 mini Reasoning (high) (Q: -0.4): $0.35 (+141%)
Grok 4.20 0309 (Non-reasoning) (Q: -0.4): $3.00 (+1969%)
OpenAI: o1 (Q: +0.5): $26.25 (+18003%)
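The percentages in this comparison appear to express how much more each model costs than GLM 4.7 Flash. A minimal sketch, assuming the baseline is an unrounded blended price of about $0.145 per 1M tokens (an assumption; the page displays $0.14), reproduces three of the four figures. The GLM 4.6 row comes out near +424% rather than +422%, presumably because its displayed price is itself rounded.

```python
# Assumed unrounded blended price of GLM 4.7 Flash, USD per 1M tokens.
# The page displays $0.14; 0.145 is an assumption that fits the percentages.
BASE_PRICE = 0.145


def percent_more_expensive(other_price, base_price=BASE_PRICE):
    """Relative price difference of another model versus the baseline, in percent."""
    return (other_price / base_price - 1) * 100


# Prices as displayed on the page, USD per 1M tokens.
similar_models = {
    "Z.ai: GLM 4.6": 0.76,
    "Grok 3 mini Reasoning (high)": 0.35,
    "Grok 4.20 0309 (Non-reasoning)": 3.00,
    "OpenAI: o1": 26.25,
}

for name, price in similar_models.items():
    print(f"{name}: ${price:.2f} ({percent_more_expensive(price):+.0f}%)")
```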

Performance

105 tokens/sec (faster than 56% of models)
0.92 seconds (faster than 60% of models)
20.05 seconds (faster than 25% of models)

Market Median: 94 tok/s (11% faster)
Median TTFT: 1.11s (17% faster)
Throughput/Dollar: 721 tok/s per $/1M
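The relative figures above can be approximated from the displayed values. This is a rough sketch, assuming "faster" means higher throughput or lower latency relative to the median, and that Throughput/Dollar divides tokens per second by a blended price of about $0.145 per 1M tokens (an assumption). Because the page shows rounded inputs, the results land close to, but not exactly on, the displayed 11% and 721.

```python
def percent_faster_throughput(value, median):
    """How much higher a throughput is than the median, in percent."""
    return (value / median - 1) * 100


def percent_faster_latency(value, median):
    """How much lower a latency is than the median, in percent."""
    return (1 - value / median) * 100


def throughput_per_dollar(tokens_per_sec, price_per_million):
    """Tokens per second per dollar of blended price (USD per 1M tokens)."""
    return tokens_per_sec / price_per_million


speed_gain = percent_faster_throughput(105, 94)    # displayed speed vs. market median
ttft_gain = percent_faster_latency(0.92, 1.11)     # 0.92 s vs. median TTFT of 1.11 s
value_score = throughput_per_dollar(105, 0.145)    # 0.145 is an assumed blended price

print(f"speed: {speed_gain:.1f}% faster")
print(f"TTFT: {ttft_gain:.1f}% faster")
print(f"throughput per dollar: {value_score:.0f}")
```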

Speed Comparison

Qwen3 4B (Non-reasoning): 104 tok/s (-0%)
Qwen3 4B (Reasoning): 104 tok/s (-0%)
Qwen3 Omni 30B A3B Instruct: 105 tok/s (+1%)

Context Window: 203K tokens (larger than 55% of models)
Max Output: 16K tokens (8% of context)

Benchmarks

MMLU-Pro: Not evaluated
GPQA Diamond: 58.1%
HLE: 7.1%
LiveCodeBench: Not evaluated
SciCode: 33.7%
TerminalBench Hard: 22.0%
MATH-500: Not evaluated
AIME: Not evaluated
AIME 2025: Not evaluated
IFBench: 60.8%
Long Context Recall: 35.0%
Tau2: 98.8%
Downloads: 2.3M
Likes: 1.8K
