
Z.ai: GLM 4.6V

Z AI · Released 2025-12-08
Open Source · 131K ctx · Multimodal

About

GLM-4.6V is a large multimodal model designed for high-fidelity visual understanding and long-context reasoning across images, documents, and mixed media. It supports up to 128K tokens, processes complex page layouts...

Pricing

Input

$0.30

per 1M tokens

Output

$0.90

per 1M tokens

Blended

$0.45

per 1M tokens

Cheaper than 53% of models. Median price is $0.54/1M tokens.
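The page does not say how the blended price is derived. As a minimal sketch, assuming a 3:1 input-to-output token weighting (an assumption, not something the listing states), the figures above reproduce the $0.45 value. The function name `blended_price` and its parameters are illustrative.

```python
def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted average of input and output prices, in USD per 1M tokens.

    The default 3:1 weighting is an assumption; the page does not state it.
    """
    total_weight = input_weight + output_weight
    return (input_price * input_weight + output_price * output_weight) / total_weight


# Listed prices for GLM 4.6V: $0.30 input, $0.90 output per 1M tokens.
glm_blended = blended_price(0.30, 0.90)
print(f"${glm_blended:.2f} per 1M tokens")
```

A workload with a different input/output mix can pass its own weights, for example `blended_price(0.30, 0.90, 1, 1)` for an even split.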

Cost Calculator

Tokens per day: 1M (range: 100K to 100M)

Daily

$0.45

Monthly

$13.50
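The calculator figures can be reproduced with a short sketch. It assumes the blended price of $0.45 per 1M tokens and a 30-day month (an assumption that matches $13.50 = 30 × $0.45). The function names are illustrative.

```python
BLENDED_PRICE_PER_M = 0.45  # USD per 1M tokens, taken from the Pricing section


def daily_cost(tokens_per_day: int,
               price_per_million: float = BLENDED_PRICE_PER_M) -> float:
    """Cost in USD for one day of usage at the given price."""
    return tokens_per_day / 1_000_000 * price_per_million


def monthly_cost(tokens_per_day: int,
                 price_per_million: float = BLENDED_PRICE_PER_M,
                 days_per_month: int = 30) -> float:
    """Cost in USD for a month; the 30-day month is an assumption."""
    return daily_cost(tokens_per_day, price_per_million) * days_per_month


# The two ends of the calculator range plus the default of 1M tokens per day.
for tokens in (100_000, 1_000_000, 100_000_000):
    print(f"{tokens:>11,} tokens/day: "
          f"${daily_cost(tokens):.2f} daily, ${monthly_cost(tokens):.2f} monthly")
```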

vs. Similar Models

Model | Q | Price | vs. GLM 4.6V
DeepSeek R1 Distill Qwen 32B | 0.0 | $0.29 | -36%
Qwen: Qwen3 VL 32B Instruct | +0.1 | $0.18 | -60%
Ministral 3 14B | +0.1 | $0.20 | -56%
Qwen3 235B A22B (Non-reasoning) | -0.1 | $0.79 | +75%

Performance

47

tokens/sec

Faster than 14% of models

1.14

seconds

Faster than 47% of models

1.14

seconds

Faster than 64% of models

Market Median

94 tok/s

50% slower

Median TTFT

1.10s

3% slower

Throughput/Dollar

104

tok/s per $/1M
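The derived figures in this section follow from simple ratios. The sketch below recomputes them from the listed values, assuming that the 1.14 seconds figure compared against the median is the time to first token and that Throughput/Dollar divides output speed by the blended price of $0.45. Both are assumptions, and the function names are illustrative.

```python
def percent_vs_median(value: float, median: float) -> float:
    """Signed percentage difference of value relative to the median."""
    return (value - median) / median * 100


def throughput_per_dollar(tokens_per_sec: float, price_per_million: float) -> float:
    """Output speed divided by price, in tok/s per $/1M tokens."""
    return tokens_per_sec / price_per_million


speed_delta = percent_vs_median(47, 94)      # negative means slower than median
ttft_delta = percent_vs_median(1.14, 1.10)   # positive means a longer wait
tpd = throughput_per_dollar(47, 0.45)

print(f"Speed vs. median: {speed_delta:+.0f}%")
print(f"TTFT vs. median:  {ttft_delta:+.1f}%")
print(f"Throughput/Dollar: {tpd:.0f} tok/s per $/1M")
```

With the rounded inputs shown on the page, the TTFT difference comes out between 3% and 4%, close to the listed 3%. The page may be working from unrounded values.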

Speed Comparison

Grok 4: 47 tok/s (-0%)
Claude 4 Sonnet (Non-reasoning): 47 tok/s (+0%)
Llama 3 Instruct 70B: 47 tok/s (-1%)

Context Window

131K

tokens

Larger than 27% of models

Max Output

33K

tokens

25% of context
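For planning prompts, it can help to see how much of the window remains once output space is reserved. This is a minimal sketch that assumes "131K" means 131,072 tokens, "33K" means 32,768 tokens, and that generated tokens count against the same context window. None of these is confirmed by the page, and `max_prompt_tokens` is a hypothetical helper.

```python
CONTEXT_WINDOW = 131_072  # assumed exact value behind the listed "131K"
MAX_OUTPUT = 32_768       # assumed exact value behind the listed "33K"


def max_prompt_tokens(reserved_output: int,
                      context_window: int = CONTEXT_WINDOW,
                      max_output: int = MAX_OUTPUT) -> int:
    """Tokens left for the prompt after reserving space for the response.

    Assumes prompt and output share one context window.
    """
    if not 0 <= reserved_output <= max_output:
        raise ValueError(f"reserved_output must be between 0 and {max_output}")
    return context_window - reserved_output


print(f"Output share of context: {MAX_OUTPUT / CONTEXT_WINDOW:.0%}")
print(f"Prompt budget with full output reserved: {max_prompt_tokens(MAX_OUTPUT):,}")
```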

Benchmarks

MMLU-Pro: 75.2%
GPQA Diamond: 56.6%
HLE: 3.7%
LiveCodeBench: 41.1%
SciCode: 27.2%
TerminalBench Hard: 3.0%
MATH-500: Not evaluated
AIME: Not evaluated
AIME 2025: 26.3%
IFBench: 27.9%
Long Context Recall: 12.3%
Tau2: 30.7%
