
Z.ai: GLM 4.5V

Z AI · Released 2025-08-11
Open Source · 66K ctx · Multimodal

About

GLM-4.5V is a vision-language foundation model for multimodal agent applications. Built on a Mixture-of-Experts (MoE) architecture with 106B parameters and 12B activated parameters, it achieves state-of-the-art results in video understanding,...

Pricing

Input: $0.60 per 1M tokens

Output: $1.80 per 1M tokens

Blended: $0.90 per 1M tokens
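
The blended figure is consistent with a 3:1 input-to-output token weighting: (3 × $0.60 + 1 × $1.80) / 4 = $0.90. The page does not state how the blend is computed, so the weighting in this sketch is an assumption, and `blended_price` is a hypothetical helper rather than anything the site exposes.

```python
def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted average price per 1M tokens.

    The 3:1 default weighting is an assumption: it reproduces the listed
    blended price, but the page does not document the formula.
    """
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


# Listed prices: $0.60 input and $1.80 output per 1M tokens.
print(f"${blended_price(0.60, 1.80):.2f} per 1M tokens")
```

A workload with a different input/output mix can pass its own weights to get a more representative figure.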

Cheaper than 38% of models. Median price is $0.54/1M tokens.

Cost Calculator

Tokens per day: 1M (range: 100K to 100M)

Daily: $0.90

Monthly: $27.00
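
The calculator figures can be reproduced with a small sketch. It assumes the blended price of $0.90 per 1M tokens and a 30-day month (27.00 / 0.90 = 30); `usage_cost` is a hypothetical helper name.

```python
def usage_cost(tokens_per_day: int, price_per_million: float,
               days_per_month: int = 30) -> tuple[float, float]:
    """Return (daily, monthly) cost in dollars.

    days_per_month=30 is an assumption inferred from the listed totals.
    """
    daily = tokens_per_day / 1_000_000 * price_per_million
    return daily, daily * days_per_month


# 1M tokens per day at the blended price of $0.90 per 1M tokens.
daily, monthly = usage_cost(1_000_000, 0.90)
print(f"Daily: ${daily:.2f}  Monthly: ${monthly:.2f}")
```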

vs. Similar Models

Qwen3 14B (Non-reasoning) · Q: 0.0 · $0.38 · -58%
OpenAI: GPT-4 · Q: 0.0 · $37.50 · +4067%
Google: Gemini 2.5 Flash Lite · Q: -0.1 · $0.17 · -81%
Hermes 4 - Llama-3.1 70B (Non-reasoning) · Q: -0.1 · $0.20 · -78%
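
The percentages in this comparison match the relative difference between each model's price and this model's blended price of $0.90 per 1M tokens. That baseline is an inference from the numbers, not something the page states, and `price_delta_percent` is a hypothetical helper.

```python
def price_delta_percent(other_price: float, base_price: float = 0.90) -> float:
    """Percent difference of other_price relative to base_price.

    base_price defaults to the blended $0.90 per 1M tokens (assumed baseline).
    """
    return (other_price - base_price) / base_price * 100


# Prices per 1M tokens as listed in the comparison.
listed = {
    "Qwen3 14B (Non-reasoning)": 0.38,
    "OpenAI: GPT-4": 37.50,
    "Google: Gemini 2.5 Flash Lite": 0.17,
    "Hermes 4 - Llama-3.1 70B (Non-reasoning)": 0.20,
}
for name, price in listed.items():
    print(f"{name}: ${price:.2f} ({price_delta_percent(price):+.0f}%)")
```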

Performance

50 tokens/sec (faster than 17% of models)

12.35 seconds (faster than 9% of models)

12.35 seconds (faster than 37% of models)

Market Median: 94 tok/s (this model is 46% slower)

Median TTFT: 1.10s (this model is 1017% slower)

Throughput/Dollar: 56 tok/s per $/1M
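
The throughput-per-dollar value is consistent with dividing the output speed by the blended price: 50 / 0.90 ≈ 56. The sketch below assumes that definition, which the page does not spell out; `throughput_per_dollar` is a hypothetical name.

```python
def throughput_per_dollar(tokens_per_sec: float, price_per_million: float) -> float:
    """Tokens per second per dollar of price per 1M tokens.

    Using the blended price as the divisor is an assumption that
    reproduces the listed value.
    """
    return tokens_per_sec / price_per_million


# 50 tok/s at the blended price of $0.90 per 1M tokens.
print(f"{throughput_per_dollar(50, 0.90):.0f} tok/s per $/1M")
```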

Speed Comparison

GLM-4.6V (Reasoning): 50 tok/s (+0%)
Gemma 3n E4B Instruct: 50 tok/s (-0%)
GLM-4.6 (Reasoning): 50 tok/s (-0%)

Context Window

66K tokens (larger than 13% of models)

Max Output

16K tokens (25% of context)

Benchmarks

MMLU-Pro: 75.1%
GPQA Diamond: 57.3%
HLE: 3.6%
LiveCodeBench: 35.2%
SciCode: 18.8%
TerminalBench Hard: 6.8%
MATH-500: Not evaluated
AIME: Not evaluated
AIME 2025: 15.3%
IFBench: 28.6%
Long Context Recall: 0.0%
Tau2: 19.6%
