
Qwen: Qwen3 VL 32B Instruct

Alibaba · Released 2025-10-23
Open Source · 32B · 262K ctx · Apache 2.0 · Multimodal

About

Qwen3-VL-32B-Instruct is a large-scale multimodal vision-language model designed for high-precision understanding and reasoning across text, images, and video. With 32 billion parameters, it combines deep visual perception with advanced text...

Pricing

Input

$0.10

per 1M tokens

Output

$0.42

per 1M tokens

Blended

$0.18

per 1M tokens

Cheaper than 71% of models. Median price is $0.54/1M tokens.
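
The blended figure above can be reproduced as a weighted average of the input and output prices. The sketch below assumes a 3:1 input-to-output token mix (75% input); the page does not state its weighting, so `input_share` is an assumption chosen because it matches the listed $0.18, and `blended_price` is a hypothetical helper name.

```python
def blended_price(input_price: float, output_price: float,
                  input_share: float = 0.75) -> float:
    """Weighted average price per 1M tokens.

    input_share is the assumed fraction of tokens that are input tokens.
    The 0.75 default (3:1 input:output) is an assumption, not a value
    taken from the page.
    """
    if not 0.0 <= input_share <= 1.0:
        raise ValueError("input_share must be between 0 and 1")
    return input_price * input_share + output_price * (1.0 - input_share)


# Listed prices: $0.10 input, $0.42 output, per 1M tokens.
price = blended_price(0.10, 0.42)
print(f"${price:.2f} per 1M tokens")
```

A workload that is heavier on output (for example, long generations from short prompts) would use a lower `input_share` and see a higher effective price.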

Cost Calculator

Tokens per day: 1M (range: 100K to 100M)

Daily

$0.18

Monthly

$5.46
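
The calculator's arithmetic is a straight multiplication of daily volume by the blended rate. The sketch below assumes a 30-day month and uses the rounded $0.18 rate; `projected_cost` and its parameters are hypothetical names. With those inputs the monthly result is $5.40, so the page's $5.46 presumably comes from an unrounded blended rate (around $0.182), which is an inference rather than something the page states.

```python
def projected_cost(tokens_per_day: float, price_per_million: float,
                   days: int = 30) -> tuple[float, float]:
    """Return (daily_cost, period_cost) in dollars.

    days=30 is an assumed month length; the page does not say which
    convention it uses.
    """
    daily = tokens_per_day / 1_000_000 * price_per_million
    return daily, daily * days


daily, monthly = projected_cost(tokens_per_day=1_000_000, price_per_million=0.18)
print(f"Daily: ${daily:.2f}  Monthly: ${monthly:.2f}")
```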

vs. Similar Models

Ministral 3 14B · Q: 0.0 · $0.20 (+10%)
OpenAI: GPT-4o · Q: +0.1 · $4.38 (+2304%)
DeepSeek R1 Distill Qwen 32B · Q: -0.1 · $0.29 (+59%)
Z.ai: GLM 4.6V · Q: -0.1 · $0.45 (+147%)

Performance

72

tokens/sec

Faster than 37% of models

1.10

seconds

Faster than 52% of models

1.10

seconds

Faster than 66% of models

Market Median

94 tok/s

24% slower

Median TTFT

1.11s

1% faster
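
The comparisons against the market median are relative differences. A minimal sketch, assuming the page divides the gap by the median value (`relative_diff_pct` is a hypothetical helper); small mismatches with the displayed percentages would be expected if the page computes them from unrounded measurements.

```python
def relative_diff_pct(value: float, reference: float) -> float:
    """Percent difference of value relative to reference.

    Negative means value is below the reference.
    """
    if reference == 0:
        raise ValueError("reference must be non-zero")
    return (value - reference) / reference * 100.0


# Throughput: 72 tok/s against a 94 tok/s median (lower is slower).
speed_diff = relative_diff_pct(72, 94)

# Time to first token: 1.10 s against a 1.11 s median (lower is faster).
ttft_diff = relative_diff_pct(1.10, 1.11)

print(f"Throughput vs. median: {speed_diff:.1f}%")
print(f"TTFT vs. median: {ttft_diff:.1f}%")
```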

Throughput/Dollar

395

tok/s per $/1M
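
Throughput per dollar, as the unit suggests, is tokens per second divided by the blended price per 1M tokens. The sketch below uses the rounded figures shown on this page (72 tok/s and $0.18), which gives about 400; the listed 395 is consistent with the page using an unrounded price, though that is an assumption. `throughput_per_dollar` is a hypothetical helper name.

```python
def throughput_per_dollar(tokens_per_sec: float, price_per_million: float) -> float:
    """Tokens per second obtained per dollar of blended price (per 1M tokens)."""
    if price_per_million <= 0:
        raise ValueError("price_per_million must be positive")
    return tokens_per_sec / price_per_million


ratio = throughput_per_dollar(72, 0.18)
print(f"{ratio:.0f} tok/s per $/1M")
```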

Speed Comparison

Nova Premier: 72 tok/s (-0%)
OpenAI: GPT-4o-mini: 73 tok/s (+1%)
NVIDIA Nemotron Nano 9B V2 (Reasoning): 71 tok/s (-1%)

Context Window

262K

tokens

Larger than 62% of models

Max Output

33K

tokens

13% of context

Benchmarks

MMLU-Pro: 79.1%
GPQA Diamond: 67.1%
HLE: 6.3%
LiveCodeBench: 51.4%
SciCode: 30.1%
TerminalBench Hard: 8.3%
MATH-500: Not evaluated
AIME: Not evaluated
AIME 2025: 68.3%
IFBench: 39.2%
Long Context Recall: 31.3%
Tau2: 29.2%
apache-2.0 · 32B · GGUF / GPTQ / AWQ
Downloads

5.4M

Likes

207

VRAM (FP16)

24-48 GB

GPU

A6000 / M3 Ultra
