
Qwen: Qwen3.5-122B-A10B

Alibaba · Released 2026-02-25
Open Source · 122B · 262K ctx · Apache 2.0 · Multimodal

About

The Qwen3.5 122B-A10B native vision-language model is built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency. In terms of...

Pricing

Input: $0.26 per 1M tokens

Output: $2.08 per 1M tokens

Blended: $0.72 per 1M tokens

Cheaper than 45% of models. Median price is $0.54/1M tokens.
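The listing does not say how the blended price is derived. A minimal sketch, assuming a 3:1 input-to-output token weighting (an assumption, chosen because it reproduces the listed figure), gives about $0.715, which rounds to the $0.72 shown. The function name `blended_price` is hypothetical.

```python
# Listed prices in USD per 1M tokens (from the Pricing section).
INPUT_PRICE = 0.26
OUTPUT_PRICE = 2.08


def blended_price(input_price, output_price, input_parts=3, output_parts=1):
    """Weighted average price per 1M tokens.

    The default 3:1 input:output weighting is an assumption; the
    listing does not state which ratio it uses.
    """
    total_parts = input_parts + output_parts
    return (input_parts * input_price + output_parts * output_price) / total_parts


blended = blended_price(INPUT_PRICE, OUTPUT_PRICE)
print(f"Blended: ${blended:.3f} per 1M tokens")
```

Workloads with a different mix can pass their own ratio, for example `blended_price(0.26, 2.08, input_parts=1, output_parts=1)` for an even split.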

Cost Calculator

Tokens per day: 1M (range 100K to 100M)

Daily: $0.72

Monthly: $21.45
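The calculator figures can be reproduced with a short sketch. It assumes a 30-day month and a blended price weighted 3:1 input to output; neither is stated on the page, but together they match the listed $21.45 (the unrounded blended price of about $0.715, times 30). The helper name `usage_cost` is hypothetical.

```python
# Listed prices in USD per 1M tokens.
INPUT_PRICE = 0.26
OUTPUT_PRICE = 2.08

# Assumption: blended price weights input and output tokens 3:1.
BLENDED_PRICE = (3 * INPUT_PRICE + OUTPUT_PRICE) / 4


def usage_cost(tokens_per_day, price_per_million, days=1):
    """Cost in USD for a steady daily token volume over `days` days."""
    return tokens_per_day / 1_000_000 * price_per_million * days


daily = usage_cost(1_000_000, BLENDED_PRICE)
monthly = usage_cost(1_000_000, BLENDED_PRICE, days=30)  # assumption: 30-day month
print(f"Daily: ${daily:.3f}  Monthly: ${monthly:.2f}")
```

Cost scales linearly with volume, so the top of the slider range (100M tokens per day) comes to roughly $2,145 per month under the same assumptions.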

vs. Similar Models

GLM-5 (Non-reasoning) · Q: +0.1 · $1.55 (+117%)
OpenAI: o3 Pro · Q: +0.2 · $35.00 (+4795%)
Qwen3.5 397B A17B (Non-reasoning) · Q: -0.3 · $1.35 (+89%)
MoonshotAI: Kimi K2 Thinking · Q: +0.4 · $1.07 (+50%)

Performance

145 tokens/sec · Faster than 71% of models

1.13 seconds · Faster than 49% of models

14.96 seconds · Faster than 32% of models

Market Median: 94 tok/s (53% faster)

Median TTFT: 1.11s (1% slower)

Throughput/Dollar: 202 tok/s per $/1M
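The throughput-per-dollar figure appears to be the measured throughput divided by the blended price per 1M tokens. That reading is an assumption, as is the 3:1 input-to-output blend used below. With the rounded values shown on the page the sketch lands slightly above the listed 202, presumably because the page works from unrounded measurements.

```python
# Values taken from the listing.
THROUGHPUT_TOK_S = 145   # tokens/sec
INPUT_PRICE = 0.26       # USD per 1M input tokens
OUTPUT_PRICE = 2.08      # USD per 1M output tokens

# Assumption: blended price weights input and output tokens 3:1.
blended_price = (3 * INPUT_PRICE + OUTPUT_PRICE) / 4

# Assumption: the metric is throughput divided by blended price.
throughput_per_dollar = THROUGHPUT_TOK_S / blended_price
print(f"{throughput_per_dollar:.1f} tok/s per $/1M")
```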

Speed Comparison

NVIDIA Nemotron Nano 9B V2 (Non-reasoning) · 145 tok/s (+0%)
Grok 4.3 (medium) · 144 tok/s (-0%)
OpenAI: o3 · 145 tok/s (+1%)

Context Window

262K tokens (larger than 62% of models)

Max Output

262K tokens (100% of context)

Benchmarks

MMLU-Pro: Not evaluated
GPQA Diamond: 85.7%
HLE: 23.4%
LiveCodeBench: Not evaluated
SciCode: 42.0%
TerminalBench Hard: 31.1%
MATH-500: Not evaluated
AIME: Not evaluated
AIME 2025: Not evaluated
IFBench: 75.7%
Long Context Recall: 66.7%
Tau2: 93.6%
apache-2.0 · 122B · GGUF / GPTQ / AWQ

Downloads: 779.3K

Likes: 578

VRAM (FP16): Multi-GPU

GPU: 8x A100 / H100
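The page gives no VRAM number, only a multi-GPU recommendation. A rough sketch of why one card is not enough: at FP16 each parameter takes 2 bytes, so the weights alone for 122B parameters need about 244 GB. The 80 GB per-GPU capacity is an assumption (A100 cards also ship with 40 GB), and the estimate ignores KV cache, activations, and runtime overhead, which is part of why a recommendation can sit well above the weights-only minimum.

```python
import math

PARAMS = 122e9  # total parameter count from the listing (122B)


def weight_memory_gb(params, bytes_per_param):
    """Memory for the weights alone, in decimal GB.

    Excludes KV cache, activations, and framework overhead.
    """
    return params * bytes_per_param / 1e9


fp16_gb = weight_memory_gb(PARAMS, 2)    # FP16: 2 bytes per parameter
int4_gb = weight_memory_gb(PARAMS, 0.5)  # 4-bit quantization: 0.5 bytes per parameter

GPU_MEMORY_GB = 80  # assumption: 80 GB A100 / H100 cards
min_gpus_fp16 = math.ceil(fp16_gb / GPU_MEMORY_GB)

print(f"FP16 weights: {fp16_gb:.0f} GB, at least {min_gpus_fp16} x {GPU_MEMORY_GB} GB GPUs")
print(f"4-bit weights: {int4_gb:.0f} GB")
```

The 4-bit line is included because the listing mentions GPTQ and AWQ builds; actual quantized file sizes vary with the scheme and are not given on the page.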
