Qwen: Qwen3.5-9B

Alibaba · Released 2026-03-10

Open Source · 9B · 262K ctx · Apache 2.0 · Multimodal

About

Qwen3.5-9B is a multimodal foundation model from the Qwen3.5 family, designed to deliver strong reasoning, coding, and visual understanding in an efficient 9B-parameter architecture. It uses a unified vision-language design...

Quality Index

25.0

147th of 537

Top 28%

Coding Index

28.7

161st of 447

Top 36%

Price/1M

$0.11

123rd cheapest

79% below median

Top 18%

Speed

68 tok/s

Top 66%

TTFT

0.88s

Context Window

262K

110th largest

Top 38%

Pricing

Input

$0.10

per 1M tokens

Output

$0.15

per 1M tokens

Blended

$0.11

per 1M tokens

Cheaper than 82% of models. Median price is $0.54/1M tokens.

Cost Calculator

Tokens per day

1M

Daily

$0.11

Monthly

$3.37
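
The blended, daily, and monthly figures above can be approximated with a short sketch. It assumes a 3:1 input-to-output token mix for the blended price (the page does not state the ratio; this mix gives $0.1125, consistent with the $0.11 shown) and a 30-day month. The function names are hypothetical.

```python
INPUT_PRICE = 0.10   # USD per 1M input tokens (from the Pricing section)
OUTPUT_PRICE = 0.15  # USD per 1M output tokens (from the Pricing section)

def blended_price(input_price, output_price, input_share=0.75):
    """Weighted average price per 1M tokens.

    input_share=0.75 is an assumed 3:1 input:output mix, not a documented value.
    """
    return input_share * input_price + (1 - input_share) * output_price

def daily_cost(tokens_per_day, price_per_million):
    """Cost in USD for one day of usage at the given per-1M-token price."""
    return tokens_per_day / 1_000_000 * price_per_million

def monthly_cost(tokens_per_day, price_per_million, days=30):
    """Cost in USD over an assumed 30-day month."""
    return daily_cost(tokens_per_day, price_per_million) * days

blended = blended_price(INPUT_PRICE, OUTPUT_PRICE)
print(f"blended: ${blended:.4f} per 1M tokens")
print(f"daily:   ${daily_cost(1_000_000, blended):.4f}")
print(f"monthly: ${monthly_cost(1_000_000, blended):.4f}")
```

At 1M tokens per day this comes to roughly $0.1125 per day and $3.375 per 30 days, in line with the $0.11 and $3.37 shown.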

vs. Similar Models

Google: Gemini 3.1 Flash Lite Preview · Q: 0.0

$0.56 · +400%

Qwen3 Max Thinking (Preview) · Q: 0.0

$2.40 · +2033%

GLM-4.6 (Reasoning) · Q: +0.1

$0.96 · +756%

Gemma 4 31B (Non-reasoning) · Q: -0.2

$0.20 · +82%

Performance

68

tokens/sec

Faster than 34% of models

0.88

seconds

Faster than 61% of models

30.46

seconds

Faster than 16% of models

Market Median

94 tok/s

28% slower

Median TTFT

1.11s

21% faster

Throughput/Dollar

601

tok/s per $/1M
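
The relative figures above follow from simple ratios against the market median. A minimal sketch, assuming percentage differences are taken relative to the median and that throughput per dollar is speed divided by blended price; the page does not document its formulas, the function names are hypothetical, and the 0.1125 blended price assumes a 3:1 mix of the listed input and output prices.

```python
def percent_vs_median(value, median):
    """Signed percent difference of value relative to median."""
    return (value - median) / median * 100

def throughput_per_dollar(tokens_per_sec, price_per_million):
    """Tokens per second obtained per USD of per-1M-token price."""
    return tokens_per_sec / price_per_million

# Inputs are the rounded values displayed on the page.
speed_delta = percent_vs_median(68, 94)      # negative: slower than the median
ttft_delta = percent_vs_median(0.88, 1.11)   # negative: lower latency than the median
tpd = throughput_per_dollar(68, 0.1125)      # 0.1125 is an assumed blended price

print(f"speed vs median: {speed_delta:.0f}%")
print(f"TTFT vs median:  {ttft_delta:.0f}%")
print(f"throughput/dollar: {tpd:.0f}")
```

With the rounded inputs, throughput per dollar comes out near 604 rather than the 601 shown, which presumably uses unrounded speed and price values.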

Speed Comparison

DeepSeek: R1 Distill Llama 70B

67 tok/s · -1%

NVIDIA Nemotron 3 Nano 30B A3B (Reasoning)

68 tok/s · +1%

GPT-5.2 (Non-reasoning)

67 tok/s · -1%

Context Window

262K

tokens

Larger than 62% of models

Max Output

262K

tokens

100% of context

Benchmarks

MMLU-Pro

Not evaluated

GPQA Diamond

80.6%

HLE

13.3%

LiveCodeBench

Not evaluated

SciCode

27.5%

TerminalBench Hard

24.2%

MATH-500

Not evaluated

AIME

Not evaluated

AIME 2025

Not evaluated

IFBench

66.7%

Long Context Recall

59.0%

Tau2

86.8%

Open Source

apache-2.0 · 9B · GGUF / GPTQ / AWQ

Downloads

9.2M

Likes

1.6K

VRAM (FP16)

16-24 GB

GPU

RTX 4090 / M2 Max
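
The FP16 VRAM range can be sanity-checked from the parameter count. A rough sketch, assuming 2 bytes per parameter for FP16 weights and ignoring KV cache, activations, and runtime overhead, so real usage lands above the weights-only number. The function name and the 4-bit figure are illustrative and not taken from the page.

```python
def weight_memory_gb(num_params, bytes_per_param):
    """Memory for model weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9

PARAMS = 9e9  # 9B parameters, per the model listing

fp16_gb = weight_memory_gb(PARAMS, 2.0)  # FP16: 2 bytes per parameter
int4_gb = weight_memory_gb(PARAMS, 0.5)  # 4-bit quantization: about 0.5 bytes per parameter (approximation)

print(f"FP16 weights:  {fp16_gb:.1f} GB")
print(f"4-bit weights: {int4_gb:.1f} GB")
```

About 18 GB for the FP16 weights sits inside the 16-24 GB range listed, with the remainder of the range left for context and overhead.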

Quick Compare

Similar Models

Google: Gemini 3.1 Flash Lite Preview

Google

Q: 25.0 · $0.56/1M · 1.0M ctx

Faster: 386% · Pricier: 400%

Qwen3 Max Thinking (Preview)

Alibaba

Q: 25.0 · $2.40/1M

Slower: 22% · Pricier: 2033%

GLM-4.6 (Reasoning)

Z AI

Q: 25.1 · $0.96/1M

Slower: 19% · Pricier: 756%

Gemma 4 31B (Non-reasoning)

Google

Q: 24.8 · $0.20/1M

Slower: 16% · Pricier: 82%

Grok 4.3 (Non-reasoning)

xAI

Q: 24.8 · $1.56/1M

Faster: 76% · Pricier: 1289%

Inception: Mercury 2

Inception

Q: 25.3 · $0.38/1M · 128K ctx

Faster: 1458% · Pricier: 233%
