Qwen: Qwen3.5-35B-A3B

Alibaba · Released 2026-02-25
Open Source · 35B · 262K ctx · Apache 2.0 · Multimodal

About

The Qwen3.5 Series 35B-A3B is a native vision-language model designed with a hybrid architecture that integrates linear attention mechanisms and a sparse mixture-of-experts model, achieving higher inference efficiency. Its overall...

Pricing

Input: $0.14 per 1M tokens
Output: $1.00 per 1M tokens
Blended: $0.35 per 1M tokens

Cheaper than 58% of models. Median price is $0.54/1M tokens.

Cost Calculator

Tokens per day: 1M (range: 100K to 100M)

Daily: $0.35
Monthly: $10.65
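
The listing does not say how the blended price or the calculator totals are derived. A minimal sketch, assuming a 3:1 input-to-output token blend and a 30-day month (both are assumptions, not stated on the page), gives values close to the figures shown: a blended price of about $0.355 per 1M tokens and a monthly cost of about $10.65 at 1M tokens per day. The function names are hypothetical.

```python
# Prices taken from the listing, in USD per 1M tokens.
INPUT_PRICE = 0.14
OUTPUT_PRICE = 1.00


def blended_price(input_price, output_price, input_parts=3, output_parts=1):
    """Weighted average price per 1M tokens.

    The 3:1 input:output weighting is an assumption; the page does not
    state the ratio it uses.
    """
    total_parts = input_parts + output_parts
    return (input_parts * input_price + output_parts * output_price) / total_parts


def daily_cost(tokens_per_day, price_per_million):
    """Cost in USD for one day of usage at a given per-1M-token price."""
    return tokens_per_day / 1_000_000 * price_per_million


def monthly_cost(tokens_per_day, price_per_million, days=30):
    """Cost in USD for `days` days of usage (30 days is an assumption)."""
    return daily_cost(tokens_per_day, price_per_million) * days


if __name__ == "__main__":
    blended = blended_price(INPUT_PRICE, OUTPUT_PRICE)
    print(f"blended: ${blended:.3f} per 1M tokens")
    print(f"daily:   ${daily_cost(1_000_000, blended):.3f}")
    print(f"monthly: ${monthly_cost(1_000_000, blended):.3f}")
```

Under these assumptions the displayed $0.35 would be the $0.355 blend shown to two decimals; real workloads with a different input/output mix will land elsewhere between $0.14 and $1.00.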

vs. Similar Models

Claude 4.5 Sonnet (Non-reasoning): Q 0.0, $6.00 (+1590%)
Qwen3.5 27B (Non-reasoning): Q 0.0, $0.88 (+146%)
Qwen3.6 27B (Non-reasoning): Q 0.0, $1.35 (+280%)
Gemma 4 31B (Reasoning): Q +0.1, $0.20 (-42%)

Performance

163 tokens/sec (faster than 79% of models)
1.15 seconds (faster than 47% of models)
13.40 seconds (faster than 35% of models)

Market Median: 94 tok/s (73% faster)
Median TTFT: 1.11s (3% slower)
Throughput/Dollar: 460 tok/s per $/1M
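
The relative figures above can be approximately reproduced from the numbers on the page. The sketch below assumes that "faster" and "slower" are simple ratios against the median, and that Throughput/Dollar is tokens per second divided by the blended price (taken here as $0.355 per 1M tokens, which itself assumes a 3:1 input:output blend). None of these definitions are stated in the listing, and the helper names are hypothetical.

```python
def percent_above(value, baseline):
    """How far `value` sits above `baseline`, in percent."""
    return (value / baseline - 1) * 100


def throughput_per_dollar(tokens_per_sec, price_per_million):
    """Tokens per second per USD-per-1M-tokens (assumed definition)."""
    return tokens_per_sec / price_per_million


if __name__ == "__main__":
    # Output speed: 163 tok/s against a 94 tok/s market median.
    print(f"speed vs median: {percent_above(163, 94):.1f}% faster")
    # Latency: 1.15 s against a 1.11 s median (higher is slower).
    print(f"latency vs median: {percent_above(1.15, 1.11):.1f}% slower")
    # Throughput per dollar with the assumed $0.355 blended price.
    print(f"throughput/dollar: {throughput_per_dollar(163, 0.355):.0f}")
```

The results come out near the listed 73%, 3%, and 460; small gaps are presumably due to rounding of the displayed inputs.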

Speed Comparison

GPT-5.4 nano (Non-Reasoning): 163 tok/s (-0%)
GPT-5.4 mini (Non-Reasoning): 162 tok/s (-0%)
OpenAI: GPT-5.4 Nano: 162 tok/s (-1%)

Context Window

262K tokens (larger than 62% of models)

Max Output

82K tokens (31% of context)

Benchmarks

MMLU-Pro: Not evaluated
GPQA Diamond: 84.5%
HLE: 19.7%
LiveCodeBench: Not evaluated
SciCode: 37.7%
TerminalBench Hard: 26.5%
MATH-500: Not evaluated
AIME: Not evaluated
AIME 2025: Not evaluated
IFBench: 72.5%
Long Context Recall: 62.7%
Tau2: 89.2%
apache-2.0 · 35B · GGUF / GPTQ / AWQ
Downloads: 2.0M
Likes: 1.4K
VRAM (FP16): 48-80 GB
GPU: A100 80GB
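
As a rough sanity check on the VRAM figure, the memory needed for the weights alone can be estimated from the parameter count and the bits per weight. This is a back-of-the-envelope sketch, not the page's method: it ignores the KV cache, activations, and runtime overhead, and the 4-bit case is an assumed example (the listing names GGUF / GPTQ / AWQ formats but not their bit widths). Although only about 3B parameters are active per token in a 35B-A3B mixture-of-experts model, all 35B weights still have to be resident.

```python
def weight_memory_gb(params_billion, bits_per_weight):
    """Approximate memory for model weights only, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


if __name__ == "__main__":
    print(weight_memory_gb(35, 16))  # FP16 weights -> 70.0
    print(weight_memory_gb(35, 4))   # assumed 4-bit quantization -> 17.5
```

The FP16 estimate of 70 GB falls inside the listed 48-80 GB range and is consistent with a single A100 80GB, leaving limited headroom for long contexts.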