Qwen: Qwen3.5 397B A17B

Alibaba · Released 2026-02-16
Open Source · 397B · 256K ctx · Apache 2.0 · Multimodal

About

The Qwen3.5 series 397B-A17B native vision-language model is built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency. It delivers...

Pricing

Input: $0.39 per 1M tokens
Output: $2.45 per 1M tokens
Blended: $0.90 per 1M tokens

Cheaper than 37% of models. Median price is $0.54/1M tokens.
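
The page does not say how the blended price is derived. One common convention is a weighted average of three input tokens to one output token. The sketch below assumes that 3:1 weighting (an assumption, not something stated here) and applies it to the listed prices. The function name and parameters are illustrative. The result is about $0.905, close to the listed $0.90 but not identical, so the actual weighting or the unrounded prices may differ.

```python
def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted average price per 1M tokens.

    The default 3:1 input:output weighting is an assumption; the page
    does not state which ratio it uses.
    """
    total_weight = input_weight + output_weight
    return (input_price * input_weight + output_price * output_weight) / total_weight


# Listed prices per 1M tokens: $0.39 input, $2.45 output.
blended = blended_price(0.39, 2.45)
print(f"${blended:.3f} per 1M tokens")
```

Changing the weights shows how sensitive the blended figure is to the traffic mix: an output-heavy workload moves the effective price toward $2.45.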

Cost Calculator

Tokens per day: 1M (adjustable from 100K to 100M)

Daily: $0.90
Monthly: $27.04
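
The calculator's arithmetic can be approximated as follows. This is a minimal sketch that assumes cost scales linearly with tokens at the blended price and that a month is 30 days; both are assumptions, and the function names are illustrative. With the rounded $0.90 price it yields $27.00 per month, slightly below the page's $27.04, which suggests the page uses an unrounded price or a different month length.

```python
def daily_cost(tokens_per_day: float, price_per_million: float) -> float:
    """Cost of one day of traffic at a flat price per 1M tokens."""
    return tokens_per_day / 1_000_000 * price_per_million


def monthly_cost(tokens_per_day: float, price_per_million: float,
                 days_per_month: int = 30) -> float:
    """Monthly cost; the 30-day month is an assumption."""
    return daily_cost(tokens_per_day, price_per_million) * days_per_month


# 1M tokens per day at the listed blended price of $0.90 per 1M tokens.
daily = daily_cost(1_000_000, 0.90)
monthly = monthly_cost(1_000_000, 0.90)
print(f"Daily: ${daily:.2f}, Monthly: ${monthly:.2f}")
```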

vs. Similar Models

MiniMax: MiniMax M2.5 · Q: 0.0 · $0.21 · -77%
Claude 4.1 Opus (Reasoning) · Q: 0.0 · $30.00 · +3229%
GPT-5 (medium) · Q: 0.0 · $3.44 · +281%
Qwen: Qwen3.5-27B · Q: +0.1 · $0.54 · -40%

Performance

52 tokens/sec · Faster than 18% of models
1.68 seconds (time to first token) · Faster than 30% of models
63.15 seconds · Faster than 4% of models

Market Median: 94 tok/s · 45% slower
Median TTFT: 1.11s · 51% slower
Throughput/Dollar: 58 tok/s per $/1M
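
The comparison figures above appear to follow from simple ratios. The sketch below is inferred from the numbers rather than stated by the page, and it assumes that the 1.68-second figure is this model's time to first token and that Throughput/Dollar divides output speed by the blended price. Function names are illustrative. Under those assumptions the listed values of 45%, 51% and 58 are reproduced after rounding.

```python
def percent_slower_throughput(value: float, median: float) -> float:
    """How far a tokens/sec figure falls below the median, in percent."""
    return (median - value) / median * 100


def percent_slower_latency(value: float, median: float) -> float:
    """How far a latency figure exceeds the median, in percent."""
    return (value - median) / median * 100


def throughput_per_dollar(tokens_per_sec: float, price_per_million: float) -> float:
    """Tokens/sec divided by price per 1M tokens (assumed definition)."""
    return tokens_per_sec / price_per_million


speed_gap = percent_slower_throughput(52, 94)    # model vs. market median speed
ttft_gap = percent_slower_latency(1.68, 1.11)    # assumed TTFT vs. median TTFT
value = throughput_per_dollar(52, 0.90)          # speed vs. blended price

print(round(speed_gap), round(ttft_gap), round(value))
```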

Speed Comparison

Llama 3.1 Nemotron Ultra 253B v1 (Reasoning) · 52 tok/s · -0%
Claude 4.5 Sonnet (Reasoning) · 52 tok/s · -0%
Ling-flash-2.0 · 52 tok/s · +0%

Context Window

256K tokens · Larger than 58% of models

Max Output

64K tokens · 25% of context

Benchmarks

MMLU-Pro: Not evaluated
GPQA Diamond: 89.3%
HLE: 27.3%
LiveCodeBench: Not evaluated
SciCode: 42.0%
TerminalBench Hard: 40.9%
MATH-500: Not evaluated
AIME: Not evaluated
AIME 2025: Not evaluated
IFBench: 78.8%
Long Context Recall: 65.7%
Tau2: 95.6%

apache-2.0 · 397B · GGUF / GPTQ / AWQ

Downloads: 590.0K
Likes: 1.5K
VRAM (FP16): Multi-GPU
GPU: 8x A100 / H100
