The Qwen3.5 series 397B-A17B native vision-language model is built on a hybrid architecture that combines a linear attention mechanism with a sparse mixture-of-experts design for higher inference efficiency. It delivers state-of-the-art performance comparable to leading models across a wide range of tasks, including language understanding, logical reasoning, code generation, agent-based tasks, image understanding, video understanding, and graphical user interface (GUI) interaction. With its robust code-generation and agent capabilities, the model generalizes strongly across diverse agentic scenarios.
Quality Index: 45.0 (21st of 444, Top 5%)
Coding Index: 41.3 (24th of 354, Top 7%)
Price (blended): $1.35 per 1M tokens (519th cheapest, 350% above median, Top 76%)
Speed: 54 tokens/sec (Top 45%)
TTFT: 1.46 s
Context Window: 262K tokens (61st largest, Top 25%)
Input: $0.60 per 1M tokens
Output: $3.60 per 1M tokens
Blended: $1.35 per 1M tokens
Cheaper than 24% of models; median price is $0.30 per 1M tokens.
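The blended figure can be reproduced from the input and output prices under a 3:1 input-to-output token mix. The page does not state the weighting, so that ratio is an assumption here, but it does reproduce the listed $1.35:

```python
# Sketch: reconstruct the blended per-1M-token price from input/output prices.
# The 3:1 input:output token mix (input_share = 0.75) is an assumption;
# the page does not state the weighting.
INPUT_PRICE = 0.60   # $ per 1M input tokens
OUTPUT_PRICE = 3.60  # $ per 1M output tokens

def blended_price(input_share: float) -> float:
    """Weighted per-1M-token price for a given input share of traffic."""
    return input_share * INPUT_PRICE + (1.0 - input_share) * OUTPUT_PRICE

print(blended_price(0.75))  # 3 input tokens per 1 output token
```

With a heavier output mix the effective price rises quickly, since output tokens cost 6x input tokens here.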
Daily: $1.35, Monthly: $40.50
Throughput: 54 tokens/sec (faster than 55% of models)
TTFT: 1.46 seconds (faster than 19% of models)
End-to-end response time: 38.45 seconds (faster than 6% of models)
Market median speed: 45 tok/s (this model is 19% faster)
Median TTFT: 0.42 s (this model is 249% slower)
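The percentage gaps against the market medians follow directly from the raw numbers on this page; the small differences from the figures shown (20% vs 19% faster, 248% vs 249% slower) presumably come from unrounded underlying values. A quick check:

```python
# Check the median comparisons from the raw numbers listed on this page.
speed, median_speed = 54.0, 45.0   # tokens/sec
ttft, median_ttft = 1.46, 0.42     # seconds

pct_faster = (speed / median_speed - 1.0) * 100.0   # throughput vs. median
pct_slower = (ttft / median_ttft - 1.0) * 100.0     # TTFT vs. median

print(f"{pct_faster:.0f}% faster, {pct_slower:.0f}% slower")
```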
Throughput/Dollar: 40 tok/s per $/1M
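The throughput-per-dollar metric is simply the measured throughput divided by the blended price, which reproduces the 40 figure:

```python
# Throughput/Dollar: tokens/sec of throughput per $ of blended per-1M-token price.
throughput = 54.0   # tokens/sec
blended = 1.35      # $ per 1M tokens

tps_per_dollar = throughput / blended
print(round(tps_per_dollar))
```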
Context Window: 262K tokens (larger than 75% of models)
Max Output: 66K tokens (25% of context)
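The "25% of context" figure checks out against the two token counts, using 262K and 66K as round numbers since the exact limits are not shown on the page:

```python
# Max output as a share of the context window, from the rounded figures above.
# 262_000 and 66_000 are approximations; exact limits are not shown on the page.
context_window = 262_000   # tokens
max_output = 66_000        # tokens

share = max_output / context_window
print(f"{share:.0%}")
```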
Hardware: Multi-GPU (8x A100 / H100)