The Qwen3.5 122B-A10B native vision-language model is built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency. In terms of...
Input: $0.26 per 1M tokens
Output: $2.08 per 1M tokens
Blended: $0.72 per 1M tokens
Cheaper than 46% of models. Median price is $0.56/1M tokens.
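The page does not say how the blended price is derived. A minimal sketch, assuming the common 3:1 input-to-output token weighting (an assumption, not something stated here), reproduces the listed figure to within rounding and also shows how the comparison against the median price can be computed. The function name `blended_price` is hypothetical.

```python
def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted average price per 1M tokens.

    The 3:1 input:output default is an assumption; the page does not
    state which weighting it uses.
    """
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


# Prices listed on the page, in USD per 1M tokens.
INPUT_PRICE = 0.26
OUTPUT_PRICE = 2.08
MEDIAN_PRICE = 0.56

blended = blended_price(INPUT_PRICE, OUTPUT_PRICE)

# How far the blended price sits above the median, in percent.
premium_pct = (blended / MEDIAN_PRICE - 1) * 100

print(f"blended: ${blended:.3f} per 1M tokens")
print(f"vs. median: {premium_pct:+.1f}%")
```

Under that weighting the result is about $0.715, which rounds to the listed $0.72, and the premium over the $0.56 median comes out just under 28%.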
Daily: $0.72
Monthly: $21.45
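The Daily and Monthly figures are consistent with a workload of 1M blended tokens per day over a 30-day month, priced at an unrounded blended rate of $0.715 per 1M tokens. None of these inputs is stated on the page, so the sketch below treats all three as assumptions; `projected_cost` is a hypothetical helper.

```python
# All three constants are assumptions inferred from the listed figures.
BLENDED_PRICE_PER_M = 0.715   # USD per 1M tokens, before rounding to $0.72
TOKENS_PER_DAY = 1_000_000    # assumed daily workload
DAYS_PER_MONTH = 30           # assumed month length


def projected_cost(tokens_per_day: int, price_per_million: float, days: int) -> float:
    """Cost in USD of running `tokens_per_day` tokens for `days` days."""
    return tokens_per_day / 1_000_000 * price_per_million * days


daily = projected_cost(TOKENS_PER_DAY, BLENDED_PRICE_PER_M, 1)
monthly = projected_cost(TOKENS_PER_DAY, BLENDED_PRICE_PER_M, DAYS_PER_MONTH)

print(f"daily:   ${daily:.3f}")
print(f"monthly: ${monthly:.2f}")
```

To estimate a different workload, change `TOKENS_PER_DAY`; the cost scales linearly with token volume.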
Speed: 161 tokens/sec (faster than 82% of models)
TTFT: 1.07 seconds (faster than 50% of models)
13.47 seconds (faster than 34% of models)
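Time to first token and generation speed combine into an end-to-end response time. The sketch below shows the arithmetic for a 2,000-token response; that length is an illustrative assumption. The result lands close to the unlabeled 13.47-second figure, which may be a similar end-to-end measurement, though the page does not say so.

```python
def response_time(ttft_s: float, tokens_per_s: float, output_tokens: int) -> float:
    """Estimated seconds until the full response has been generated.

    Simple model: wait for the first token, then stream the rest at a
    constant rate. Real latency also depends on prompt length and load.
    """
    return ttft_s + output_tokens / tokens_per_s


TTFT_S = 1.07          # listed time to first token
SPEED_TOK_S = 161      # listed output speed
OUTPUT_TOKENS = 2000   # assumption: illustrative response length

estimate = response_time(TTFT_S, SPEED_TOK_S, OUTPUT_TOKENS)
print(f"estimated response time: {estimate:.2f} s")
```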
Market Median: 86 tok/s (this model is 88% faster)
Median TTFT: 1.07s (0% slower)
Throughput/Dollar: 226 tok/s per $/1M
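The two derived figures here can be recomputed from the rounded values shown on the page. This is a sketch of the arithmetic, not the site's actual method; the helper names are hypothetical, and the results differ slightly from the listed 88% and 226, presumably because the page works from unrounded inputs.

```python
def percent_faster(speed: float, baseline: float) -> float:
    """How much faster `speed` is than `baseline`, in percent."""
    return (speed / baseline - 1) * 100


def throughput_per_dollar(speed_tok_s: float, price_per_million: float) -> float:
    """Output speed divided by price: tok/s per dollar-per-1M-tokens."""
    return speed_tok_s / price_per_million


SPEED_TOK_S = 161      # listed output speed
MEDIAN_TOK_S = 86      # listed market median speed
BLENDED_PRICE = 0.72   # listed blended price, USD per 1M tokens

print(f"faster than median: {percent_faster(SPEED_TOK_S, MEDIAN_TOK_S):.1f}%")
print(f"throughput/dollar:  {throughput_per_dollar(SPEED_TOK_S, BLENDED_PRICE):.1f}")
```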
Context Window: 262K tokens (larger than 66% of models)
Max Output: 66K tokens (25% of context)
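In practice these two limits bound how many tokens a single request can generate. The sketch below assumes that "262K" and "66K" stand for 262,144 and 65,536 tokens (which gives exactly the listed 25% ratio) and that prompt and output share the context window; both are assumptions, and `max_completion_tokens` is a hypothetical helper.

```python
CONTEXT_WINDOW = 262_144   # assumption: "262K" means 2**18 tokens
MAX_OUTPUT = 65_536        # assumption: "66K" means 2**16 tokens


def max_completion_tokens(prompt_tokens: int) -> int:
    """Largest output length that fits alongside a prompt of the given size.

    Assumes prompt and output tokens share one context window.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = CONTEXT_WINDOW - prompt_tokens
    # Capped by the output limit above, and by zero when the prompt overflows.
    return max(0, min(MAX_OUTPUT, remaining))


for prompt in (1_000, 250_000, 300_000):
    print(prompt, "->", max_completion_tokens(prompt))
```

A short prompt is limited by the output cap, a long prompt by whatever context remains, and an oversized prompt leaves no room at all.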
877.7K
537
Multi-GPU: 8x A100 / H100
Quality Index: 41.6 (65th of 507, top 13%)
Coding Index: 34.7 (85th of 417, top 21%)
Price/1M: $0.72 (344th cheapest, 28% above median, top 54%)
Speed: 161 tok/s (top 18%)
TTFT: 1.07s
Context Window: 262K (91st largest, top 34%)
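The "top N%" labels can be reproduced from a rank and a total. The sketch below rounds up to the next whole percent; that rounding rule is an assumption, chosen because it matches both rows where the page gives a total (65th of 507 and 85th of 417). The function name `top_percent` is hypothetical.

```python
import math


def top_percent(rank: int, total: int) -> int:
    """Percentile bracket for a 1-based rank, rounded up to a whole percent.

    Rounding up is an assumption inferred from the listed figures.
    """
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return math.ceil(rank / total * 100)


print("Quality Index:", f"top {top_percent(65, 507)}%")
print("Coding Index: ", f"top {top_percent(85, 417)}%")
```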