Qwen3-Next-80B-A3B-Instruct is an instruction-tuned chat model in the Qwen3-Next series optimized for fast, stable responses without “thinking” traces. It targets complex tasks across reasoning, code generation, knowledge QA, and multilingual use, while remaining robust on alignment and formatting. Compared with prior Qwen3 instruct variants, it focuses on higher throughput and stability on ultra-long inputs and multi-turn dialogues, making it well-suited for RAG, tool use, and agentic workflows that require consistent final answers rather than visible chain-of-thought. The model employs scaling-efficient training and decoding to improve parameter efficiency and inference speed, and has been validated on a broad set of public benchmarks where it reaches or approaches larger Qwen3 systems in several categories while outperforming earlier mid-sized baselines. It is best used as a general assistant, code helper, and long-context task solver in production settings where deterministic, instruction-following outputs are preferred.
Quality Index: 20.1 (186th of 444, top 42%)
Coding Index: 15.3 (192nd of 354, top 54%)
Math Index: 66.3 (103rd of 268, top 39%)
Price/1M: $0.88 (470th cheapest, 192% above median, top 69%)
Speed: 142 tok/s (top 14%)
TTFT: 0.96s
Context Window: 262K (61st largest, top 25%)
Input: $0.50 per 1M tokens
Output: $2.00 per 1M tokens
Blended: $0.88 per 1M tokens
Cheaper than 31% of models. Median price is $0.30/1M tokens.
Daily: $0.88
Monthly: $26.25
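The page does not say how the blended, daily, and monthly figures are derived. The sketch below shows one plausible derivation, under assumptions that are not stated in the source: a 3:1 input-to-output token weighting, a usage of 1M tokens per day, and a 30-day month. The function name `blended_price` and the constants are hypothetical. Under these assumptions the arithmetic reproduces the listed $0.88 (rounded from $0.875) and $26.25.

```python
def blended_price(input_price, output_price, input_weight=3, output_weight=1):
    """Weighted average price per 1M tokens.

    The 3:1 input:output weighting is an assumption, not something
    the page states.
    """
    total_weight = input_weight + output_weight
    return (input_price * input_weight + output_price * output_weight) / total_weight


INPUT_PRICE = 0.50   # $ per 1M input tokens (from the page)
OUTPUT_PRICE = 2.00  # $ per 1M output tokens (from the page)

MTOK_PER_DAY = 1.0   # assumed usage: 1M tokens per day
DAYS_PER_MONTH = 30  # assumed month length

blended = blended_price(INPUT_PRICE, OUTPUT_PRICE)
daily = blended * MTOK_PER_DAY
monthly = daily * DAYS_PER_MONTH

print(f"blended ${blended:.3f}, daily ${daily:.3f}, monthly ${monthly:.2f}")
# prints: blended $0.875, daily $0.875, monthly $26.25
```

A different traffic mix changes the result noticeably: an output-heavy workload moves the blended price toward $2.00, so the weights are worth adjusting to match real usage.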
142 tokens/sec (faster than 86% of models)
0.96 seconds (faster than 32% of models)
0.96 seconds (faster than 39% of models)
Market Median: 45 tok/s (this model is 214% faster)
Median TTFT: 0.42s (this model is 129% slower)
Throughput/Dollar: 163 tok/s per $/1M
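The relative figures above can be approximated from the rounded values shown on the page. This is a minimal sketch, assuming "X% faster" and "X% slower" both mean the percentage difference relative to the median, and that throughput per dollar is output speed divided by the blended price. The helper `percent_above` is hypothetical, and the blended price of $0.875 assumes a 3:1 input-to-output weighting that the page does not state. Because the inputs are rounded, the results land close to the listed numbers but do not match all of them exactly.

```python
def percent_above(value, baseline):
    """Percentage by which `value` exceeds `baseline`."""
    return (value / baseline - 1) * 100


SPEED = 142.0          # tok/s (from the page)
MEDIAN_SPEED = 45.0    # tok/s (from the page)
TTFT = 0.96            # seconds (from the page)
MEDIAN_TTFT = 0.42     # seconds (from the page)
BLENDED_PRICE = 0.875  # $ per 1M tokens, assuming a 3:1 input:output mix

speed_gain = percent_above(SPEED, MEDIAN_SPEED)   # higher is better
ttft_penalty = percent_above(TTFT, MEDIAN_TTFT)   # higher is worse
throughput_per_dollar = SPEED / BLENDED_PRICE

print(f"speed: {speed_gain:.0f}% above median")
print(f"TTFT: {ttft_penalty:.0f}% above median")
print(f"throughput/dollar: {throughput_per_dollar:.0f} tok/s per $/1M")
```

With these rounded inputs the script gives roughly 216%, 129%, and 162, against the listed 214%, 129%, and 163. The small gaps are consistent with the page computing from unrounded measurements, though that is a guess.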
Context Window: 262K tokens (larger than 75% of models)
777.1K
951
Multi-GPU: 8x A100 / H100