Qwen: Qwen3 Next 80B A3B Instruct

Alibaba·Released 2025-09-11
Open Source · 80B · 262K ctx · Apache 2.0

About

Qwen3-Next-80B-A3B-Instruct is an instruction-tuned chat model in the Qwen3-Next series optimized for fast, stable responses without “thinking” traces. It targets complex tasks across reasoning, code generation, knowledge QA, and multilingual...

Pricing

Input

$0.09

per 1M tokens

Output

$1.10

per 1M tokens

Blended

$0.34

per 1M tokens

Cheaper than 60% of models. Median price is $0.54/1M tokens.

Cost Calculator

Tokens per day: 1M (range 100K to 100M)

Daily

$0.34

Monthly

$10.28
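
The pricing figures above are consistent with a blended price that weights input and output tokens 3:1 and with a 30-day month. Both are assumptions inferred from the listed numbers, not something the page states. A minimal sketch of that arithmetic, with hypothetical function names:

```python
# Sketch of the cost arithmetic, assuming a 3:1 input:output blend
# and a 30-day month (both inferred from the listed figures).

INPUT_PRICE = 0.09   # USD per 1M input tokens (from the listing)
OUTPUT_PRICE = 1.10  # USD per 1M output tokens (from the listing)


def blended_price(input_price, output_price, input_parts=3, output_parts=1):
    """Weighted average price per 1M tokens for a given input:output mix."""
    total_parts = input_parts + output_parts
    return (input_parts * input_price + output_parts * output_price) / total_parts


def cost_for_tokens(price_per_million, tokens):
    """Cost in USD for a number of tokens at a per-1M-token price."""
    return price_per_million * tokens / 1_000_000


blended = blended_price(INPUT_PRICE, OUTPUT_PRICE)  # about 0.3425
daily = cost_for_tokens(blended, 1_000_000)         # 1M tokens per day
monthly = daily * 30                                # assumed 30-day month

print(f"blended ${blended:.4f}  daily ${daily:.4f}  monthly ${monthly:.3f}")
```

Under these assumptions the blended price comes to about $0.3425 per 1M tokens and the monthly cost to about $10.275, which match the listed $0.34 and $10.28 after rounding. A workload with a different input:output mix will see a different effective price.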

vs. Similar Models

Qwen: Qwen3 Coder 30B A3B Instruct · Q: -0.1 · $0.12 · -65%
QwQ 32B · Q: -0.3 · $0.74 · +118%
Qwen3 235B A22B (Reasoning) · Q: -0.3 · $2.63 · +666%
Qwen3 VL 30B A3B (Reasoning) · Q: -0.4 · $0.34 · -1%

Performance

189

tokens/sec

Faster than 85% of models

1.11

seconds

Faster than 50% of models

1.11

seconds

Faster than 66% of models

Market Median

94 tok/s

100% faster

Median TTFT

1.11s

1% faster

Throughput/Dollar

551

tok/s per $/1M
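
The unit "tok/s per $/1M" suggests the throughput-per-dollar figure is output speed divided by the blended price. The sketch below assumes that definition and a blended price of about $0.3425 per 1M tokens (a 3:1 input:output weighting of the listed prices, itself an assumption). It also shows the speed gap to the market median. Variable names are illustrative only.

```python
# Sketch of the derived performance metrics, assuming
# throughput/dollar = tokens per second / blended price per 1M tokens.

TOKENS_PER_SEC = 189         # from the listing
MARKET_MEDIAN_TOK_S = 94     # from the listing
BLENDED_PRICE = (3 * 0.09 + 1 * 1.10) / 4  # assumed 3:1 blend, about 0.3425

throughput_per_dollar = TOKENS_PER_SEC / BLENDED_PRICE
speedup_vs_median = TOKENS_PER_SEC / MARKET_MEDIAN_TOK_S - 1

print(f"throughput/dollar: {throughput_per_dollar:.1f} tok/s per $/1M")
print(f"faster than median by: {speedup_vs_median:.1%}")
```

With the rounded inputs shown on the page this gives a value a little under 552 and a speedup of roughly 101%, close to the listed 551 and 100%. The small differences are expected if the page computes from unrounded values.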

Speed Comparison

Gemini 3 Flash Preview (Non-reasoning) · 188 tok/s · -0%
Step 3.5 Flash · 189 tok/s · +0%
Nova 2.0 Lite (high) · 188 tok/s · -0%

Context Window

262K

tokens

Larger than 62% of models

Max Output

16K

tokens

6% of context

Benchmarks

MMLU-Pro
81.9%
GPQA Diamond
73.8%
HLE
7.3%
LiveCodeBench
68.4%
SciCode
30.7%
TerminalBench Hard
7.6%
MATH-500
Not evaluated
AIME
Not evaluated
AIME 2025
66.3%
IFBench
39.7%
Long Context Recall
51.3%
Tau2
21.6%
apache-2.0 · 80B · GGUF / GPTQ / AWQ
Downloads

777.1K

Likes

951

VRAM (FP16)

Multi-GPU

GPU

8x A100 / H100
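
As a rough check on why FP16 serving is listed as multi-GPU, the weights alone can be estimated from the parameter count. This is a back-of-the-envelope sketch: it assumes exactly 80B parameters at 2 bytes each and ignores KV cache, activations, and framework overhead, all of which add to the real requirement.

```python
# Weights-only memory estimate, assuming 80B parameters stored in FP16.
# KV cache, activations, and runtime overhead are NOT included.

PARAMS = 80_000_000_000   # 80B total parameters (from the listing)
BYTES_PER_PARAM_FP16 = 2  # FP16 uses 2 bytes per parameter

weights_gb = PARAMS * BYTES_PER_PARAM_FP16 / 1e9  # decimal gigabytes

print(f"FP16 weights: {weights_gb:.0f} GB")
```

About 160 GB of weights exceeds the memory of any single A100 or H100, so the model has to be sharded across several GPUs. The quantized formats listed above (GGUF, GPTQ, AWQ) reduce this footprint.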
