
DeepSeek R1 Distill Qwen 32B

DeepSeek · Released 2025-01-20
Open Source · 32B · 128K ctx · MIT · MoE

About

DeepSeek R1 Distill Qwen 32B is a distilled large language model based on [Qwen 2.5 32B](https://huggingface.co/Qwen/Qwen2.5-32B) and fine-tuned on outputs from [DeepSeek R1](/deepseek/deepseek-r1). It outperforms OpenAI's o1-mini across various benchmarks, achieving new...

Pricing

Input: $0.29 per 1M tokens

Output: $0.29 per 1M tokens

Blended: $0.29 per 1M tokens

Cheaper than 63% of models. Median price is $0.54/1M tokens.

Cost Calculator

Tokens per day: 1M (range 100K to 100M)

Daily: $0.29

Monthly: $8.70
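
The calculator figures can be reproduced with simple arithmetic. The sketch below is illustrative only: the function name `token_cost` is hypothetical, and the 30-day month is an assumption inferred from the listed monthly total, not something the page states.

```python
def token_cost(tokens: int, price_per_million: float) -> float:
    """Cost in dollars for `tokens` tokens at a flat price per 1M tokens."""
    return tokens / 1_000_000 * price_per_million


PRICE_PER_M = 0.29       # blended price from the listing, $ per 1M tokens
TOKENS_PER_DAY = 1_000_000
DAYS_PER_MONTH = 30      # assumption: the monthly figure uses a 30-day month

daily = token_cost(TOKENS_PER_DAY, PRICE_PER_M)
monthly = daily * DAYS_PER_MONTH

print(f"daily=${daily:.2f} monthly=${monthly:.2f}")
```

Because input, output, and blended prices are all listed at the same rate, the input/output split does not change the total here; with differing rates, each side would need its own `token_cost` call.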

vs. Similar Models

Z.ai: GLM 4.6V, Q: 0.0, $0.45 (+55%)
Qwen: Qwen3 VL 32B Instruct, Q: +0.1, $0.18 (-37%)
Ministral 3 14B, Q: +0.1, $0.20 (-31%)
Qwen3 235B A22B (Non-reasoning), Q: -0.1, $0.79 (+171%)

Performance

43 tokens/sec (faster than 9% of models)

0.45 seconds to first token (faster than 91% of models)

47.11 seconds (faster than 7% of models)

Market Median: 94 tok/s (this model is 55% slower)

Median TTFT: 1.11s (this model is 60% faster)

Throughput/Dollar: 148 tok/s per $/1M
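
As a rough check, the comparisons against the market median and the throughput-per-dollar figure can be recomputed from the numbers shown. This is a sketch under assumptions: it treats the 0.45-second figure as time to first token (TTFT), uses the $0.29 blended price as the divisor, and the helper name `relative_diff` is hypothetical. Because the displayed inputs are already rounded, the percentages may land within about a point of the listed values.

```python
def relative_diff(value: float, baseline: float) -> float:
    """Signed fractional difference of `value` relative to `baseline`."""
    return (value - baseline) / baseline


SPEED = 43.0           # tokens/sec, from the listing
MEDIAN_SPEED = 94.0    # market median tokens/sec
TTFT = 0.45            # seconds; assumed to be time to first token
MEDIAN_TTFT = 1.11     # seconds
PRICE_PER_M = 0.29     # blended $ per 1M tokens

speed_gap = relative_diff(SPEED, MEDIAN_SPEED)   # negative means slower
ttft_gap = relative_diff(TTFT, MEDIAN_TTFT)      # negative means lower latency
throughput_per_dollar = SPEED / PRICE_PER_M      # tok/s per $/1M

print(f"speed vs median: {speed_gap:+.1%}")
print(f"TTFT vs median: {ttft_gap:+.1%}")
print(f"throughput per dollar: {throughput_per_dollar:.0f}")
```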

Speed Comparison

QwQ 32B-Preview, 43 tok/s (+1%)
Claude Opus 4.7 (Non-reasoning, High Effort), 42 tok/s (-2%)
OpenAI: GPT-4, 42 tok/s (-2%)

Context Window

128K tokens (larger than 16% of models)

Max Output

33K tokens (26% of context)

Benchmarks

MMLU-Pro: 73.9%
GPQA Diamond: 61.5%
HLE: 5.5%
LiveCodeBench: 27.0%
SciCode: 37.6%
TerminalBench Hard: Not evaluated
MATH-500: 94.1%
AIME: 68.7%
AIME 2025: 63.0%
IFBench: 22.9%
Long Context Recall: 9.7%
Tau2: Not evaluated
MIT · 32B · GGUF / GPTQ / AWQ
Downloads: 843.8K

Likes: 1.6K

VRAM (FP16): 24-48 GB

GPU: A6000 / M3 Ultra
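
For a back-of-envelope sense of memory needs, weight storage can be estimated as parameter count times bytes per weight. The sketch below is an approximation under stated assumptions: it counts weights only (no KV cache, activations, or runtime overhead), uses 1 GB = 1e9 bytes, and the function name `weight_memory_gb` is hypothetical. Real GGUF, GPTQ, and AWQ files carry extra metadata and mixed precisions, so actual sizes will differ.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    bytes_total = params_billions * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9


PARAMS_B = 32  # parameter count from the listing, in billions

# Assumed precisions for illustration: FP16, 8-bit, and 4-bit weights.
for label, bits in [("FP16", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: ~{weight_memory_gb(PARAMS_B, bits):.0f} GB")
```

By this estimate, unquantized FP16 weights alone come to roughly 64 GB, so the 24-48 GB range shown appears closer to what quantized weights would need. Treat the range as a rough guide and check the size of the specific file you plan to load.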
