DeepSeek R1 Distill Qwen 32B is a distilled large language model based on [Qwen 2.5 32B](https://huggingface.co/Qwen/Qwen2.5-32B), using outputs from [DeepSeek R1](/deepseek/deepseek-r1). It outperforms OpenAI's o1-mini across various benchmarks, achieving new...
Input: $0.29 per 1M tokens
Output: $0.29 per 1M tokens
Blended: $0.29 per 1M tokens
Cheaper than 63% of models. Median price is $0.56/1M tokens.
Daily: $0.29
Monthly: $8.70
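The page does not state the workload behind the Daily and Monthly figures. They are consistent with 1M tokens per day at the blended rate over a 30-day month, so the sketch below uses those two values as labeled assumptions. The function and constant names are illustrative only.

```python
def token_cost(tokens: int, price_per_million: float) -> float:
    """Return the cost in dollars for `tokens` tokens at a per-1M-token price."""
    return tokens / 1_000_000 * price_per_million


BLENDED_PRICE = 0.29        # $ per 1M tokens, from the pricing figures above
TOKENS_PER_DAY = 1_000_000  # assumption: not stated on the page
DAYS_PER_MONTH = 30         # assumption: not stated on the page

daily = token_cost(TOKENS_PER_DAY, BLENDED_PRICE)
monthly = daily * DAYS_PER_MONTH

print(f"Daily: ${daily:.2f}")
print(f"Monthly: ${monthly:.2f}")
```

Changing `TOKENS_PER_DAY` to your own volume gives a rough budget estimate. Because input and output are priced identically here, the blended rate does not depend on the input/output mix.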
Speed: 43 tokens/sec (faster than 12% of models)
TTFT: 0.45 seconds (faster than 88% of models)
47.11 seconds (faster than 9% of models)
Market Median: 86 tok/s (50% slower)
Median TTFT: 1.07s (58% faster)
Throughput/Dollar: 148 tok/s per $/1M
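The comparison figures can be reproduced from the raw numbers on this page. The sketch below assumes that the percentages are relative differences against the median and that Throughput/Dollar is speed divided by the price per 1M tokens. Both formulas are inferred from the values, not documented by the page, and the helper name is hypothetical.

```python
def percent_difference(value: float, baseline: float) -> float:
    """Signed percent difference of `value` relative to `baseline`."""
    return (value - baseline) / baseline * 100


SPEED = 43.0         # tok/s, this model
MEDIAN_SPEED = 86.0  # tok/s, market median
TTFT = 0.45          # seconds to first token, this model
MEDIAN_TTFT = 1.07   # seconds, median TTFT
PRICE = 0.29         # $ per 1M tokens

speed_gap = percent_difference(SPEED, MEDIAN_SPEED)  # negative: slower than median
ttft_gap = percent_difference(TTFT, MEDIAN_TTFT)     # negative: lower latency than median
throughput_per_dollar = SPEED / PRICE                # assumed formula: tok/s per $/1M

print(f"{abs(speed_gap):.0f}% slower than the median speed")
print(f"{abs(ttft_gap):.0f}% faster than the median TTFT")
print(f"{throughput_per_dollar:.0f} tok/s per $/1M")
```

Under these assumptions the rounded results match the 50%, 58%, and 148 shown above.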
Context Window: 33K tokens (larger than 8% of models)
Max Output: 33K tokens (100% of context)
932.0K
1.6K
24-48 GB
A6000 / M3 Ultra
Quality Index: 17.2 (285th of 507, top 56%)
Math Index: 63.0 (111th of 269, top 41%)
Price/1M: $0.29 (234th cheapest, 48% below median, top 37%)
Speed: 43 tok/s (top 88%)
TTFT: 0.45s
Context Window: 33K (353rd largest, top 92%)