About
DeepSeek R1 Distill Llama 70B is a distilled large language model based on [Llama-3.3-70B-Instruct](/meta-llama/llama-3.3-70b-instruct), using outputs from [DeepSeek R1](/deepseek/deepseek-r1). The model combines advanced distillation techniques to achieve high performance across...
Pricing
Input: $0.80 per 1M tokens
Output: $0.80 per 1M tokens
Blended: $0.80 per 1M tokens
Cheaper than 43% of models. Median price is $0.54/1M tokens.
Cost Calculator
Daily: $0.80
Monthly: $24.00
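The calculator figures are consistent with a workload of 1M tokens per day over a 30-day month at the $0.80 blended rate. That volume is inferred from the numbers, not stated on the page. A minimal Python sketch under those assumptions (the function and constant names are illustrative):

```python
def token_cost(tokens: int, price_per_million: float) -> float:
    """Cost in dollars for `tokens` tokens at a per-1M-token price."""
    return tokens / 1_000_000 * price_per_million

BLENDED_PRICE = 0.80          # $ per 1M tokens, from the pricing figures
TOKENS_PER_DAY = 1_000_000    # assumption: inferred from the $0.80 daily figure
DAYS_PER_MONTH = 30           # assumption: inferred from $24.00 / $0.80

daily = token_cost(TOKENS_PER_DAY, BLENDED_PRICE)
monthly = daily * DAYS_PER_MONTH
print(f"Daily: ${daily:.2f}, Monthly: ${monthly:.2f}")
```

Changing `TOKENS_PER_DAY` gives an estimate for a different workload. Because input and output are priced the same here, the blended rate does not depend on the input/output mix.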
Performance
Throughput: 62 tokens/sec (faster than 31% of models)
Time to first token (TTFT): 0.43 seconds (faster than 93% of models)
32.55 seconds (faster than 15% of models)
Market median throughput: 94 tok/s (this model is 34% slower)
Median TTFT: 1.10s (this model is 61% faster)
Throughput/Dollar: 78 tok/s per $/1M
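The comparison figures can be reproduced from the raw numbers. The sketch below assumes that the 0.43 s figure is time to first token, that each percentage is measured relative to the market median, and that throughput per dollar is throughput divided by the blended price. The page states none of these formulas, but each matches the published values after rounding.

```python
def pct_diff_from_median(value: float, median: float) -> float:
    """Signed percentage difference of `value` relative to `median`."""
    return (value - median) / median * 100

THROUGHPUT = 62.0          # tokens/sec
MEDIAN_THROUGHPUT = 94.0   # tokens/sec
TTFT = 0.43                # seconds (assumed to be time to first token)
MEDIAN_TTFT = 1.10         # seconds
BLENDED_PRICE = 0.80       # $ per 1M tokens

# Negative throughput gap means slower than the median.
throughput_gap = pct_diff_from_median(THROUGHPUT, MEDIAN_THROUGHPUT)
# Negative latency gap means lower latency, i.e. faster than the median.
ttft_gap = pct_diff_from_median(TTFT, MEDIAN_TTFT)
# Assumed definition: tokens/sec per ($ per 1M tokens).
throughput_per_dollar = THROUGHPUT / BLENDED_PRICE

print(f"throughput gap: {throughput_gap:.1f}%")
print(f"TTFT gap: {ttft_gap:.1f}%")
print(f"throughput per dollar: {throughput_per_dollar:.1f}")
```

Under these assumptions the gaps are about -34.0% and -60.9%, and throughput per dollar is 77.5, which is consistent with the displayed 34%, 61%, and 78.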
Context Window
Context window: 128K tokens (larger than 16% of models)
Max Output: 8K tokens (6% of context)
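The "6% of context" figure is the ratio of max output to context window (8K / 128K = 6.25%). The sketch below reproduces it and also estimates the prompt budget left when the full output allowance is reserved. Two points are assumptions rather than facts from the page: that "K" means 1,024 tokens, and that prompt and output tokens share a single window.

```python
CONTEXT_WINDOW = 128 * 1024  # assumption: "128K" means 131,072 tokens
MAX_OUTPUT = 8 * 1024        # assumption: "8K" means 8,192 tokens

def output_share(max_output: int, context_window: int) -> float:
    """Max output as a percentage of the context window."""
    return max_output / context_window * 100

def max_prompt_tokens(context_window: int, reserved_output: int) -> int:
    """Prompt budget if prompt and output share one window (an assumption)."""
    return context_window - reserved_output

share = output_share(MAX_OUTPUT, CONTEXT_WINDOW)
prompt_budget = max_prompt_tokens(CONTEXT_WINDOW, MAX_OUTPUT)
print(f"{share:.2f}% of context; prompt budget {prompt_budget} tokens")
```

The percentage is the same whether "K" is read as 1,000 or 1,024, since the factor cancels in the ratio.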