DeepSeek: R1 Distill Llama 70B

DeepSeek · Released 2025-01-23

Open Source · 131K ctx

About

DeepSeek R1 Distill Llama 70B is a distilled large language model based on [Llama-3.3-70B-Instruct](/meta-llama/llama-3.3-70b-instruct) and fine-tuned on outputs from [DeepSeek R1](/deepseek/deepseek-r1). The model combines advanced distillation techniques to achieve high performance across...

Quality Index

16.0

302nd of 507

Top 60%

Coding Index

11.4

300th of 417

Top 72%

Math Index

53.7

133rd of 269

Top 50%
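
The "Top N%" figures on the three index cards appear to be the rank divided by the field size, rounded up. The sketch below assumes that rounding rule, which is inferred from the listed numbers and not documented on the page; `top_percent` is a hypothetical helper name.

```python
import math


def top_percent(rank: int, total: int) -> int:
    """Share of the field at or above this rank, rounded up (assumed rule)."""
    return math.ceil(rank / total * 100)


# Rank and field size exactly as listed on the index cards.
cards = {"Quality": (302, 507), "Coding": (300, 417), "Math": (133, 269)}

for name, (rank, total) in cards.items():
    print(f"{name}: Top {top_percent(rank, total)}%")
```

Under this assumption the three cards come out as Top 60%, Top 72%, and Top 50%, matching the listing.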

Price/1M

$0.72

346th cheapest

29% above median

Top 54%

Speed

44 tok/s

Top 85%

TTFT

0.38s

Context Window

131K

201st largest

Top 67%

Market Position

Chart: DeepSeek: R1 Distill Llama 70B vs. Market Average

Pricing

Input

$0.70

per 1M tokens

Output

$0.80

per 1M tokens

Blended

$0.72

per 1M tokens

Cheaper than 46% of models. Median price is $0.56/1M tokens.

Cost Calculator

Tokens per day: 1M (adjustable from 100K to 100M)

Daily

$0.72

Monthly

$21.75
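
The pricing and calculator figures can be reproduced with a short calculation. This is a sketch under two assumptions that the page does not state: the blended price weights input and output tokens 3:1, and a month is 30 days. Both were chosen because they are consistent with the listed $0.72 blended price and $21.75 monthly cost; the function names are illustrative.

```python
def blended_price(input_price: float, output_price: float,
                  input_parts: int = 3, output_parts: int = 1) -> float:
    """Weighted average price per 1M tokens (3:1 input:output is an assumption)."""
    total_parts = input_parts + output_parts
    return (input_parts * input_price + output_parts * output_price) / total_parts


def daily_cost(tokens_per_day: int, price_per_million: float) -> float:
    """Cost in dollars for one day of usage at a per-1M-token price."""
    return tokens_per_day / 1_000_000 * price_per_million


blended = blended_price(0.70, 0.80)      # listed input and output prices
daily = daily_cost(1_000_000, blended)   # calculator default: 1M tokens per day
monthly = daily * 30                     # assumed 30-day month

print(f"blended ${blended:.3f}/1M, daily ${daily:.3f}, monthly ${monthly:.2f}")
```

The unrounded blended price under these assumptions is $0.725, which is why the monthly figure is $21.75 and not 30 × $0.72 = $21.60.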

vs. Similar Models

Hermes 4 - Llama-3.1 70B (Reasoning) · Q: 0.0

$0.20 · -73%

Ministral 3 14B · Q: 0.0

$0.20 · -72%

Claude 3.5 Sonnet (Oct '24) · Q: -0.1

$6.56 · +805%

Qwen: Qwen3 VL 30B A3B Instruct · Q: +0.1

$0.23 · -69%
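
The percentage next to each price reads as the difference relative to this model's blended price. A minimal sketch, assuming the unrounded blended price is $0.725 (an inference from the $21.75 monthly figure for 1M tokens per day, not a value shown on the page); `pct_diff` is a hypothetical helper.

```python
def pct_diff(other: float, base: float) -> float:
    """Percentage by which `other` differs from `base` (negative means cheaper)."""
    return (other - base) / base * 100


BASE = 0.725  # assumed unrounded blended price of R1 Distill Llama 70B, in $/1M

# Prices as displayed in the comparison list (already rounded by the page).
prices = {
    "Hermes 4 - Llama-3.1 70B (Reasoning)": 0.20,
    "Ministral 3 14B": 0.20,
    "Claude 3.5 Sonnet (Oct '24)": 6.56,
    "Qwen: Qwen3 VL 30B A3B Instruct": 0.23,
}

for name, price in prices.items():
    print(f"{name}: {pct_diff(price, BASE):+.0f}%")
```

This reproduces the +805% shown for Claude 3.5 Sonnet. The cheaper models land within about a point of the listed values, most likely because the displayed prices are rounded to the cent.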

Performance

Speed

44 tokens/sec

Faster than 15% of models

TTFT

0.38 seconds

Faster than 92% of models

45.45 seconds

Faster than 11% of models

Market Median

86 tok/s

48% slower

Median TTFT

1.07s

65% faster

Throughput/Dollar

tok/s per $/1M
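
The unit label suggests that throughput per dollar is output speed divided by the price per 1M tokens. The sketch below assumes that definition and uses the listed 44 tok/s and $0.72 blended price; the page's own value is not shown here, so the result is illustrative only, and `throughput_per_dollar` is a hypothetical name.

```python
def throughput_per_dollar(tokens_per_sec: float, price_per_million: float) -> float:
    """Output speed per dollar of per-1M-token price (assumed definition)."""
    if price_per_million <= 0:
        raise ValueError("price_per_million must be positive")
    return tokens_per_sec / price_per_million


# Listed speed (tok/s) and blended price ($/1M tokens).
value = throughput_per_dollar(44, 0.72)
print(f"{value:.1f} tok/s per $/1M")
```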

Speed Comparison

Microsoft: Phi 4 Mini Instruct

45 tok/s · +0%

Qwen3 Max Thinking (Preview)

45 tok/s · +1%

Qwen: Qwen3 Max Thinking

44 tok/s · -2%

Context Window

131K

tokens

Larger than 33% of models

Max Output

16K

tokens

13% of context

Benchmarks

MMLU-Pro

79.5%

GPQA Diamond

40.2%

HLE

6.1%

LiveCodeBench

26.6%

SciCode

31.2%

TerminalBench Hard

1.5%

MATH-500

93.5%

AIME

67.0%

AIME 2025

53.7%

IFBench

27.6%

Long Context Recall

11.0%

Tau2

21.9%

Open Source

Similar Models

Ministral 3 14B

Mistral

Q: 16.0 · $0.20/1M

Faster: 229% · Cheaper: 72%

Hermes 4 - Llama-3.1 70B (Reasoning)

Nous Research

Q: 16.0 · $0.20/1M

Faster: 76% · Cheaper: 73%

Gemini 1.5 Pro (Sep '24)

Google

Q: 16.0 · Price: N/A

Coding: +12.2

Solar Pro 2 (Preview) (Non-reasoning)

Upstage

Q: 16.0 · Price: N/A

Claude 3.5 Sonnet (Oct '24)

Anthropic

Q: 15.9 · $6.56/1M

Pricier: 805% · Coding: +18.8

Qwen: Qwen3 VL 30B A3B Instruct

Alibaba

Q: 16.1 · $0.23/1M · 131K ctx

Faster: 180% · Cheaper: 69%
