About
The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction-tuned generative model with 70B parameters (text in, text out). The Llama 3.3 instruction-tuned, text-only model...
Pricing
Input: $0.10 per 1M tokens
Output: $0.32 per 1M tokens
Blended: $0.15 per 1M tokens
Cheaper than 74% of models. Median price is $0.54/1M tokens.
Cost Calculator
Tokens per day: 1M (adjustable from 100K to 100M)
Daily: $0.15
Monthly: $4.65
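The page does not say how the blended price or the calculator figures are derived. As a rough sketch, the numbers above are consistent with a 3:1 input-to-output token weighting and a 30-day month; both are assumptions, not documented behavior. Under them the blended price comes to about $0.155 per 1M tokens, and 1M tokens per day works out to roughly $4.65 per month. The function names are illustrative.

```python
# Listed prices in USD per 1M tokens (taken from the pricing section).
INPUT_PRICE = 0.10
OUTPUT_PRICE = 0.32


def blended_price(input_price, output_price, input_weight=3, output_weight=1):
    """Weighted average price per 1M tokens.

    The default 3:1 input:output weighting is an assumption; the page
    does not state which ratio it uses.
    """
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


def daily_cost(tokens_per_day, price_per_million):
    """Cost in USD for one day of usage at the given per-1M-token price."""
    return tokens_per_day / 1_000_000 * price_per_million


def monthly_cost(tokens_per_day, price_per_million, days_per_month=30):
    """Cost in USD for a month; the 30-day month is an assumption."""
    return daily_cost(tokens_per_day, price_per_million) * days_per_month


if __name__ == "__main__":
    blended = blended_price(INPUT_PRICE, OUTPUT_PRICE)
    print(f"Blended: ${blended:.3f} per 1M tokens")
    print(f"Daily:   ${daily_cost(1_000_000, blended):.3f}")
    print(f"Monthly: ${monthly_cost(1_000_000, blended):.2f}")
```

If a workload is mostly output tokens, passing a different weighting (for example `input_weight=1, output_weight=1`) gives a more realistic blended figure than the default.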
vs. Similar Models
GLM-4.7-Flash (Non-reasoning)
$0.15 (-1%)
Qwen: Qwen3 30B A3B Thinking 2507
$0.16 (+3%)
MiMo-V2-Flash (Feb 2026)
$0.15 (-3%)
MiMo-V2-Flash (Reasoning)
$0.15 (-3%)
Performance
Context Window: 131K tokens (larger than 27% of models)
Max Output: 16K tokens (13% of context)
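To make the relationship between the two limits concrete, here is a minimal sketch that checks whether a request fits. It assumes the rounded figures stand for 131,072 and 16,384 tokens, and that prompt and output tokens share the same context window; neither is stated on the page, and the function names are illustrative.

```python
CONTEXT_WINDOW = 131_072  # assumed exact value behind the listed "131K"
MAX_OUTPUT = 16_384       # assumed exact value behind the listed "16K"


def output_share(context_window=CONTEXT_WINDOW, max_output=MAX_OUTPUT):
    """Fraction of the context window that the maximum output can occupy."""
    return max_output / context_window


def fits_in_context(prompt_tokens, requested_output_tokens,
                    context_window=CONTEXT_WINDOW, max_output=MAX_OUTPUT):
    """True if the request respects both the output cap and the window size."""
    if requested_output_tokens > max_output:
        return False
    return prompt_tokens + requested_output_tokens <= context_window


if __name__ == "__main__":
    print(f"Output share of context: {output_share():.1%}")
    print(fits_in_context(100_000, 16_384))
    print(fits_in_context(120_000, 16_384))
```

Under these assumptions the output cap is 12.5% of the window, which is in line with the 13% shown above.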
Context Window Comparison
DeepSeek: DeepSeek V3.2
131K (same)
OpenAI: gpt-oss-120b
131K (same)
MoonshotAI: Kimi K2 0711
131K (same)
Open Source
llama3.3, 70B, GGUF / GPTQ / AWQ
Downloads: 940.6K
Likes: 2.8K
VRAM (FP16): 48-80 GB
GPU: A100 80GB
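As a sanity check on memory requirements, the sketch below estimates the space needed for the weights alone at different precisions. It assumes exactly 70 billion parameters and ignores the KV cache, activations, and runtime overhead, so real usage will be higher. On this estimate FP16 weights need about 140 GB, while the listed 48-80 GB range is closer to what 4-bit to 8-bit quantized weights (about 35 to 70 GB) would need once overhead is added.

```python
NUM_PARAMS = 70e9  # "70B" from the listing, taken as exactly 70 billion (assumption)


def weight_memory_gb(num_params, bits_per_param):
    """Weights-only memory in GB (1 GB = 1e9 bytes).

    Excludes KV cache, activations, and framework overhead.
    """
    return num_params * bits_per_param / 8 / 1e9


if __name__ == "__main__":
    for label, bits in [("FP16", 16), ("8-bit", 8), ("4-bit", 4)]:
        print(f"{label:>5}: {weight_memory_gb(NUM_PARAMS, bits):.0f} GB")
```

A single 80 GB GPU is therefore plausible for a quantized build (GGUF, GPTQ, or AWQ), whereas unquantized FP16 inference would typically need the weights split across several GPUs.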