NVIDIA: Llama 3.1 Nemotron Ultra 253B v1
NVIDIA · Released 2025-04-08
Open Source · 131K ctx
About
Llama-3.1-Nemotron-Ultra-253B-v1 is a large language model (LLM) optimized for advanced reasoning, human-interactive chat, retrieval-augmented generation (RAG), and tool-calling tasks. Derived from Meta’s Llama-3.1-405B-Instruct, it has been significantly customized using Neural...
Related Models
NVIDIA: Llama 3.3 Nemotron Super 49B V1.5 · 2025-10-10
Llama Nemotron Super 49B v1.5 (Reasoning) · 2025-07-25
Llama Nemotron Super 49B v1.5 (Non-reasoning) · 2025-07-25
Llama 3.1 Nemotron Nano VL 8B V1 · 2025-06-03
Llama 3.1 Nemotron Nano 4B v1.1 (Reasoning) · 2025-05-20
Llama 3.1 Nemotron Ultra 253B v1 (Reasoning) · 2025-04-07
Llama 3.3 Nemotron Super 49B v1 (Reasoning) · 2025-03-18
Llama 3.3 Nemotron Super 49B v1 (Non-reasoning) · 2025-03-18
Pricing
Input
$0.60
per 1M tokens
Output
$1.80
per 1M tokens
Blended
$0.90
per 1M tokens
Cheaper than 38% of models. Median price is $0.54/1M tokens.
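The listed blended price matches a 3:1 input-to-output token weighting of the input and output prices. The page does not state how it blends the two, so the weighting below is an assumption, and `blended_price` is a hypothetical helper name. A minimal sketch:

```python
def blended_price(input_price: float, output_price: float,
                  input_weight: int = 3, output_weight: int = 1) -> float:
    """Weighted average price per 1M tokens.

    The 3:1 default weighting is an assumption; the page does not
    document how its blended figure is derived.
    """
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


# Listed prices: $0.60 input, $1.80 output per 1M tokens.
print(f"${blended_price(0.60, 1.80):.2f} per 1M tokens")
```

With a different traffic mix (for example, output-heavy workloads), pass other weights to see how the effective price shifts toward the output rate.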
Cost Calculator
Tokens per day: 1M (range 100K to 100M)
Daily
$0.90
Monthly
$27.00
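The calculator figures above can be reproduced with simple arithmetic. This sketch assumes the blended price of $0.90 per 1M tokens and a 30-day month (which is what makes $0.90 per day come to $27.00); `estimate_cost` is a hypothetical helper, not part of any published API.

```python
def estimate_cost(tokens_per_day: int,
                  price_per_million: float = 0.90,
                  days_per_month: int = 30) -> tuple[float, float]:
    """Return (daily_cost, monthly_cost) in dollars.

    Assumes a flat blended price per 1M tokens and a 30-day month.
    """
    daily = tokens_per_day / 1_000_000 * price_per_million
    monthly = daily * days_per_month
    return daily, monthly


daily, monthly = estimate_cost(1_000_000)
print(f"Daily: ${daily:.2f}, Monthly: ${monthly:.2f}")
```

The same function covers the rest of the calculator's range, for example `estimate_cost(100_000)` or `estimate_cost(100_000_000)`.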
vs. Similar Models
Z.ai: GLM 4.5V
$0.90 (0%)
AlfredPros: CodeLLaMa 7B Instruct Solidity
$0.90 (0%)
EleutherAI: Llemma 7b
$0.90 (0%)
Morph: Morph V3 Fast
$0.90 (0%)
Performance
Context Window
131K
tokens
Larger than 27% of models
Context Window Comparison
DeepSeek: DeepSeek V3.2
131K (Same)
OpenAI: gpt-oss-120b
131K (Same)
MoonshotAI: Kimi K2 0711
131K (Same)