
NVIDIA: Llama 3.1 Nemotron Ultra 253B v1

NVIDIA·Released 2025-04-08
Open Source · 131K ctx

About

Llama-3.1-Nemotron-Ultra-253B-v1 is a large language model (LLM) optimized for advanced reasoning, human-interactive chat, retrieval-augmented generation (RAG), and tool-calling tasks. Derived from Meta’s Llama-3.1-405B-Instruct, it has been significantly customized using Neural...

Pricing

Input

$0.60

per 1M tokens

Output

$1.80

per 1M tokens

Blended

$0.90

per 1M tokens

Cheaper than 38% of models. Median price is $0.54/1M tokens.
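
The blended figure above is consistent with a 3:1 input-to-output token weighting, although the page does not state which ratio it uses. A minimal Python sketch under that assumption (the function name `blended_price` and the default weights are illustrative, not taken from the listing):

```python
def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted average price per 1M tokens.

    The 3:1 default weighting is an assumption; the listing does not
    say how its blended price is derived.
    """
    total_weight = input_weight + output_weight
    weighted_sum = input_price * input_weight + output_price * output_weight
    return weighted_sum / total_weight


# Listed prices: $0.60 input, $1.80 output (per 1M tokens).
blended = blended_price(0.60, 1.80)
print(f"${blended:.2f} per 1M tokens")  # prints "$0.90 per 1M tokens"
```

A workload with a different input/output mix would pass its own weights, for example `blended_price(0.60, 1.80, 1, 1)` for an even split.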

Cost Calculator

Tokens per day: 1M (range: 100K to 100M)

Daily

$0.90

Monthly

$27.00
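
The calculator figures can be reproduced with simple arithmetic. The sketch below assumes the blended price of $0.90 per 1M tokens and a 30-day month (inferred from $27.00 / $0.90; the page does not state the month length). The names `daily_cost` and `monthly_cost` are hypothetical helpers, not part of any API.

```python
BLENDED_PRICE_PER_MILLION = 0.90  # USD, blended price from the listing
DAYS_PER_MONTH = 30               # assumption: inferred from $27.00 / $0.90


def daily_cost(tokens_per_day: int,
               price_per_million: float = BLENDED_PRICE_PER_MILLION) -> float:
    """Cost in USD for one day at the given token volume."""
    return tokens_per_day / 1_000_000 * price_per_million


def monthly_cost(tokens_per_day: int,
                 price_per_million: float = BLENDED_PRICE_PER_MILLION,
                 days: int = DAYS_PER_MONTH) -> float:
    """Cost in USD for `days` days at a constant daily token volume."""
    return daily_cost(tokens_per_day, price_per_million) * days


# The calculator's range runs from 100K to 100M tokens per day.
for tokens in (100_000, 1_000_000, 100_000_000):
    print(f"{tokens:>11,} tokens/day: "
          f"${daily_cost(tokens):,.2f} daily, ${monthly_cost(tokens):,.2f} monthly")
```

Real usage rarely matches the blended ratio exactly, so an estimate built from separate input and output token counts will be closer to an actual bill.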

vs. Similar Models

Z.ai: GLM 4.5V
$0.90 (0% difference)
AlfredPros: CodeLLaMa 7B Instruct Solidity
$0.90 (0% difference)
EleutherAI: Llemma 7b
$0.90 (0% difference)
Morph: Morph V3 Fast
$0.90 (0% difference)

Performance

Context Window

131K

tokens

Larger than 27% of models

Context Window Comparison

DeepSeek: DeepSeek V3.2
131K (Same)
OpenAI: gpt-oss-120b
131K (Same)
MoonshotAI: Kimi K2 0711
131K (Same)
