About
NVIDIA Nemotron 3 Ultra is an open frontier-reasoning and orchestration model from NVIDIA, with 55B active parameters out of 550B total (MoE). Built on a hybrid Transformer-Mamba mixture-of-experts architecture, it...
Related Models
NVIDIA: Nemotron 3 Ultra (free), 2026-06-04
NVIDIA: Nemotron 3.5 Content Safety (free), 2026-06-04
Nemotron 3 Nano Omni 30B A3B Reasoning, 2026-04-29
NVIDIA: Nemotron 3 Nano Omni (free), 2026-04-28
Nemotron Cascade 2 30B A3B, 2026-03-19
NVIDIA: Nemotron 3 Super, 2026-03-11
NVIDIA: Nemotron 3 Super (free), 2026-03-11
NVIDIA: Nemotron 3 Nano 30B A3B, 2025-12-14
Pricing
Input: $0.50 per 1M tokens
Output: $2.50 per 1M tokens
Blended: $1.00 per 1M tokens
Cheaper than 35% of models. Median price is $0.53/1M tokens.
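The page does not say how the blended price is derived from the input and output prices. One weighting that reproduces the listed $1.00 is a 3:1 input-to-output token mix, which is a common convention on pricing aggregators but is an assumption here, not something this listing confirms. A minimal sketch under that assumption (the function name and weights are hypothetical):

```python
def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted average price per 1M tokens.

    The 3:1 default weighting is an assumption chosen because it
    reproduces the listed blended figure; the page does not state it.
    """
    total_weight = input_weight + output_weight
    return (input_price * input_weight + output_price * output_weight) / total_weight


# Listed prices: $0.50 input, $2.50 output per 1M tokens.
print(f"${blended_price(0.50, 2.50):.2f} per 1M tokens")
```

A workload with a different input/output mix will see a different effective price, so it may be worth passing your own weights rather than relying on the blended figure.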
Cost Calculator
Tokens per day: 1M (adjustable from 100K to 100M)
Daily: $1.00
Monthly: $30.00
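The calculator figures are consistent with multiplying daily token volume by the blended price and then by a 30-day month. Both the use of the blended price and the 30-day month are inferred from the numbers shown, not stated by the page. A rough sketch (the function name is hypothetical):

```python
def projected_cost(tokens_per_day: int, price_per_million: float,
                   days_per_month: int = 30) -> tuple[float, float]:
    """Return (daily, monthly) cost in dollars.

    Assumes a flat price per 1M tokens and a 30-day month; both are
    inferred from the listed figures rather than documented.
    """
    daily = tokens_per_day / 1_000_000 * price_per_million
    return daily, daily * days_per_month


# 1M tokens per day at the listed blended price of $1.00 per 1M tokens.
daily, monthly = projected_cost(1_000_000, 1.00)
print(f"Daily: ${daily:.2f}, Monthly: ${monthly:.2f}")
```

For a more realistic estimate, input and output tokens could be costed separately at $0.50 and $2.50 per 1M tokens instead of using the blended figure.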
vs. Similar Models
GLM-4.7 (Non-reasoning): $1.00 (0% difference)
GLM-4.5 (Reasoning): $1.00 (0% difference)
Perplexity: Sonar, $1.00 (0% difference)
AionLabs: Aion-2.0, $1.00 (0% difference)
Performance
Context Window: 1.0M tokens (larger than 81% of models)
Max Output: 16K tokens (2% of context)
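The "2% of context" figure appears to be the maximum output length as a share of the context window, rounded to a whole percent. The sketch below assumes the listed values mean 16,000 and 1,000,000 tokens; the exact token counts are not given on the page, and the function name is hypothetical:

```python
def output_share_percent(max_output_tokens: int, context_tokens: int) -> float:
    """Maximum output length as a percentage of the context window."""
    return max_output_tokens / context_tokens * 100


# Assumed values: 16K = 16,000 and 1.0M = 1,000,000 tokens.
share = output_share_percent(16_000, 1_000_000)
print(f"{share:.1f}% of context, about {round(share)}% when rounded")
```

In practice this means a single response can fill only a small fraction of the window, so long outputs would need to be produced across multiple requests.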
Context Window Comparison
Anthropic: Claude Opus 4.8, 1.0M (same)
Anthropic: Claude Opus 4.7, 1.0M (same)
Qwen: Qwen3.7 Max, 1.0M (same)