NVIDIA: Nemotron 3 Ultra

NVIDIA · Released 2026-06-04
Open Source · 1.0M ctx

About

NVIDIA Nemotron 3 Ultra is an open frontier-reasoning and orchestration model from NVIDIA, with 55B active parameters out of 550B total (MoE). Built on a hybrid Transformer-Mamba mixture-of-experts architecture, it...

Pricing

Input: $0.50 per 1M tokens

Output: $2.50 per 1M tokens

Blended: $1.00 per 1M tokens

Cheaper than 35% of models. Median price is $0.53/1M tokens.
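
The page does not say how the blended figure is derived. A minimal sketch, assuming a 3:1 input-to-output token weighting (an assumption, chosen because it reproduces the listed $1.00 from the listed input and output prices); the function name `blended_price` is hypothetical:

```python
def blended_price(input_price, output_price, input_weight=3.0, output_weight=1.0):
    """Weighted average price per 1M tokens.

    The 3:1 input:output weighting is an assumption, not something the
    listing states; it happens to match the listed blended price.
    """
    total_weight = input_weight + output_weight
    return (input_price * input_weight + output_price * output_weight) / total_weight


# Listed prices per 1M tokens: $0.50 input, $2.50 output.
print(f"${blended_price(0.50, 2.50):.2f} per 1M tokens")
```

A workload with a different input-to-output mix would see a different effective rate, so the weights are worth adjusting before budgeting.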

Cost Calculator

Tokens per day: 1M (range: 100K to 100M)

Daily: $1.00

Monthly: $30.00
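
The calculator figures above can be reproduced with simple arithmetic. A rough sketch, assuming the blended rate of $1.00 per 1M tokens is applied to all tokens and that a month is counted as 30 days (both inferred from the numbers shown, not stated by the page); `estimate_cost` is a hypothetical helper:

```python
def estimate_cost(tokens_per_day, price_per_million=1.00, days_per_month=30):
    """Return (daily_cost, monthly_cost) in dollars.

    Assumes a single blended price for all tokens and a 30-day month.
    """
    daily = tokens_per_day / 1_000_000 * price_per_million
    monthly = daily * days_per_month
    return daily, monthly


daily, monthly = estimate_cost(1_000_000)
print(f"Daily: ${daily:.2f}, Monthly: ${monthly:.2f}")
```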

vs. Similar Models

GLM-4.7 (Non-reasoning)
$1.00 (0% difference)
GLM-4.5 (Reasoning)
$1.00 (0% difference)
Perplexity: Sonar
$1.00 (0% difference)
AionLabs: Aion-2.0
$1.00 (0% difference)

Performance

Context Window: 1.0M tokens (larger than 81% of models)

Max Output: 16K tokens (2% of context)
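
The "2% of context" figure is the max output divided by the context window, rounded. A small sketch, assuming "1.0M" means 1,000,000 tokens and "16K" means 16,000 tokens (the page may instead use powers of two), and assuming output tokens count against the same window as the prompt; both helper names are hypothetical:

```python
CONTEXT_TOKENS = 1_000_000   # assumption: "1.0M" read as 1,000,000
MAX_OUTPUT_TOKENS = 16_000   # assumption: "16K" read as 16,000


def output_share_of_context(max_output_tokens, context_tokens):
    """Max output as a percentage of the context window."""
    return max_output_tokens / context_tokens * 100


def max_prompt_tokens(context_tokens, reserved_output_tokens):
    """Prompt budget left after reserving room for the full output,
    assuming prompt and output share one window."""
    return context_tokens - reserved_output_tokens


share = output_share_of_context(MAX_OUTPUT_TOKENS, CONTEXT_TOKENS)
print(f"{share:.1f}% of context")
print(max_prompt_tokens(CONTEXT_TOKENS, MAX_OUTPUT_TOKENS), "tokens left for the prompt")
```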

Context Window Comparison

Anthropic: Claude Opus 4.8
1.0M (Same)
Anthropic: Claude Opus 4.7
1.0M (Same)
Qwen: Qwen3.7 Max
1.0M (Same)
