
Nemotron 3 Ultra 550B A55B (Reasoning)

NVIDIA · Released 2026-06-04
Open Source

Pricing

Input: $0.37 per 1M tokens
Output: $1.08 per 1M tokens
Blended: $0.55 per 1M tokens
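
The page does not say how the blended price is weighted. A minimal sketch, assuming a 3:1 input-to-output token ratio (an assumption, not something the page states), gives about $0.5475, which is consistent with the listed $0.55. The function name and weights are illustrative only.

```python
def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted average price per 1M tokens.

    The 3:1 default weighting is an assumption; the source page does not
    document the ratio it uses.
    """
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


# Listed prices per 1M tokens: input $0.37, output $1.08.
blended = blended_price(0.37, 1.08)
print(f"Blended: ${blended:.4f} per 1M tokens")
```

A different traffic mix changes the result: pass other weights to model a workload that is more output-heavy.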

Cheaper than 50% of models. Median price is $0.56/1M tokens.

Cost Calculator

Tokens per day: 1M (adjustable from 100K to 100M)

Daily: $0.55

Monthly: $16.41
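
The calculator's arithmetic can be sketched as follows, assuming cost scales linearly with tokens at the blended price and that a month is 30 days (both assumptions). With the rounded $0.55 this gives $16.50 per month rather than the listed $16.41, so the page presumably works from an unrounded blended price.

```python
DAYS_PER_MONTH = 30  # assumption: the page does not state its month length


def daily_cost(tokens_per_day: float, price_per_million: float) -> float:
    """Cost in dollars for one day of usage at a per-1M-token price."""
    return tokens_per_day / 1_000_000 * price_per_million


def monthly_cost(tokens_per_day: float, price_per_million: float,
                 days: int = DAYS_PER_MONTH) -> float:
    """Cost in dollars for `days` days of constant daily usage."""
    return daily_cost(tokens_per_day, price_per_million) * days


# Walk the calculator's range: 100K, 1M, and 100M tokens per day.
for tokens in (100_000, 1_000_000, 100_000_000):
    print(f"{tokens:>11,} tokens/day: "
          f"${daily_cost(tokens, 0.55):,.2f} daily, "
          f"${monthly_cost(tokens, 0.55):,.2f} monthly")
```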

vs. Similar Models

GPT-5.1 (high): $3.44 (+529%), Q: 0.0
GPT-5.4 (low): $5.63 (+928%), Q: +0.2
Gemini 3 Pro Preview (high): $4.50 (+723%), Q: +0.7
Grok 4.20 0309 (Reasoning): $3.00 (+448%), Q: +0.8
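
The percentages read as each model's price premium over this model's blended price. A minimal sketch of that calculation, assuming the premium is (other / base - 1) × 100, is below. Using the rounded $0.55 as the base yields values a few points lower than those listed (for example about 525% rather than 529% for GPT-5.1), which suggests the page uses an unrounded base price.

```python
def price_premium_pct(base_price: float, other_price: float) -> float:
    """Percent by which other_price exceeds base_price."""
    if base_price <= 0:
        raise ValueError("base_price must be positive")
    return (other_price / base_price - 1.0) * 100.0


BASE = 0.55  # rounded blended price from the page, per 1M tokens

# Prices per 1M tokens as listed on the page.
similar_models = {
    "GPT-5.1 (high)": 3.44,
    "GPT-5.4 (low)": 5.63,
    "Gemini 3 Pro Preview (high)": 4.50,
    "Grok 4.20 0309 (Reasoning)": 3.00,
}

for name, price in similar_models.items():
    print(f"{name}: ${price:.2f} ({price_premium_pct(BASE, price):+.0f}%)")
```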

Performance

390 tokens/sec (faster than 98% of models)

Time to first token (TTFT): 0.50 seconds (faster than 85% of models)

6.33 seconds (faster than 42% of models)

Market median speed: 89 tok/s (this model is 338% faster)

Median TTFT: 1.13s (this model is 55% faster)

Throughput/Dollar: 713 tok/s per $/1M
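
These derived figures can be approximated from the listed numbers. The sketch below assumes "faster" means relative throughput gain for speed, relative time saved for TTFT, and that throughput per dollar is output speed divided by the blended price. The function names are illustrative. Because the inputs shown on the page are rounded, results can differ from the listed values by a point or so (for example about 709 rather than 713 for throughput per dollar).

```python
def pct_faster_throughput(model_tps: float, median_tps: float) -> float:
    """Percent by which model throughput exceeds the median."""
    return (model_tps / median_tps - 1.0) * 100.0


def pct_faster_latency(model_seconds: float, median_seconds: float) -> float:
    """Percent of the median latency that the model saves."""
    return (median_seconds - model_seconds) / median_seconds * 100.0


def throughput_per_dollar(model_tps: float, price_per_million: float) -> float:
    """Tokens/sec per dollar of per-1M-token price."""
    return model_tps / price_per_million


# Values as listed on the page (rounded).
speed, median_speed = 390.0, 89.0
ttft, median_ttft = 0.50, 1.13
blended = 0.55

print(f"Speed vs. median: {pct_faster_throughput(speed, median_speed):.1f}% faster")
print(f"TTFT vs. median: {pct_faster_latency(ttft, median_ttft):.1f}% faster")
print(f"Throughput/Dollar: {throughput_per_dollar(speed, blended):.0f} tok/s per $/1M")
```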

Speed Comparison

Step 3.7 Flash: 400 tok/s (+2%)
Qwen3.5 2B: 364 tok/s (-7%)
gpt-oss-120b (low): 363 tok/s (-7%)

Benchmarks

MMLU-Pro: Not evaluated
GPQA Diamond: 86.7%
HLE: 26.6%
LiveCodeBench: Not evaluated
SciCode: 39.9%
TerminalBench Hard: 36.4%
MATH-500: Not evaluated
AIME: Not evaluated
AIME 2025: Not evaluated
IFBench: 81.4%
Long Context Recall: 67.0%
Tau2: 83.3%