Related Models
NVIDIA: Nemotron 3 Ultra, 2026-06-04
NVIDIA: Nemotron 3 Ultra (free), 2026-06-04
NVIDIA: Nemotron 3.5 Content Safety (free), 2026-06-04
Nemotron 3 Nano Omni 30B A3B Reasoning, 2026-04-29
NVIDIA: Nemotron 3 Nano Omni (free), 2026-04-28
Nemotron Cascade 2 30B A3B, 2026-03-19
NVIDIA: Nemotron 3 Super, 2026-03-11
NVIDIA: Nemotron 3 Super (free), 2026-03-11
Pricing
Input: $0.37 per 1M tokens
Output: $1.08 per 1M tokens
Blended: $0.55 per 1M tokens
Cheaper than 50% of models. Median price is $0.56/1M tokens.
Cost Calculator
Tokens per day: 1M (range: 100K to 100M)
Daily: $0.55
Monthly: $16.41
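The calculator figures can be approximated with a short Python sketch. The 3:1 input-to-output weighting for the blended price and the 30-day month are assumptions, not stated on the page; `blended_price` and `cost` are hypothetical helper names. Because the page's monthly figure ($16.41) is slightly below 30 times the rounded daily figure, the page presumably works from an unrounded blended price, so this sketch may differ from it by a cent or two.

```python
# Prices per 1M tokens, copied from the Pricing section.
INPUT_PRICE = 0.37
OUTPUT_PRICE = 1.08


def blended_price(input_price: float, output_price: float,
                  input_parts: int = 3, output_parts: int = 1) -> float:
    """Weighted average price per 1M tokens.

    The default 3:1 input:output weighting is an assumption,
    not something the page states.
    """
    total_parts = input_parts + output_parts
    return (input_parts * input_price + output_parts * output_price) / total_parts


def cost(tokens_per_day: float, price_per_million: float, days: int = 1) -> float:
    """Dollar cost of a daily token volume sustained over `days` days."""
    return tokens_per_day / 1_000_000 * price_per_million * days


if __name__ == "__main__":
    blended = blended_price(INPUT_PRICE, OUTPUT_PRICE)
    print(f"blended: ${blended:.4f} per 1M tokens")
    print(f"daily:   ${cost(1_000_000, blended):.2f}")
    # 30 days per month is an assumption.
    print(f"monthly: ${cost(1_000_000, blended, days=30):.2f}")
```

Passing a different `tokens_per_day` covers the rest of the calculator's range, for example `cost(100_000_000, blended, days=30)` for the upper end.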
vs. Similar Models
GPT-5.1 (high): Q: 0.0, $3.44 (+529%)
GPT-5.4 (low): Q: +0.2, $5.63 (+928%)
Gemini 3 Pro Preview (high): Q: +0.7, $4.50 (+723%)
Grok 4.20 0309 (Reasoning): Q: +0.8, $3.00 (+448%)
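The percentage next to each comparison price appears to be that model's price premium over this model's blended price. The sketch below shows that reading; the formula is an assumption inferred from the numbers, and the function names are hypothetical. With the rounded $0.55 base the results come out a few points below the page's values, which suggests the page uses an unrounded base of roughly $0.547.

```python
def percent_more_expensive(other_price: float, base_price: float) -> float:
    """How much more `other_price` costs than `base_price`, in percent."""
    return (other_price / base_price - 1) * 100


def implied_base_price(other_price: float, percent_premium: float) -> float:
    """Base price that would produce the given premium (inverse of the above)."""
    return other_price / (1 + percent_premium / 100)


if __name__ == "__main__":
    # (model, blended price per 1M tokens, premium shown on the page)
    rows = [
        ("GPT-5.1 (high)", 3.44, 529),
        ("GPT-5.4 (low)", 5.63, 928),
        ("Gemini 3 Pro Preview (high)", 4.50, 723),
        ("Grok 4.20 0309 (Reasoning)", 3.00, 448),
    ]
    for name, price, shown in rows:
        computed = percent_more_expensive(price, 0.55)
        base = implied_base_price(price, shown)
        print(f"{name}: computed {computed:+.0f}% vs shown +{shown}%, "
              f"implied base ${base:.4f}")
```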
Performance
390 tokens/sec (faster than 98% of models)
0.50 seconds (faster than 85% of models)
6.33 seconds (faster than 42% of models)
Market Median: 89 tok/s (338% faster)
Median TTFT: 1.13s (55% faster)
Throughput/Dollar: 713 tok/s per $/1M
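The comparison figures above seem to follow from simple ratios, sketched here in Python. All three formulas are assumptions inferred from the numbers, and the function names are hypothetical. The sketch also assumes that the unlabeled 0.50 seconds figure is this model's time to first token (TTFT), since that is the value that reproduces "55% faster" against the 1.13s median. Throughput per dollar computed from the rounded $0.55 blended price lands slightly below the page's 713, presumably because the page uses an unrounded price.

```python
def percent_faster_throughput(model_tps: float, median_tps: float) -> float:
    """Throughput advantage over the median, in percent (higher is better)."""
    return (model_tps / median_tps - 1) * 100


def percent_lower_latency(model_seconds: float, median_seconds: float) -> float:
    """Latency reduction relative to the median, in percent (lower is better)."""
    return (1 - model_seconds / median_seconds) * 100


def throughput_per_dollar(model_tps: float, price_per_million: float) -> float:
    """Tokens/sec delivered per dollar of blended price per 1M tokens."""
    return model_tps / price_per_million


if __name__ == "__main__":
    print(f"vs. median throughput: {percent_faster_throughput(390, 89):.1f}% faster")
    # 0.50 s is assumed to be this model's TTFT.
    print(f"vs. median TTFT:       {percent_lower_latency(0.50, 1.13):.1f}% faster")
    print(f"throughput per dollar: {throughput_per_dollar(390, 0.55):.1f}")
```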
Speed Comparison
Step 3.7 Flash: 400 tok/s (+2%)
Qwen3.5 2B: 364 tok/s (-7%)
gpt-oss-120b (low): 363 tok/s (-7%)
Benchmarks
MMLU-Pro: Not evaluated
GPQA Diamond: 86.7%
HLE: 26.6%
LiveCodeBench: Not evaluated
SciCode: 39.9%
TerminalBench Hard: 36.4%
MATH-500: Not evaluated
AIME: Not evaluated
AIME 2025: Not evaluated
IFBench: 81.4%
Long Context Recall: 67.0%
Tau2: 83.3%