Related Models
NVIDIA: Llama 3.3 Nemotron Super 49B V1.5, 2025-10-10
Llama Nemotron Super 49B v1.5 (Non-reasoning), 2025-07-25
Llama 3.1 Nemotron Nano VL 8B V1, 2025-06-03
Llama 3.1 Nemotron Nano 4B v1.1 (Reasoning), 2025-05-20
NVIDIA: Llama 3.1 Nemotron Ultra 253B v1, 2025-04-08
Llama 3.1 Nemotron Ultra 253B v1 (Reasoning), 2025-04-07
Llama 3.3 Nemotron Super 49B v1 (Reasoning), 2025-03-18
Llama 3.3 Nemotron Super 49B v1 (Non-reasoning), 2025-03-18
Pricing
Input: $0.10 per 1M tokens
Output: $0.40 per 1M tokens
Blended: $0.17 per 1M tokens
Cheaper than 72% of models. Median price is $0.54/1M tokens.
Cost Calculator
Tokens per day: 1M (adjustable from 100K to 100M)
Daily: $0.17
Monthly: $5.25
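The daily and monthly figures can be reproduced with a short calculation. The sketch below assumes the blended price is a 3:1 weighting of input to output tokens and that a month is 30 days. The page states neither, but both assumptions match the numbers shown. The function names are illustrative only.

```python
# Listed prices in dollars per 1M tokens (from the Pricing section).
INPUT_PRICE = 0.10
OUTPUT_PRICE = 0.40


def blended_price(input_price, output_price, input_weight=3, output_weight=1):
    """Weighted average price per 1M tokens.

    The 3:1 input:output weighting is an assumption, not stated on the page.
    """
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


def cost(tokens_per_day, price_per_million, days=1):
    """Cost in dollars for a given daily token volume over `days` days."""
    return tokens_per_day / 1_000_000 * price_per_million * days


blended = blended_price(INPUT_PRICE, OUTPUT_PRICE)
daily = cost(1_000_000, blended)
monthly = cost(1_000_000, blended, days=30)  # 30-day month is an assumption

print(f"blended ${blended:.3f}, daily ${daily:.3f}, monthly ${monthly:.2f}")
```

Under these assumptions the unrounded blended price is $0.175, which would explain why a displayed daily cost of $0.17 pairs with a monthly cost of $5.25 rather than $5.10.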
vs. Similar Models
Devstral Medium (Q: 0.0): $0.80, +357%
Mistral Small 4 (Non-reasoning) (Q: 0.0): $0.26, +50%
Gemma 4 E4B (Reasoning) (Q: +0.1): $0.54, +207%
Mistral: Mistral Medium 3 (Q: +0.1): $0.80, +357%
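The percentages in this comparison appear to be price premiums relative to this model's blended price. A minimal sketch follows, assuming the unrounded blended price is $0.175 (what a 3:1 input-to-output weighting of $0.10 and $0.40 would give; the page does not state this). `price_premium_pct` is a hypothetical helper name.

```python
def price_premium_pct(other_price, base_price):
    """Percentage by which other_price exceeds base_price."""
    return (other_price - base_price) / base_price * 100


BASE_BLENDED = 0.175  # assumed unrounded blended price, $ per 1M tokens

premium = price_premium_pct(0.80, BASE_BLENDED)
print(f"{premium:+.0f}%")  # +357%
```

This reproduces the +357% shown for the $0.80 models. The other rows differ by a point or two when computed from the displayed prices, presumably because those prices are themselves rounded.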
Performance
51 tokens/sec (faster than 17% of models)
0.27 seconds (faster than 98% of models)
39.43 seconds (faster than 11% of models)
Market Median: 94 tok/s (this model is 46% slower)
Median TTFT: 1.11s (this model is 76% faster)
Throughput/Dollar: 292 tok/s per $/1M
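The comparisons against the market median follow from a plain relative difference. The sketch below assumes that the 0.27-second figure is this model's time to first token (TTFT) and that throughput per dollar divides output speed by an unrounded blended price of $0.175 per 1M tokens. Neither is stated on the page.

```python
def pct_change(value, reference):
    """Relative difference of value against reference, in percent."""
    return (value - reference) / reference * 100


# Output speed: 51 tok/s against a market median of 94 tok/s.
speed_vs_median = pct_change(51, 94)       # about -45.7, i.e. 46% slower

# Latency: 0.27 s (assumed to be TTFT) against a median TTFT of 1.11 s.
# Lower latency is better, so a negative change means faster.
ttft_vs_median = pct_change(0.27, 1.11)    # about -75.7, i.e. 76% faster

# Throughput per dollar: tok/s divided by the assumed blended price.
throughput_per_dollar = 51 / 0.175         # about 291.4

print(f"speed {speed_vs_median:.1f}%, TTFT {ttft_vs_median:.1f}%, "
      f"throughput/dollar {throughput_per_dollar:.1f}")
```

The throughput-per-dollar result lands slightly under the listed 292, which is consistent with the page computing it from an unrounded speed just above 51 tok/s.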
Speed Comparison
Anthropic: Claude Opus 4.7, 51 tok/s (-0%)
Kimi K2.6 (Non-reasoning), 51 tok/s (+1%)
Claude 4.5 Sonnet (Reasoning), 52 tok/s (+1%)
Benchmarks
MMLU-Pro: 81.4%
GPQA Diamond: 74.8%
HLE: 6.8%
LiveCodeBench: 73.7%
SciCode: 34.8%
TerminalBench Hard: 5.3%
MATH-500: 98.3%
AIME: 86.0%
AIME 2025: 76.7%
IFBench: 37.0%
Long Context Recall: 34.0%
Tau2: 28.1%