Related Models
NVIDIA: Llama 3.3 Nemotron Super 49B V1.5, 2025-10-10
Llama Nemotron Super 49B v1.5 (Reasoning), 2025-07-25
Llama 3.1 Nemotron Nano VL 8B V1, 2025-06-03
Llama 3.1 Nemotron Nano 4B v1.1 (Reasoning), 2025-05-20
NVIDIA: Llama 3.1 Nemotron Ultra 253B v1, 2025-04-08
Llama 3.1 Nemotron Ultra 253B v1 (Reasoning), 2025-04-07
Llama 3.3 Nemotron Super 49B v1 (Reasoning), 2025-03-18
Llama 3.3 Nemotron Super 49B v1 (Non-reasoning), 2025-03-18
Pricing
Input: $0.10 per 1M tokens
Output: $0.40 per 1M tokens
Blended: $0.17 per 1M tokens
Cheaper than 72% of models. Median price is $0.54/1M tokens.
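The page does not say how the blended price is derived. As a minimal sketch, assuming a 3:1 input-to-output token weighting (an assumption, not something stated here), the input and output prices above combine to $0.175 per 1M tokens, which is consistent with the displayed $0.17. The function name and the weights are hypothetical.

```python
def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted average price per 1M tokens.

    The 3:1 default weighting is an assumption for illustration;
    the page does not state how its blended figure is computed.
    """
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


# Prices per 1M tokens, taken from the pricing figures above.
blended = blended_price(0.10, 0.40)
print(f"Blended: ${blended:.3f} per 1M tokens")
```

A different weighting (for example 1:1) would give a different blended figure, so the ratio should be checked against the provider's own definition before relying on it.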
Cost Calculator
Tokens per day: 1M (range 100K to 100M)
Daily: $0.17
Monthly: $5.25
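The calculator figures can be reproduced with a short sketch. It assumes an unrounded blended price of $0.175 per 1M tokens and a 30-day month; both are assumptions chosen because together they reproduce the $5.25 monthly figure, not values stated on the page. The function name is hypothetical.

```python
def usage_cost(tokens_per_day: float, price_per_million: float,
               days_per_month: int = 30) -> tuple[float, float]:
    """Return (daily, monthly) cost in dollars for a given daily token volume."""
    daily = tokens_per_day / 1_000_000 * price_per_million
    return daily, daily * days_per_month


# 1M tokens per day at an assumed unrounded blended price of $0.175 per 1M tokens.
daily, monthly = usage_cost(1_000_000, 0.175)
print(f"Daily: ${daily:.3f}, Monthly: ${monthly:.2f}")
```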
vs. Similar Models
Llama 3.3 Instruct 70B, Q: -0.1, $0.61 (+250%)
Mistral Small 3.1, Q: -0.1, $0.14 (-21%)
OpenAI: GPT-4o (2024-05-13), Q: -0.1, $7.50 (+4186%)
Qwen3 32B (Non-reasoning), Q: -0.1, $0.26 (+49%)
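The percentages in this comparison look like price differences relative to this model's blended price. A minimal sketch, assuming a baseline of $0.175 per 1M tokens (an assumed unrounded value of the displayed $0.17), reproduces the GPT-4o and Qwen3 32B figures. The other two rows come out one point off when computed from the displayed prices, presumably because those prices are themselves rounded. The function name is hypothetical.

```python
def percent_price_difference(other_price: float, base_price: float) -> float:
    """Signed percent difference of other_price relative to base_price."""
    return (other_price / base_price - 1.0) * 100.0


BASE = 0.175  # assumed unrounded blended price of this model, per 1M tokens

# Displayed blended prices of the comparison models, per 1M tokens.
for name, price in [("OpenAI: GPT-4o (2024-05-13)", 7.50),
                    ("Qwen3 32B (Non-reasoning)", 0.26)]:
    diff = percent_price_difference(price, BASE)
    print(f"{name}: {diff:+.0f}%")
```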
Performance
52 tokens/sec (faster than 21% of models)
0.26 seconds (faster than 98% of models)
0.26 seconds (faster than 100% of models)
Market Median: 94 tok/s (44% slower)
Median TTFT: 1.10s (77% faster)
Throughput/Dollar: 298 tok/s per $/1M
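The derived figures in this section can be approximated from the displayed values. The sketch below assumes throughput per dollar is tokens per second divided by the blended price, taken as an assumed unrounded $0.175 per 1M tokens, and that the comparisons are percent differences against the market medians. With the rounded inputs shown here it yields about 297 tok/s per $/1M, about 45% slower, and about 76% faster, close to but not exactly the page's 298, 44%, and 77%, which presumably come from unrounded measurements. Function names are hypothetical.

```python
def throughput_per_dollar(tokens_per_sec: float, price_per_million: float) -> float:
    """Tokens per second delivered per dollar of blended price (per 1M tokens)."""
    return tokens_per_sec / price_per_million


def percent_vs_median(value: float, median: float) -> float:
    """Signed percent difference of value relative to the median."""
    return (value / median - 1.0) * 100.0


tpd = throughput_per_dollar(52, 0.175)     # 0.175 is an assumed unrounded price
speed_gap = percent_vs_median(52, 94)      # negative: lower throughput than median
ttft_gap = percent_vs_median(0.26, 1.10)   # negative: shorter time to first token

print(f"Throughput/Dollar: {tpd:.1f} tok/s per $/1M")
print(f"Throughput vs. median: {speed_gap:+.1f}%")
print(f"TTFT vs. median: {ttft_gap:+.1f}%")
```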
Speed Comparison
Llama Nemotron Super 49B v1.5 (Reasoning), 52 tok/s (-0%)
Qwen: Qwen3.5 397B A17B, 52 tok/s (-0%)
MiniMax: MiniMax M2.7, 52 tok/s (+0%)
Benchmarks
MMLU-Pro: 69.2%
GPQA Diamond: 48.1%
HLE: 4.3%
LiveCodeBench: 29.0%
SciCode: 23.8%
TerminalBench Hard: 3.8%
MATH-500: 77.0%
AIME: 13.7%
AIME 2025: 8.0%
IFBench: 32.9%
Long Context Recall: 22.0%
Tau2: 25.1%