Pricing
Input: $0.10 per 1M tokens
Output: $0.20 per 1M tokens
Blended: $0.13 per 1M tokens
Cheaper than 80% of models. Median price is $0.54/1M tokens.
Cost Calculator
Tokens per day: 1M (range: 100K to 100M)
Daily: $0.13
Monthly: $3.75
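The calculator figures above can be reproduced with a short sketch. It assumes the blended price is a 3:1 input-to-output weighted average and that a month is 30 days; neither is stated on the page, but together they give $0.125 per 1M tokens (displayed as $0.13) and $3.75 per month. The function names are illustrative.

```python
def blended_price(input_price, output_price, input_parts=3, output_parts=1):
    """Weighted average price per 1M tokens.

    The 3:1 input:output weighting is an assumption, not taken from the page.
    """
    total_parts = input_parts + output_parts
    return (input_price * input_parts + output_price * output_parts) / total_parts


def usage_cost(tokens_per_day, price_per_million, days=1):
    """Cost in dollars for a given daily token volume over `days` days."""
    return tokens_per_day / 1_000_000 * price_per_million * days


blended = blended_price(0.10, 0.20)                 # listed input and output prices
daily = usage_cost(1_000_000, blended)              # 1M tokens per day
monthly = usage_cost(1_000_000, blended, days=30)   # assumed 30-day month

print(f"blended ${blended:.3f}/1M, daily ${daily:.3f}, monthly ${monthly:.2f}")
```

Changing `tokens_per_day` to any value in the calculator's range scales both costs linearly.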
vs. Similar Models
Jamba 1.5 Mini | Q: -0.1 | $0.25 | +100%
Jamba 1.6 Mini | Q: -0.2 | $0.25 | +100%
Qwen3 1.7B (Reasoning) | Q: -0.2 | $0.40 | +218%
Microsoft: Phi 4 Mini Instruct | Q: +0.2 | $0.15 | +18%
Performance
76 tokens/sec (faster than 38% of models)
0.42 seconds (faster than 94% of models)
0.42 seconds (faster than 96% of models)
Market Median: 94 tok/s (this model is 20% slower)
Median TTFT: 1.11s (this model is 62% faster)
Throughput/Dollar: 607 tok/s per $/1M
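As a rough check on the derived figures above, the sketch below recomputes them from the rounded values shown on the page. It assumes that Throughput/Dollar is tokens per second divided by the blended price, and that the unrounded blended price is $0.125 (displayed as $0.13). Because the inputs are rounded, the results land near the displayed numbers rather than matching them exactly. The function names are illustrative.

```python
def relative_diff_pct(value, reference):
    """Signed percentage difference of `value` relative to `reference`."""
    return (value - reference) / reference * 100


def throughput_per_dollar(tokens_per_sec, price_per_million):
    """Tokens/sec per dollar-per-1M-tokens (assumed definition of the metric)."""
    return tokens_per_sec / price_per_million


speed_vs_median = relative_diff_pct(76, 94)     # negative: slower than the median
ttft_vs_median = relative_diff_pct(0.42, 1.11)  # negative: lower (faster) TTFT
tpd = throughput_per_dollar(76, 0.125)          # 0.125 is an assumed unrounded price

print(f"speed vs median: {speed_vs_median:.1f}%")
print(f"TTFT vs median: {ttft_vs_median:.1f}%")
print(f"throughput per dollar: {tpd:.0f}")
```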
Speed Comparison
Qwen3 235B A22B 2507 (Reasoning) | 75 tok/s | -2%
Anthropic: Claude Fable 5 | 77 tok/s | +2%
Z.ai: GLM 4.5 Air | 77 tok/s | +2%
Context Window
66K tokens (larger than 13% of models)
Max Output: 66K tokens (100% of context)
Benchmarks
MMLU-Pro: 52.2%
GPQA Diamond: 40.0%
HLE: 5.8%
LiveCodeBench: 26.6%
SciCode: 10.3%
TerminalBench Hard: 0.0%
MATH-500: Not evaluated
AIME: Not evaluated
AIME 2025: 41.3%
IFBench: 32.8%
Long Context Recall: 0.0%
Tau2: 12.6%
Open Source
License: apache-2.0
Parameters: 7B
Formats: GGUF / GPTQ / AWQ
Downloads: 753.2K
Likes: 128
VRAM (FP16): 8-16 GB
GPU: RTX 4070 / M2 Pro
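The FP16 VRAM range can be sanity-checked with a back-of-the-envelope estimate. The sketch below assumes "7B" is the parameter count and counts only the model weights; KV cache, activations, and runtime overhead are ignored, so real usage will be higher. The bytes-per-parameter values for 8-bit and 4-bit weights are generic assumptions, not figures from the page.

```python
def weight_memory_gb(params_billions, bytes_per_param):
    """Approximate memory for model weights alone, in decimal GB."""
    return params_billions * bytes_per_param


fp16_gb = weight_memory_gb(7, 2)    # FP16: 2 bytes per parameter
int8_gb = weight_memory_gb(7, 1)    # 8-bit quantization (assumed 1 byte per parameter)
int4_gb = weight_memory_gb(7, 0.5)  # 4-bit quantization (assumed 0.5 bytes per parameter)

print(f"FP16: {fp16_gb} GB, 8-bit: {int8_gb} GB, 4-bit: {int4_gb} GB")
```

Under these assumptions the FP16 weights come to about 14 GB, which falls inside the listed 8-16 GB range.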