Pricing
Input: $0.03 per 1M tokens
Output: $0.15 per 1M tokens
Blended: $0.06 per 1M tokens
Cheaper than 88% of models. Median price is $0.54/1M tokens.
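The page does not say how the blended figure is derived. A minimal sketch, assuming a 3:1 input-to-output token weighting (an assumption, chosen only because it reproduces the $0.06 shown from the listed input and output prices); `blended_price` is a hypothetical helper name:

```python
def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted average price per 1M tokens.

    The 3:1 default weighting is an assumption, not something the page states.
    """
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


# Listed prices: $0.03 input and $0.15 output, both per 1M tokens.
print(f"${blended_price(0.03, 0.15):.2f} per 1M tokens")
```

A workload with a different input/output mix can pass its own weights, which shifts the blended figure toward whichever side dominates.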
Cost Calculator
Tokens per day: 1M (range: 100K to 100M)
Daily: $0.06
Monthly: $1.80
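The calculator figures can be reproduced with simple arithmetic. A short sketch, assuming the blended $0.06 per 1M tokens is the rate applied and that a month is counted as 30 days (both assumptions, though they match the numbers shown); `token_cost` is a hypothetical name:

```python
def token_cost(tokens_per_day: float, price_per_million: float,
               days_per_month: int = 30) -> tuple[float, float]:
    """Return (daily, monthly) cost in dollars.

    The 30-day month is an assumption that matches the $1.80 shown on the page.
    """
    daily = tokens_per_day / 1_000_000 * price_per_million
    return daily, daily * days_per_month


# 1M tokens per day at the blended price of $0.06 per 1M tokens.
daily, monthly = token_cost(1_000_000, 0.06)
print(f"Daily: ${daily:.2f}, Monthly: ${monthly:.2f}")
```

Changing `tokens_per_day` to any value in the 100K to 100M range scales both figures linearly.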
vs. Similar Models
DeepSeek R1 (Q: 0.0): $1.15 (+1817%)
Gemini 2.5 Flash (Reasoning) (Q: 0.0): $0.85 (+1317%)
Gemma 4 26B A4B (Non-reasoning) (Q: 0.0): $0.20 (+230%)
Qwen3.5 9B (Non-reasoning) (Q: +0.2): $0.08 (+33%)
Performance
Output speed: 23 tokens/sec (faster than 0% of models)
Time to first token (TTFT): 0.52 seconds (faster than 86% of models)
87.96 seconds (faster than 1% of models)
Market Median: 94 tok/s (this model is 76% slower)
Median TTFT: 1.11s (this model is 53% faster)
Throughput/Dollar: 381 tok/s per $/1M
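The relative figures above follow from comparing each measurement with its market median. A rough sketch of that arithmetic, treating the 0.52-second figure as time to first token and dividing throughput by the blended price of $0.06 per 1M tokens (both are assumptions about how the page computes these values; the function names are hypothetical):

```python
def percent_slower(throughput: float, median_throughput: float) -> float:
    """How far throughput falls below the median, as a percentage of the median."""
    return (median_throughput - throughput) / median_throughput * 100


def percent_faster(latency: float, median_latency: float) -> float:
    """How far latency falls below the median, as a percentage of the median."""
    return (median_latency - latency) / median_latency * 100


def throughput_per_dollar(throughput: float, price_per_million: float) -> float:
    """Tokens/sec per dollar of blended price (assumed definition)."""
    return throughput / price_per_million


print(f"{percent_slower(23, 94):.0f}% slower than the median throughput")
print(f"{percent_faster(0.52, 1.11):.0f}% faster than the median TTFT")
print(f"{throughput_per_dollar(23, 0.06):.0f} tok/s per $/1M")
```

With the rounded inputs shown on the page, the last line gives about 383 rather than the listed 381; the gap is presumably due to the page using unrounded values.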
Speed Comparison
ERNIE 4.5 300B A47B: 24 tok/s (+3%)
Gemma 3 12B Instruct: 26 tok/s (+14%)
MoonshotAI: Kimi K2 0711: 27 tok/s (+17%)
Benchmarks
MMLU-Pro: Not evaluated
GPQA Diamond: 77.1%
HLE: 7.8%
LiveCodeBench: Not evaluated
SciCode: 16.1%
TerminalBench Hard: 18.2%
MATH-500: Not evaluated
AIME: Not evaluated
AIME 2025: Not evaluated
IFBench: 52.0%
Long Context Recall: 55.7%
Tau2: 92.1%
Open Source
License: apache-2.0
Parameters: 4B
Downloads: 8.8M
Likes: 695
VRAM (FP16): 8-16 GB
GPU: RTX 4070 / M2 Pro
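The lower end of the VRAM range is consistent with storing 4B parameters at 2 bytes each in FP16. A rough weights-only estimate, with `weights_vram_gb` as a hypothetical helper; reading the upper end of the range as headroom for the KV cache and activations is an assumption, not something the page states:

```python
def weights_vram_gb(params_billions: float, bytes_per_param: float = 2.0) -> float:
    """Memory needed for the weights alone, in GB (10^9 bytes).

    Excludes KV cache, activations, and framework overhead, so real usage is higher.
    FP16 uses 2 bytes per parameter.
    """
    return params_billions * bytes_per_param


# 4B parameters in FP16.
print(f"{weights_vram_gb(4):.1f} GB for weights only")
```

Passing a smaller `bytes_per_param` (for example 1.0 for 8-bit quantization) gives the corresponding weights-only figure for a quantized copy.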