gpt-oss-20b is an open-weight 21B-parameter model released by OpenAI under the Apache 2.0 license. It uses a Mixture-of-Experts (MoE) architecture that activates 3.6B parameters per forward pass, optimized for low-latency inference and deployment on consumer or single-GPU hardware. The model is trained on OpenAI’s Harmony response format and supports configurable reasoning levels, fine-tuning, and agentic capabilities, including function calling, tool use, and structured outputs.
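To make the Harmony format and reasoning levels concrete, here is a minimal sketch of how a conversation is rendered into the prompt string the model was trained on. The special tokens and the "Reasoning: high" system line follow the published openai-harmony spec, but this is an illustrative approximation, not a substitute for the official `openai-harmony` renderer.

```python
# Sketch of Harmony-style prompt rendering for gpt-oss models.
# Token names (<|start|>, <|message|>, <|end|>) are taken from the
# openai-harmony spec; treat details here as an approximation.

def render_harmony(messages, reasoning="medium"):
    """Render (role, content) chat messages into a Harmony-style prompt."""
    system = (
        "You are ChatGPT, a large language model trained by OpenAI.\n"
        f"Reasoning: {reasoning}"  # reasoning level: low / medium / high
    )
    parts = [f"<|start|>system<|message|>{system}<|end|>"]
    for role, content in messages:
        parts.append(f"<|start|>{role}<|message|>{content}<|end|>")
    # Generation is primed by an open assistant turn.
    parts.append("<|start|>assistant")
    return "".join(parts)

prompt = render_harmony([("user", "What is 2 + 2?")], reasoning="high")
print(prompt)
```

Serving stacks such as vLLM or Ollama apply this rendering automatically; the sketch only shows where the reasoning-level knob lives (a single line in the system message).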
Benchmark rankings:
- Quality Index: 24.5 (145th of 444, top 33%)
- Coding Index: 18.5 (159th of 354, top 45%)
- Math Index: 89.3 (29th of 268, top 12%)
At a glance:
- Price (blended): $0.09 per 1M tokens (225th cheapest, 69% below median, top 33%)
- Speed: 272 tok/s (top 3%)
- Time to first token (TTFT): 0.48 s
- Context window: 131K tokens (145th largest, top 63%)
Pricing (per 1M tokens):
- Input: $0.06
- Output: $0.20
- Blended: $0.09

Cheaper than 67% of models; the median price is $0.30 per 1M tokens.
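The blended figure can be reproduced from the per-direction prices. A common convention, assumed here since the source does not state its weighting, is a 3:1 input-to-output token ratio:

```python
# Blended price per 1M tokens, assuming the common 3:1 input:output
# token weighting (the exact ratio used by the source is an assumption).
input_price = 0.06   # $ per 1M input tokens
output_price = 0.20  # $ per 1M output tokens

blended = (3 * input_price + 1 * output_price) / 4
print(f"${blended:.3f} per 1M tokens")  # prints "$0.095 per 1M tokens"
```

That lands at roughly $0.095, consistent with the listed $0.09 and with the $2.82/month figure below (about $0.094/day over 30 days).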
Cost at this blended rate:
- Daily: $0.09
- Monthly: $2.82
Speed:
- Throughput: 272 tokens/sec (faster than 97% of models)
- Time to first token: 0.48 seconds (faster than 46% of models)
- End-to-end response time: 7.84 seconds (faster than 24% of models)

Versus the market:
- Median throughput: 45 tok/s (this model is 499% faster)
- Median TTFT: 0.42 s (this model is 15% slower)
- Throughput per dollar: 2891 tok/s per $/1M
Context:
- Context window: 131K tokens (larger than 37% of models)
- Max output: 131K tokens (100% of context)
Hardware:
- Memory: 24-48 GB
- Example hardware: A6000 / M3 Ultra