OpenAI charges $11.25/M tokens for GPT-5.5 regardless of whether you select the high or medium reasoning tier. The high tier scores 58.9 on the quality index; medium scores 56.7. That 2.2-point spread, at identical pricing, means every medium-tier call is leaving quality on the table for no cost savings. The interesting question isn't which tier to pick (obviously high, if price is the same). It's whether GPT-5.5's tiers justify their price at all when cheaper models cluster just below them.
The tier gap in context
Let's be precise about what 2.2 quality points means. GPT-5.5 (high) at 58.9 sits 1.3 points below the full GPT-5.5 default at 60.2. GPT-5.5 (medium) at 56.7 lands right next to GPT-5.4, which scores 56.8 at half the price ($5.63/M tokens). That's the crux: medium-tier GPT-5.5 delivers GPT-5.4-level quality at twice the cost.
The speed story reinforces this. High runs at 67 tok/s, medium at 62 tok/s. Neither is fast by current standards. Gemini 3.1 Pro Preview hits 132 tok/s at 57.2 quality and $4.50/M tokens. If your workload is latency-sensitive, the GPT-5.5 tiers are hard to justify on any axis.
The high tier makes sense exactly when you need the best quality OpenAI offers short of the default mode, and you're already committed to the GPT-5.5 pricing envelope. Think complex multi-step reasoning chains where you want strong performance but can tolerate slightly lower fidelity than default. The 1.3-point drop from default to high might be acceptable for batch evaluation pipelines running thousands of calls, though note that high is 3 tok/s slower than default, a gap that compounds at scale, if only modestly.
The medium tier, by contrast, has no obvious use case at $11.25/M. At 56.7 quality, it's statistically equivalent to GPT-5.4 (56.8), which costs $5.63/M and runs at 90 tok/s versus medium's 62 tok/s. If you're building an application that can tolerate quality in the 56-57 range, you should be calling GPT-5.4 and pocketing the $5.62/M difference. The gap compounds at scale: a workload processing 100M tokens per month saves about $6,700 annually by switching from GPT-5.5 medium to GPT-5.4, and at ten billion tokens a month the savings exceed $670,000 a year.
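The arithmetic behind that claim is easy to sanity-check. A quick sketch using the per-million-token prices quoted above:

```python
# Per-million-token prices quoted above.
GPT_55_MEDIUM_PRICE = 11.25  # $/M tokens
GPT_54_PRICE = 5.63          # $/M tokens

def annual_savings(monthly_tokens_millions: float) -> float:
    """Dollars saved per year by switching from GPT-5.5 medium to GPT-5.4."""
    per_month = monthly_tokens_millions * (GPT_55_MEDIUM_PRICE - GPT_54_PRICE)
    return per_month * 12

print(round(annual_savings(100), 2))     # 100M tokens/month -> 6744.0
print(round(annual_savings(10_000), 2))  # 10B tokens/month  -> 674400.0
```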
The budget alternative nobody's talking about
Xiaomi MiMo-V2.5-Pro (mimo-v2-5-pro) deserves attention here. At 53.8 quality and $1.50/M tokens, it delivers 95% of GPT-5.5 medium's quality at 13% of the cost. The speed is comparable: 65 tok/s versus 62 tok/s. For workloads where quality in the low-to-mid 50s is sufficient — content classification, summarization, first-pass extraction — MiMo-V2.5-Pro is 7.5x cheaper per token than any GPT-5.5 tier.
It's not open source, which limits deployment flexibility compared to Kimi K2.6 (53.9 quality, $1.72/M, 138 tok/s, open source). But MiMo-V2.5-Pro undercuts Kimi on price by $0.22/M while matching it on quality within a tenth of a point. The trade-off is speed: Kimi K2.6 runs at more than double the throughput at 138 tok/s. For latency-critical applications, Kimi wins. For cost-optimized batch processing, MiMo edges ahead.
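One way to make these trade-offs concrete is cost per quality point. A small sketch over the figures quoted in this piece (the stats are as reported above, not independently verified):

```python
# (quality index, $/M tokens, tok/s), as quoted in this piece.
MODELS = {
    "GPT-5.5 medium": (56.7, 11.25, 62),
    "GPT-5.4":        (56.8, 5.63, 90),
    "Kimi K2.6":      (53.9, 1.72, 138),
    "MiMo-V2.5-Pro":  (53.8, 1.50, 65),
}

def cost_per_quality_point(name: str) -> float:
    quality, price, _speed = MODELS[name]
    return price / quality

# Cheapest quality first.
for name in sorted(MODELS, key=cost_per_quality_point):
    print(f"{name:15s} ${cost_per_quality_point(name):.3f} per quality point")
```

On these numbers, MiMo-V2.5-Pro is the cheapest per quality point and GPT-5.5 medium the most expensive, consistent with the comparison above.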
OpenAI's pricing problem
The deeper issue is that OpenAI's tiered reasoning approach creates a confusing value proposition. Charging the same $11.25/M across default, high, and medium gives users no economic incentive to choose lower tiers. The only reason to select medium would be if it consumed fewer reasoning tokens internally, lowering the effective cost per query. But the per-token sticker price is identical, and nothing in the published API pricing rewards the downgrade. That makes medium a trap for anyone who doesn't carefully benchmark their specific workload against cheaper alternatives.
Compare this to how the market has stratified. Google's Gemini 3.1 Pro Preview offers 57.2 quality at $4.50/M with 132 tok/s throughput. That's better quality than GPT-5.5 medium, 2.5x cheaper, and more than twice as fast. The only models that justify the $10+ price range are the ones scoring above 57: GPT-5.5 default (60.2), GPT-5.5 high (58.9), and Claude Opus 4.7 (57.3 at $10.00/M). Below that threshold, the market offers dramatically better price-performance.
The recommendation
If you're already on GPT-5.5, always use the default or high tier. Medium is strictly dominated by GPT-5.4 on every metric that matters. If you're evaluating whether to use GPT-5.5 at all, the calculus depends on whether you need quality above 57. If yes, GPT-5.5 high at 58.9 is the second-best model available and worth the premium. If your quality threshold is in the 53-57 range, skip the entire GPT-5.5 family and look at GPT-5.4 ($5.63/M), Gemini 3.1 Pro Preview ($4.50/M), or for budget workloads, MiMo-V2.5-Pro ($1.50/M).
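The decision rule above amounts to a simple filter: set a quality floor and a budget ceiling, then take the cheapest model that clears both. A hypothetical sketch (pick_model and the candidate list are illustrative, built only from the figures quoted in this piece):

```python
# Quality index and $/M tokens, as quoted in this piece.
CANDIDATES = [
    {"name": "GPT-5.5 default",        "quality": 60.2, "price": 11.25},
    {"name": "GPT-5.5 high",           "quality": 58.9, "price": 11.25},
    {"name": "GPT-5.4",                "quality": 56.8, "price": 5.63},
    {"name": "Gemini 3.1 Pro Preview", "quality": 57.2, "price": 4.50},
    {"name": "MiMo-V2.5-Pro",          "quality": 53.8, "price": 1.50},
]

def pick_model(candidates, quality_floor, budget_ceiling):
    """Cheapest model meeting the quality floor within the per-M-token budget."""
    eligible = [m for m in candidates
                if m["quality"] >= quality_floor and m["price"] <= budget_ceiling]
    return min(eligible, key=lambda m: m["price"], default=None)

# Quality floor above 58: only the GPT-5.5 tiers qualify.
print(pick_model(CANDIDATES, 58, 12)["name"])    # GPT-5.5 default
# Budget workload with a floor in the low 50s:
print(pick_model(CANDIDATES, 53, 2.00)["name"])  # MiMo-V2.5-Pro
```

Any quality floor at or below 57.2 resolves to something cheaper than a GPT-5.5 tier, which is the point of the recommendation.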
Use the LLM Selector to filter by your quality floor and budget ceiling, or browse the full rankings on Explore. The right model isn't the one with the highest score; it's the one where you stop paying for quality you don't use.