GPT-5.5 takes the quality crown at $11.25, but the real action is at the bottom of the table
GPT-5.5 leads quality at 60.2 but costs nearly 8x as much as Kimi K2.6. This week's briefing breaks down who should care.
The big picture
GPT-5.5 (OpenAI) now sits at 60.2 quality index, the highest score on the board. It costs $11.25/M tokens and outputs at 69 tok/s. Whether that quality gap justifies the price depends entirely on your workload, because three models under $2/M tokens are clustering around 53-54 quality and closing fast.
GPT-5.5: quality leader, price outlier
OpenAI's new top model opens a 2.9-point quality gap over Claude Opus 4.7 (57.3) and a 3.0-point gap over Gemini 3.1 Pro Preview (57.2). That's meaningful. But the pricing tells a different story: $11.25/M tokens puts it at 2.5x Gemini 3.1 Pro's $4.50 and nearly 8x Kimi K2.6's $1.44.
The three GPT-5.5 variants are puzzling. The "high" effort mode scores lower (58.9) than the default (60.2), and "medium" drops to 56.7, all at the same $11.25 price. If you're paying the premium price, stick with the default configuration.
| Model | Quality | Price/M | Speed | Open source |
|---|---|---|---|---|
| GPT-5.5 | 60.2 | $11.25 | 69 tok/s | No |
| Claude Opus 4.7 | 57.3 | $10.00 | 61 tok/s | No |
| Gemini 3.1 Pro | 57.2 | $4.50 | 127 tok/s | No |
| Kimi K2.6 | 53.9 | $1.44 | 41 tok/s | Yes |
| Grok 4.3 | 53.2 | $1.56 | 91 tok/s | No |
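The tradeoff in the table is easier to see as quality points per dollar. A quick sketch using the figures above (the metric itself is just an illustration, not an industry-standard benchmark):

```python
# Quality index and $/M tokens, taken from the table above.
models = {
    "GPT-5.5":         (60.2, 11.25),
    "Claude Opus 4.7": (57.3, 10.00),
    "Gemini 3.1 Pro":  (57.2, 4.50),
    "Kimi K2.6":       (53.9, 1.44),
    "Grok 4.3":        (53.2, 1.56),
}

# Rank by quality points per dollar per million tokens.
for name, (quality, price) in sorted(
    models.items(), key=lambda kv: kv[1][0] / kv[1][1], reverse=True
):
    print(f"{name:17s} {quality / price:6.1f} quality pts per $/M")
```

On this crude metric Kimi K2.6 comes out roughly 7x ahead of GPT-5.5, which is the whole argument of the next section in one number.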
The sub-$2 tier is getting crowded and competitive
Kimi K2.6 (MoonshotAI) at 53.9 quality for $1.44/M tokens, Grok 4.3 (xAI) at 53.2 for $1.56, and MiMo-V2.5-Pro (Xiaomi) at 53.8 for $1.50 are all within one quality point of each other. For batch processing, RAG pipelines, or any workload where retries dominate cost, these models deliver roughly 90% of GPT-5.5's quality at 13-14% of its price.
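When retries dominate, the number that matters is expected cost per successful call, not list price. A minimal sketch using the prices above (the batch size, token count, and failure rates are made-up illustration values, not measured):

```python
def expected_cost(price_per_m: float, tokens_per_call: int, fail_rate: float) -> float:
    """Expected $ per successful call, assuming each attempt fails
    independently with probability fail_rate, so the expected number
    of attempts is 1 / (1 - fail_rate)."""
    attempts = 1 / (1 - fail_rate)
    return price_per_m * tokens_per_call / 1_000_000 * attempts

# Hypothetical batch job: 10M calls at 2,000 tokens each.
CALLS = 10_000_000
for name, price, fail_rate in [
    ("GPT-5.5", 11.25, 0.02),   # assumed: fewer retries at higher quality
    ("Kimi K2.6", 1.44, 0.08),  # assumed: 4x the retry rate, far lower unit price
]:
    total = expected_cost(price, 2_000, fail_rate) * CALLS
    print(f"{name}: ${total:,.0f} for the batch")
```

Even granting the cheaper model four times the assumed retry rate, the batch runs at roughly a seventh of the cost, which is why the sub-$2 tier wins these workloads.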