Gemini 3.1 Pro holds at $4.50 while GPT-5.5 variants fragment the top tier
Weekly LLM briefing: GPT-5.5 leads quality but three effort tiers muddy the picture. Gemini 3.1 Pro stays the value play. Kimi K2.6 undercuts everyone.
The three-tier GPT-5.5 problem
GPT-5.5 (OpenAI) still tops quality at 60.2, but the gap between its default, high, and medium effort tiers is now wide enough to matter. GPT-5.5 (high) scores 58.9, GPT-5.5 (medium) drops to 56.7, and all three cost the same $11.25/M tokens. Flat pricing across tiers that differ by 3.5 quality points makes no sense for production workloads. If you're running medium effort, you're paying twice the price of GPT-5.4 ($5.63/M) for nearly identical quality (56.7 vs 56.8). Unless you need the default tier's full 60.2, GPT-5.4 is the rational choice in the OpenAI lineup.
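To make that comparison concrete, here is a minimal Python sketch of the arithmetic. The quality scores and prices are the ones quoted above; the dictionary layout and the helper names `price_ratio` and `quality_gap` are illustrative, not part of any benchmark tooling.

```python
# Quality score and price (USD per million tokens), as quoted in this briefing.
OPENAI_LINEUP = {
    "GPT-5.5": (60.2, 11.25),
    "GPT-5.5 (high)": (58.9, 11.25),
    "GPT-5.5 (medium)": (56.7, 11.25),
    "GPT-5.4": (56.8, 5.63),
}


def price_ratio(model: str, baseline: str) -> float:
    """How many times more expensive `model` is than `baseline`."""
    return OPENAI_LINEUP[model][1] / OPENAI_LINEUP[baseline][1]


def quality_gap(model: str, baseline: str) -> float:
    """Quality points `model` gains (positive) or loses (negative) vs `baseline`."""
    return OPENAI_LINEUP[model][0] - OPENAI_LINEUP[baseline][0]


if __name__ == "__main__":
    for tier in ("GPT-5.5", "GPT-5.5 (high)", "GPT-5.5 (medium)"):
        ratio = price_ratio(tier, "GPT-5.4")
        gap = quality_gap(tier, "GPT-5.4")
        print(f"{tier}: {ratio:.2f}x the price of GPT-5.4, {gap:+.1f} quality points")
```

Run against these figures, the medium tier comes out at roughly twice the price of GPT-5.4 with a slightly negative quality gap, which is the whole argument in two numbers.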
Where the value actually is
The mid-table tells the real story this week. Three models cluster between 53 and 54 quality at $1.42 to $1.56/M but with very different speed profiles. Gemini 3.1 Pro Preview is included for comparison:
| Model | Quality | Price/M tokens | Speed | Open source |
|---|---|---|---|---|
| Gemini 3.1 Pro Preview | 57.2 | $4.50 | 133 tok/s | No |
| Kimi K2.6 | 53.9 | $1.42 | 44 tok/s | Yes |
| Grok 4.3 | 53.2 | $1.56 | 116 tok/s | No |
| MiMo-V2.5-Pro | 53.8 | $1.50 | 57 tok/s | No |
Kimi K2.6 (MoonshotAI) at $1.42/M is the cheapest model in the top 15 and the only open-source option with quality above 53. For batch inference where latency doesn't dominate, it's hard to argue against. The 44 tok/s throughput is the trade-off: tight iteration loops will feel slow.
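A rough way to see what 44 tok/s means in an iteration loop is to convert throughput into wall-clock time per response. The sketch below assumes a single sequential stream, ignores time-to-first-token and network overhead, and uses a hypothetical 2,000-token response; only the tok/s figures come from the table above.

```python
def seconds_to_generate(tokens: int, tok_per_s: float) -> float:
    """Time to stream `tokens` output tokens at a steady `tok_per_s` rate."""
    if tok_per_s <= 0:
        raise ValueError("tok_per_s must be positive")
    return tokens / tok_per_s


if __name__ == "__main__":
    # Speeds as listed in the table; the 2,000-token response size is an assumption.
    speeds = {"Kimi K2.6": 44, "Grok 4.3": 116, "Gemini 3.1 Pro Preview": 133}
    for name, speed in speeds.items():
        print(f"{name}: {seconds_to_generate(2000, speed):.1f} s per 2,000-token response")
```

Under those assumptions Kimi K2.6 takes about 45 seconds per response against roughly 17 for Grok 4.3. That is tolerable in an overnight batch and painful in an interactive loop.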
Grok 4.3 (xAI) offers a different profile. At 116 tok/s and $1.56/M, it's the fastest sub-$2 model by a wide margin. If your pipeline is latency-sensitive and cost-constrained, Grok 4.3 is the pick over Kimi despite the 0.7-point quality gap.
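The Kimi-versus-Grok choice reduces to filtering on constraints and then taking the highest quality score that survives. Here is a minimal sketch using the four rows from the table above. The `Model` type, the `pick` helper, and the 100 tok/s latency threshold are illustrative assumptions, not a recommended cutoff.

```python
from typing import NamedTuple, Optional


class Model(NamedTuple):
    name: str
    quality: float
    price: float      # USD per million tokens
    speed: float      # output tokens per second
    open_source: bool


# Rows copied from the table in this briefing.
MODELS = [
    Model("Gemini 3.1 Pro Preview", 57.2, 4.50, 133, False),
    Model("Kimi K2.6", 53.9, 1.42, 44, True),
    Model("Grok 4.3", 53.2, 1.56, 116, False),
    Model("MiMo-V2.5-Pro", 53.8, 1.50, 57, False),
]


def pick(max_price: float, min_speed: float = 0.0) -> Optional[Model]:
    """Highest-quality model within the price cap and speed floor, or None."""
    candidates = [m for m in MODELS if m.price <= max_price and m.speed >= min_speed]
    return max(candidates, key=lambda m: m.quality, default=None)


if __name__ == "__main__":
    print(pick(max_price=2.0))                  # cost-constrained only
    print(pick(max_price=2.0, min_speed=100))   # cost- and latency-constrained
```

With only the $2 cap, the filter returns Kimi K2.6. Adding the hypothetical 100 tok/s floor flips the answer to Grok 4.3, since it is the only sub-$2 row that clears it.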
Gemini 3.1 Pro remains the awkward middle
Gemini 3.1 Pro Preview at 57.2 quality and 133 tok/s continues to occupy a strange position: it's 0.4 points behind Claude Opus 4.7 at less than half the price, and it's the fastest model in the entire top tier. For workloads where throughput matters more than squeezing out the last quality point, Gemini 3.1 Pro is the best option under $5/M tokens. I covered this in detail last week, but the math hasn't changed because nothing else has moved.