
GPT-5.4 sits in no-man's land: too expensive to be cheap, too weak to justify the premium

OpenAI's GPT-5.4 scores 56.8 quality at $5.63/M tokens. Gemini 3.1 Pro Preview edges it out at 57.2 for 20% less. A pricing analysis.

FindLLM, May 7, 2026
Tags: gpt-5-4, openai, pricing-analysis, model-comparison

GPT-5.4 (OpenAI) occupies an awkward position in the current model landscape. At 56.8 quality and $5.63/M tokens, it scores 0.4 points below Gemini 3.1 Pro Preview while costing 25% more, and it sits 3.4 points below GPT-5.5 at roughly half the price. For most production workloads, GPT-5.4 is the model you skip.

The pricing squeeze

The problem is straightforward. Gemini 3.1 Pro Preview (Google) scores 57.2 quality at $4.50/M tokens. GPT-5.4 scores 56.8 at $5.63/M tokens. That's 0.4 points less quality for $1.13 more per million tokens. In any scenario where you're processing volume — batch summarization, RAG pipelines, document classification — that $1.13 compounds into real budget pressure with no quality upside.
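To make "compounds into real budget pressure" concrete, here is a minimal sketch that scales the $1.13/M gap to a monthly token volume. The prices come from the figures above; the monthly volumes are hypothetical, so plug in your own.

```python
# Prices per million tokens, taken from the figures quoted in this post.
PRICE_PER_M = {"GPT-5.4": 5.63, "Gemini 3.1 Pro Preview": 4.50}


def monthly_cost(model: str, tokens_per_month: float) -> float:
    """Return the monthly spend in dollars for a given token volume."""
    return PRICE_PER_M[model] * tokens_per_month / 1_000_000


def monthly_premium(tokens_per_month: float) -> float:
    """Extra dollars per month paid for GPT-5.4 over Gemini 3.1 Pro Preview."""
    return (monthly_cost("GPT-5.4", tokens_per_month)
            - monthly_cost("Gemini 3.1 Pro Preview", tokens_per_month))


if __name__ == "__main__":
    # Hypothetical monthly volumes: 100M, 1B, and 10B tokens.
    for volume in (100e6, 1e9, 10e9):
        extra = monthly_premium(volume)
        print(f"{volume / 1e6:>8,.0f}M tokens/month: ${extra:,.2f} extra")
```

At one billion tokens a month the premium works out to about $1,130, and it scales linearly from there.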

The speed differential makes it worse. Gemini 3.1 Pro runs at 142 tok/s versus GPT-5.4's 85 tok/s. That's 67% faster inference. For latency-sensitive applications like interactive agents or streaming UIs, Gemini generates a response of the same length in roughly 60% of the wall-clock time (85/142 ≈ 0.60).
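The two percentages are the same ratio read in opposite directions. A small sketch of the arithmetic, using the throughput figures above and assuming generation time is simply output tokens divided by tok/s (this ignores time to first token and network overhead; the 500-token response length is a hypothetical example):

```python
# Output throughput in tokens per second, from the figures quoted in this post.
SPEED_TOK_PER_S = {"GPT-5.4": 85, "Gemini 3.1 Pro Preview": 142}


def generation_seconds(model: str, output_tokens: int) -> float:
    """Approximate seconds to stream a response, ignoring time to first token."""
    return output_tokens / SPEED_TOK_PER_S[model]


# How much faster Gemini generates tokens (142 / 85 - 1, about 0.67).
speedup = SPEED_TOK_PER_S["Gemini 3.1 Pro Preview"] / SPEED_TOK_PER_S["GPT-5.4"] - 1

# Fraction of GPT-5.4's generation time that Gemini needs (85 / 142, about 0.60).
time_ratio = SPEED_TOK_PER_S["GPT-5.4"] / SPEED_TOK_PER_S["Gemini 3.1 Pro Preview"]

if __name__ == "__main__":
    tokens = 500  # hypothetical response length
    for model in SPEED_TOK_PER_S:
        print(f"{model}: {generation_seconds(model, tokens):.2f} s for {tokens} tokens")
    print(f"speedup: {speedup:.0%}, time ratio: {time_ratio:.0%}")
```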

| Model | Quality | Price ($/M tokens) | Speed (tok/s) | Creator |
|---|---|---|---|---|
| GPT-5.5 | 60.2 | $11.25 | 79 | OpenAI |
| Claude Opus 4.7 | 57.3 | $10.00 | 64 | Anthropic |
| Gemini 3.1 Pro Preview | 57.2 | $4.50 | 142 | Google |
| GPT-5.4 | 56.8 | $5.63 | 85 | OpenAI |
| GPT-5.5 (medium) | 56.7 | $11.25 | 73 | OpenAI |

Where GPT-5.4 might still make sense

I can construct one narrow case. If you're locked into OpenAI's API ecosystem — fine-tuned models, existing prompt libraries, specific function-calling behavior — and GPT-5.5's $11.25/M is too expensive for your throughput, then GPT-5.4 is your best option within that vendor. It's the cheapest OpenAI model above 55 quality.

That's not nothing. Vendor switching costs are real. Prompt engineering that exploits OpenAI-specific behaviors (tool use formatting, system message handling) doesn't port cleanly. If your team has invested months tuning prompts for OpenAI's instruction-following style, paying the $1.13 premium over Gemini might be cheaper than rewriting and revalidating.
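One way to pressure-test the lock-in argument is a break-even estimate: how many months of paying the $1.13/M premium it takes to equal a one-time migration bill. This is a rough sketch; the migration cost and monthly volume below are hypothetical placeholders, and it assumes prices and volume stay flat.

```python
# Price gap per million tokens between GPT-5.4 and Gemini 3.1 Pro Preview,
# from the figures quoted in this post ($5.63 - $4.50).
PREMIUM_PER_M = 5.63 - 4.50


def breakeven_months(migration_cost: float, tokens_per_month: float) -> float:
    """Months until the savings from switching cover a one-time migration cost."""
    if tokens_per_month <= 0:
        raise ValueError("tokens_per_month must be positive")
    monthly_saving = PREMIUM_PER_M * tokens_per_month / 1_000_000
    return migration_cost / monthly_saving


if __name__ == "__main__":
    # Hypothetical inputs: $40,000 to rewrite and revalidate prompts,
    # 2 billion tokens processed per month.
    months = breakeven_months(migration_cost=40_000, tokens_per_month=2e9)
    print(f"Break-even after about {months:.1f} months")
```

With those placeholder inputs the switch takes about a year and a half to pay for itself, which is the kind of result that makes staying put defensible. At ten times the volume, the same migration pays off in under two months.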

But this is a lock-in argument, not a quality argument.

[Chart: Quality comparison]

The gap to GPT-5.5 is real but expensive to close

Moving from GPT-5.4 to GPT-5.5 buys 3.4 quality points at a cost of $5.62 additional per million tokens, which almost exactly doubles your spend (from $5.63 to $11.25). Whether that's worth it depends entirely on your error tolerance. In classification tasks where the quality index correlates with accuracy, 3.4 points might mean 2-3% fewer misclassifications. For a pipeline processing millions of documents with downstream human review, that reduction in error rate could justify the cost through reduced rework.
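The upgrade can also be expressed as a price per quality point. A minimal sketch using the figures above; treating the quality index as linear (so that "dollars per point" is meaningful) is my simplifying assumption, not something the index guarantees.

```python
from typing import NamedTuple


class Model(NamedTuple):
    name: str
    quality: float      # quality index score
    price_per_m: float  # dollars per million tokens


# Figures quoted in this post.
GPT_5_4 = Model("GPT-5.4", 56.8, 5.63)
GPT_5_5 = Model("GPT-5.5", 60.2, 11.25)


def marginal_cost_per_point(cheaper: Model, pricier: Model) -> float:
    """Extra dollars per million tokens paid for each additional quality point."""
    return (pricier.price_per_m - cheaper.price_per_m) / (pricier.quality - cheaper.quality)


def price_ratio(cheaper: Model, pricier: Model) -> float:
    """How many times more the pricier model costs per token."""
    return pricier.price_per_m / cheaper.price_per_m


if __name__ == "__main__":
    per_point = marginal_cost_per_point(GPT_5_4, GPT_5_5)
    ratio = price_ratio(GPT_5_4, GPT_5_5)
    print(f"${per_point:.2f}/M tokens per quality point, {ratio:.2f}x the spend")
```

That comes to roughly $1.65 per million tokens for each additional point, at just under twice the spend.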

For creative generation or open-ended tasks where quality differences are harder to measure in production, the jump is harder to justify.

[Chart: Price comparison]

The operational verdict

GPT-5.4 is a tweener. It doesn't win on price (Gemini 3.1 Pro is cheaper and faster with marginally higher quality), doesn't win on quality (GPT-5.5 is clearly ahead), and doesn't win on speed (142 tok/s versus 85 tok/s isn't close). Its only defensible position is as the budget option for teams already committed to OpenAI's stack.
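"Doesn't win on anything" can be checked mechanically as Pareto dominance: model A dominates model B if A is at least as good on quality, price, and speed, and strictly better on at least one. A small sketch over the five models in the table; the choice of those three axes as the only criteria is my framing for illustration.

```python
from typing import NamedTuple


class Model(NamedTuple):
    name: str
    quality: float  # quality index score (higher is better)
    price: float    # dollars per million tokens (lower is better)
    speed: float    # tokens per second (higher is better)


# Figures from the comparison table in this post.
MODELS = [
    Model("GPT-5.5", 60.2, 11.25, 79),
    Model("Claude Opus 4.7", 57.3, 10.00, 64),
    Model("Gemini 3.1 Pro Preview", 57.2, 4.50, 142),
    Model("GPT-5.4", 56.8, 5.63, 85),
    Model("GPT-5.5 (medium)", 56.7, 11.25, 73),
]


def dominates(a: Model, b: Model) -> bool:
    """True if a is no worse than b on every axis and strictly better on one."""
    no_worse = a.quality >= b.quality and a.price <= b.price and a.speed >= b.speed
    better = a.quality > b.quality or a.price < b.price or a.speed > b.speed
    return no_worse and better


def dominated_models(models: list[Model]) -> dict[str, list[str]]:
    """Map each dominated model to the names of the models that dominate it."""
    result = {}
    for b in models:
        winners = [a.name for a in models if dominates(a, b)]
        if winners:
            result[b.name] = winners
    return result


if __name__ == "__main__":
    for name, winners in dominated_models(MODELS).items():
        print(f"{name} is dominated by: {', '.join(winners)}")
```

On these numbers GPT-5.4 is dominated by Gemini 3.1 Pro Preview alone, while GPT-5.5, Claude Opus 4.7, and Gemini each remain on the frontier.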

If you're choosing fresh, Gemini 3.1 Pro Preview dominates this price tier. If you need peak quality regardless of cost, GPT-5.5 is the answer. GPT-5.4 exists for the teams in between who can't switch vendors and can't afford to double their inference bill.

Use the LLM Selector to filter by your actual constraints — vendor, budget ceiling, minimum quality threshold — and you'll likely land somewhere other than GPT-5.4 unless OpenAI lock-in is a hard requirement.
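The filtering itself is simple enough to sketch. The snippet below is an illustration of constraint-based selection over the table in this post, not the LLM Selector's actual logic; the `select` function and its parameter names are hypothetical.

```python
from typing import NamedTuple, Optional


class Model(NamedTuple):
    name: str
    quality: float  # quality index score
    price: float    # dollars per million tokens
    creator: str


# Figures from the comparison table in this post.
MODELS = [
    Model("GPT-5.5", 60.2, 11.25, "OpenAI"),
    Model("Claude Opus 4.7", 57.3, 10.00, "Anthropic"),
    Model("Gemini 3.1 Pro Preview", 57.2, 4.50, "Google"),
    Model("GPT-5.4", 56.8, 5.63, "OpenAI"),
    Model("GPT-5.5 (medium)", 56.7, 11.25, "OpenAI"),
]


def select(
    models: list[Model],
    vendor: Optional[str] = None,
    max_price: Optional[float] = None,
    min_quality: Optional[float] = None,
) -> list[Model]:
    """Return models meeting every given constraint, cheapest first.

    Ties on price are broken by higher quality.
    """
    candidates = [
        m for m in models
        if (vendor is None or m.creator == vendor)
        and (max_price is None or m.price <= max_price)
        and (min_quality is None or m.quality >= min_quality)
    ]
    return sorted(candidates, key=lambda m: (m.price, -m.quality))


if __name__ == "__main__":
    # No vendor constraint: the cheapest model above 55 quality.
    print(select(MODELS, min_quality=55)[0].name)
    # OpenAI only: the cheapest OpenAI model above 55 quality.
    print(select(MODELS, vendor="OpenAI", min_quality=55)[0].name)
```

Without a vendor constraint the cheapest model above 55 quality is Gemini 3.1 Pro Preview; GPT-5.4 only comes out on top once `vendor="OpenAI"` is set.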
