
GPT-5.5 Doubled Codex CLI Costs Overnight — Here's the Math

Synrouter Team · 6 min read
Tags: gpt-5.5, codex-cli, api-pricing, cost-optimization, openai, agent-economics

OpenAI shipped GPT-5.5 on April 23, 2026. Six weeks after GPT-5.4. The per-token price doubled.

If you use Codex CLI with an API key, your bill probably doubled with it — and you might not have noticed yet, because GPT-5.5 became the default model in Codex the same day it launched. No migration prompt. No "are you sure?" dialog. Just a quieter wallet.

We instrumented the difference. Here's what the numbers look like.

What Changed: GPT-5.5 vs GPT-5.4 Pricing

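| Model | Input / 1M tokens | Output / 1M tokens | Cached input / 1M tokens | Context |
| --- | --- | --- | --- | --- |
| GPT-5.4 | $2.50 | $15.00 | $0.25 | 1M |
| GPT-5.5 | $5.00 | $30.00 | $0.50 | 1M |
| GPT-5.5 Pro | $30.00 | $180.00 | $3.00 | 1M |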

Every GPT-5.5 line item is exactly 2× its GPT-5.4 counterpart. Cached input doubled too, from $0.25 to $0.50 per million tokens. The 90% cache discount still applies, but it's 90% off a bigger number.

At Batch pricing, GPT-5.5 costs $2.50/$15.00 per million, identical to GPT-5.4 standard. But Batch jobs can take up to 24 hours to come back, which rules them out for interactive coding.

Sources: OpenAI launch post, Apidog pricing breakdown, AI Pricing Guru comparison.

What a Codex CLI Session Actually Costs Now

Here's a day of Codex CLI usage with an API key, taken from our own instrumentation. Developers on r/codex report roughly 1-2M tokens per day for active use, with power users hitting 3M+; the profile below sits at the heavy end of that range once cached input is counted.

| Metric | Per day | Per month (20 dev days) |
| --- | --- | --- |
| Uncached input tokens | 3M | 60M |
| Cached input tokens | 2M | 40M |
| Output tokens | 0.4M | 8M |

Run that through both price tables:

| Model | Daily cost | Monthly cost |
| --- | --- | --- |
| GPT-5.4 | $14.00 | $280 |
| GPT-5.5 | $28.00 | $560 |
| GPT-5.5 Pro | $168.00 | $3,360 |

Same workload. Same token count. Double the bill. The gap widens fast for teams — a 5-developer shop went from ~$1,400/month to ~$2,800/month overnight.

That's the optimistic scenario. It assumes GPT-5.5 uses the same number of tokens as GPT-5.4 for the same tasks. OpenAI says it doesn't.
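The daily and monthly figures in the cost table above can be reproduced with a few lines of arithmetic. The sketch below is only that: the prices and token volumes are copied from the tables in this post, while the names `PRICES`, `DAILY_USAGE_M`, `daily_cost`, and `monthly_cost` are made up for illustration.

```python
# Prices in USD per 1M tokens, copied from the pricing table in this post.
PRICES = {
    "GPT-5.4": {"input": 2.50, "cached": 0.25, "output": 15.00},
    "GPT-5.5": {"input": 5.00, "cached": 0.50, "output": 30.00},
    "GPT-5.5 Pro": {"input": 30.00, "cached": 3.00, "output": 180.00},
}

# Daily workload in millions of tokens, copied from the usage table.
DAILY_USAGE_M = {"input": 3.0, "cached": 2.0, "output": 0.4}


def daily_cost(model, usage_m=DAILY_USAGE_M):
    """USD per day: tokens (in millions) times the per-million rate, summed."""
    rates = PRICES[model]
    return sum(usage_m[kind] * rates[kind] for kind in usage_m)


def monthly_cost(model, dev_days=20, developers=1):
    """USD per month for a team, assuming every developer has the same usage."""
    return daily_cost(model) * dev_days * developers


for model in PRICES:
    print(f"{model}: ${daily_cost(model):.2f}/day, ${monthly_cost(model):,.0f}/month")
```

Swapping in your own token counts for `DAILY_USAGE_M` is the quickest way to see what the price change means for your bill rather than ours.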

The "Fewer Tokens" Claim — and Why You Should Verify It

OpenAI's official position: GPT-5.5 completes the same Codex tasks with fewer tokens and fewer retries. Their benchmarks show real gains — Terminal-Bench 2.0 jumped from 75.1% to 82.7%, ARC-AGI-2 from 73.3% to 85.0%.

Fewer retries means fewer wasted tokens. That's real. But "fewer tokens per task" and "lower total bill" are not the same thing, and the community experience is messy.

A thread on the OpenAI Developer Community from late April tells the story:

"Our token usage has gone up 5 fold with 5.5."
"Codex 5.5 is consuming tokens as if its drinking gasoline even at thinking medium."
"I managed to hit my weekly limit within 2.5 days."

Some of that is exploratory usage — people testing the new model harder than they normally would. Some is GPT-5.5's longer chain-of-thought eating output tokens. The point isn't that OpenAI is lying about token efficiency. It's that you can't assume the benchmark translates to your workload without measuring.

The honest math: even if GPT-5.5 uses 20% fewer tokens than GPT-5.4 for your tasks, you're still paying 60% more (2× price × 0.8 tokens = 1.6× cost). You need roughly 50% token reduction to break even. That's a high bar for routine coding work.
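The break-even arithmetic generalizes to any price multiplier. Here is a minimal sketch, assuming the only things that change between models are the per-token price and the number of tokens consumed; the function names are hypothetical.

```python
def relative_cost(price_multiplier, token_reduction):
    """Cost relative to the old model.

    price_multiplier: new price / old price (2.0 for GPT-5.5 vs GPT-5.4).
    token_reduction: fraction of tokens saved by the new model (0.2 = 20% fewer).
    """
    return price_multiplier * (1.0 - token_reduction)


def break_even_reduction(price_multiplier):
    """Token reduction at which the new model costs the same as the old one."""
    return 1.0 - 1.0 / price_multiplier


print(f"20% fewer tokens at 2x price: {relative_cost(2.0, 0.20):.2f}x the old bill")
print(f"Break-even token reduction at 2x price: {break_even_reduction(2.0):.0%}")
```

Measure your own token reduction on a handful of representative tasks, plug it into `relative_cost`, and you have a first-order answer to whether the upgrade pays for itself.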

The Cache Silver Lining (and Its Catch)

Cached input on GPT-5.5 costs $0.50/M, one-tenth of the $5.00 uncached rate. If your Codex sessions hit a decent cache rate, the effective input cost drops significantly.

But OpenAI's cache behavior has its own surprises. The cache TTL is short, and long-running agent sessions that background a task and resume later can bust the cache entirely — the same problem we documented with Anthropic's 5-minute TTL. The prefix has to be identical. Drift one token and you're paying full price.

For a session with 60% cache hits, the effective input rate is:

```text
Effective input = (0.4 × $5.00) + (0.6 × $0.50) = $2.30/M
```

Compare that to GPT-5.4 at 60% cache hits:

```text
Effective input = (0.4 × $2.50) + (0.6 × $0.25) = $1.15/M
```

Still doubled. Cache helps, but it doesn't close the gap.
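The same blended-rate formula works for any cache hit rate, not just 60%. The sketch below uses the input prices from the pricing table in this post; `effective_input_rate` is an illustrative helper, not part of any SDK.

```python
def effective_input_rate(uncached_price, cached_price, hit_rate):
    """Blended input price per 1M tokens for a given cache hit rate (0.0 to 1.0)."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return (1.0 - hit_rate) * uncached_price + hit_rate * cached_price


# (uncached, cached) input prices in USD per 1M tokens, from the pricing table.
INPUT_PRICES = {"GPT-5.4": (2.50, 0.25), "GPT-5.5": (5.00, 0.50)}

for hit_rate in (0.0, 0.3, 0.6, 0.9):
    old = effective_input_rate(*INPUT_PRICES["GPT-5.4"], hit_rate)
    new = effective_input_rate(*INPUT_PRICES["GPT-5.5"], hit_rate)
    print(f"hit rate {hit_rate:.0%}: GPT-5.4 ${old:.2f}/M, GPT-5.5 ${new:.2f}/M")
```

Because both the cached and uncached prices doubled, the ratio between the two models stays at 2× no matter what hit rate you plug in.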

Three Ways to Stop the Bleeding

1. Pin GPT-5.4 in Codex CLI. The simplest fix. Set model = "gpt-5.4" in ~/.codex/config.toml and your bill drops back to where it was. You lose GPT-5.5's reasoning gains, but for routine refactors, test generation, and boilerplate work, GPT-5.4 is still a strong model. The convergence benchmark data shows GPT-5.4 handles most coding tasks well — the gap matters most on complex multi-step agentic work.

One caveat: Codex CLI versions differ in how they handle model selection. Some versions don't ship GPT-5.5 in the default model catalog at all, meaning you may already be on 5.4 without knowing it. Check with /model inside an interactive session to see what's actually running. There's also a known bug where /clear ignores the config.toml model setting and falls back to gpt-5.4 — so test your config before trusting it.

2. Route by task complexity. Use GPT-5.5 for hard problems (architecture decisions, debugging across files, complex refactors) and GPT-5.4 or GPT-5.2 for everything else. This is what we do internally. Roughly 70% of our Codex turns are routine — the model that wrote them doesn't matter. The other 30% benefit from GPT-5.5's reasoning. Mixed routing cuts the monthly bill by about 40% compared to GPT-5.5-everything.

If you're curious how this compares to the Claude Code side of the fence, our Codex vs Claude Code analysis walks through the same logic with Anthropic's pricing.

3. Put a gateway in front. This is the solution we built. Synrouter sits between Codex CLI and the model providers, automatically routing requests to the cheapest model that can handle the task — and pooling API keys across providers so you never hit a rate limit mid-session.

We built it because our own bill kept doing things we didn't expect. GPT-5.5's silent default switch was just the latest example. The gateway catches these changes: you set a budget, we route within it, and when a provider doubles their prices, your bill doesn't double with it.
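The mixed-routing estimate from option 2 can be sanity-checked with a weighted average. This sketch rests on two assumptions that are ours, not measurements: token volume splits in the same 70/30 proportion as turns, and all routine work goes to GPT-5.4 at half the GPT-5.5 price. The names `RELATIVE_PRICE` and `blended_cost` are hypothetical.

```python
# Per-token price relative to GPT-5.5, derived from the pricing table:
# GPT-5.4 costs exactly half as much on every line item.
RELATIVE_PRICE = {"gpt-5.5": 1.0, "gpt-5.4": 0.5}


def blended_cost(token_share):
    """Bill relative to sending everything to GPT-5.5.

    token_share maps model name to its fraction of total tokens (must sum to 1).
    """
    if abs(sum(token_share.values()) - 1.0) > 1e-9:
        raise ValueError("token shares must sum to 1")
    return sum(share * RELATIVE_PRICE[model] for model, share in token_share.items())


# Assumption: tokens split like turns, 70% routine and 30% hard.
mix = {"gpt-5.4": 0.70, "gpt-5.5": 0.30}
relative = blended_cost(mix)
print(f"Mixed routing bill: {relative:.0%} of GPT-5.5-everything")
print(f"On a $560/month baseline: ${560 * relative:.0f}/month")
```

Under these assumptions the saving comes out at 35%, in the neighborhood of the roughly 40% quoted above. The exact figure depends on how many tokens the routine turns really carry and on whether some of them go to a cheaper model still.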

If your API spend has been creeping up and you're not sure why, sign up to get started — the first request is on us.


FAQ

Does GPT-5.5 cost more on ChatGPT plans too?

No. ChatGPT Plus, Pro, and Business include GPT-5.5 access at the same subscription price. The cost increase only hits developers using the API directly or through Codex CLI with an API key.

Is GPT-5.5 Pro worth it?

At $30/$180 per million tokens, GPT-5.5 Pro is 6× the standard rate. It posts strong benchmark numbers — but for coding work, the standard GPT-5.5 model covers most use cases. Pro makes sense for research-heavy workloads where the reasoning quality directly impacts outcomes. For everyone else, it's a budget trap.

Can I use GPT-5.5 Batch pricing for Codex CLI?

No. Batch pricing ($2.50/$15.00) comes with a turnaround of up to 24 hours. Codex CLI is interactive: you need responses in seconds, not hours. Batch is useful for background processing, not for a coding agent.
