
Claude Code API Pricing: Token-Level Cost Breakdown (June 2026)

Synrouter Team · 3 min read
Tags: claude-code, api-pricing, cost-optimization, anthropic, comparison

Claude Code is the most popular AI coding agent in 2026, but its real cost is poorly understood. Most developers know the subscription prices ($20 Pro, $100 or $200 Max), but API-direct pricing tells a very different story. Here's the token-level breakdown that matters for anyone running agents at scale.

Model Pricing (Per Million Tokens)

| Model | Input | Output | Cached Input* | Context Window |
|---|---|---|---|---|
| Claude Opus 4.8 | $5.00 | $25.00 | $0.50 | 1M |
| Claude Sonnet 4.6 | $3.00 | $15.00 | $0.30 | 1M |
| Claude Haiku 4.5 | $0.80 | $4.00 | $0.08 | 200K |

*Cached input: Anthropic's prompt caching discount (90% off standard input price). Cache TTL: 5 minutes.

What this means per turn — a typical coding agent turn sends ~20,000 input tokens and generates ~1,000 output tokens:

| Model | Uncached | With Cache Hit |
|---|---|---|
| Opus 4.8 | $0.125 | $0.035 |
| Sonnet 4.6 | $0.075 | $0.021 |
| Haiku 4.5 | $0.020 | $0.006 |
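The per-turn figures follow directly from the per-million-token prices. Here is a minimal sketch of that arithmetic; the model keys and the `turn_cost` helper are illustrative names, and a "cache hit" is assumed to bill all input tokens at the cached rate, as the table does.

```python
# Prices in USD per million tokens, copied from the pricing table above.
PRICES = {
    "opus-4.8": {"input": 5.00, "output": 25.00, "cached_input": 0.50},
    "sonnet-4.6": {"input": 3.00, "output": 15.00, "cached_input": 0.30},
    "haiku-4.5": {"input": 0.80, "output": 4.00, "cached_input": 0.08},
}

def turn_cost(model, input_tokens=20_000, output_tokens=1_000, cache_hit=False):
    """Cost in USD of one turn. Assumes a cache hit bills ALL input at the cached rate."""
    p = PRICES[model]
    input_rate = p["cached_input"] if cache_hit else p["input"]
    return (input_tokens * input_rate + output_tokens * p["output"]) / 1_000_000

for model in PRICES:
    uncached = turn_cost(model)
    cached = turn_cost(model, cache_hit=True)
    print(f"{model}: ${uncached:.3f} uncached, ${cached:.3f} with cache hit")
```

Changing `input_tokens` or `output_tokens` shows how quickly the per-turn cost moves once context grows past the 20,000-token example.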

That $0.075 per turn (Sonnet, uncached) sounds cheap. Now multiply by a real session.

The Session Multiplier

A typical Claude Code coding session has a specific token shape:

  - Turns per session: 30-100 (varies by task complexity)
  - Input tokens per turn: 15,000-50,000 (grows with conversation)
  - Output tokens per turn: 500-2,000
  - Cache hit rate: 40-60% (real-world, not theoretical)

| Session Type | Turns | Avg Input (tokens/turn) | Cache Rate | Cost (Sonnet 4.6) | Cost (Opus 4.8) |
|---|---|---|---|---|---|
| Quick fix | 15 | 12,000 | 60% | $0.47 | $0.81 |
| Feature (medium) | 50 | 25,000 | 50% | $3.23 | $5.56 |
| Feature (complex) | 100 | 40,000 | 40% | $11.10 | $18.60 |
| Full-day session | 200 | 50,000 | 35% | $31.93 | $53.75 |

A medium feature costs $3-6 in raw API tokens. That's the number to compare against the $200/month Max plan.
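For estimating your own sessions, a simplified model is often enough: blend the standard and cached input prices by the cache hit rate, add output cost, and multiply by the number of turns. The sketch below assumes 1,000 output tokens per turn (borrowed from the per-turn example) and ignores cache-write charges and context growth within the session. The `session_cost` helper and its parameter names are illustrative. Under those assumptions it lands near the quick-fix Sonnet figure, but it should not be expected to reproduce every cell of the session table, whose exact inputs aren't stated.

```python
# USD per million tokens as (standard input, cached input, output), from the pricing table.
PRICES = {
    "sonnet-4.6": (3.00, 0.30, 15.00),
    "opus-4.8": (5.00, 0.50, 25.00),
}

def session_cost(model, turns, avg_input_tokens, cache_rate, output_tokens_per_turn=1_000):
    """Rough session cost in USD.

    cache_rate is the fraction of input tokens billed at the cached rate.
    output_tokens_per_turn=1_000 is an assumption, not a figure from the session table.
    """
    input_price, cached_price, output_price = PRICES[model]
    blended_input_price = (1 - cache_rate) * input_price + cache_rate * cached_price
    per_turn = (avg_input_tokens * blended_input_price
                + output_tokens_per_turn * output_price) / 1_000_000
    return turns * per_turn

# Quick fix on Sonnet 4.6: 15 turns, 12,000 input tokens per turn, 60% cache rate.
print(round(session_cost("sonnet-4.6", turns=15, avg_input_tokens=12_000, cache_rate=0.60), 2))
```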

Subscription vs API: The Break-Even Math

| Plan | Monthly Cost | Equivalent API Usage (Sonnet) | Who It's For |
|---|---|---|---|
| Pro ($20) | $20 | ~6 medium features | Light daily use |
| Max 5x ($100) | $100 | ~30 medium features | Professional daily use |
| Max 20x ($200) | $200 | ~60 medium features | Heavy all-day coding |

The catch: Max plans have usage caps. Anthropic doesn't publish exact limits, but heavy users report hitting them. If you exceed the cap, you either wait or switch to API — at which point you're paying full token prices.
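The "equivalent" column is simply the plan price divided by the API cost of one medium feature on Sonnet 4.6 ($3.23 in the session table). A small sketch of that division, with `features_per_month` as an illustrative helper name:

```python
# Plan prices in USD per month, from the table above.
PLANS = {"Pro": 20, "Max 5x": 100, "Max 20x": 200}

# Sonnet 4.6 API cost of one medium feature, from the session table.
MEDIUM_FEATURE_COST = 3.23

def features_per_month(plan_price, feature_cost=MEDIUM_FEATURE_COST):
    """How many features the plan price would buy at raw API rates."""
    return plan_price / feature_cost

for name, price in PLANS.items():
    print(f"{name}: about {features_per_month(price):.1f} medium features")
```

The table rounds these results to ~6, ~30, and ~60. Swap in your own measured cost per feature to find where your break-even sits.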

Where the Cost Actually Goes

Instrumenting real Claude Code sessions reveals the token distribution:

| Token Source | % of Total | Cache-Sensitive? |
|---|---|---|
| System prompt + tool definitions | 18% | Yes (stable prefix) |
| Conversation history (accumulated) | 42% | No (grows every turn) |
| Tool outputs (grep, cat, typecheck, etc.) | 30% | No (new each turn) |
| Model output (code, explanations) | 10% | N/A |

The key insight: 72% of tokens come from conversation history and tool outputs — content that's unique to each session and can't be cached. This is the Agent Tax we wrote about in our previous article.
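The 72% figure is the sum of the two rows the table marks as not cache-sensitive. The short sketch below repeats that sum and also expresses the stable prefix as a share of input tokens only. The dictionary keys and variable names are illustrative, and the cacheable/uncacheable split is taken from the table as given.

```python
# Share of total session tokens by source, from the table above: (percent, cacheable?)
# None marks output tokens, where input caching does not apply.
TOKEN_SHARES = {
    "system prompt + tool definitions": (18, True),
    "conversation history": (42, False),
    "tool outputs": (30, False),
    "model output": (10, None),
}

cacheable = sum(pct for pct, flag in TOKEN_SHARES.values() if flag is True)
uncacheable = sum(pct for pct, flag in TOKEN_SHARES.values() if flag is False)
input_total = cacheable + uncacheable  # everything except model output

print(f"Uncacheable share of all tokens: {uncacheable}%")
print(f"Stable prefix as a share of input tokens: {cacheable / input_total:.0%}")
```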

Practical Takeaways

  1. Use Sonnet for coding, Opus for architecture. Opus costs about 67% more per turn ($0.125 vs. $0.075 uncached) but is ~15% better on complex reasoning. For boilerplate code and refactoring, Sonnet is the right default.

  2. Keep sessions focused. Going by the session table, a 200-turn mega-session costs about 2.5x as much as four 50-turn sessions on Sonnet 4.6 ($31.93 vs. $12.92) because of context accumulation. Close and reopen sessions between tasks.

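  3. Prompt caching hits 40-60% of the time, not 100%. With the 90% discount applied only to cached tokens, that works out to roughly 36-54% off input cost, not 90%. The theoretical maximum requires a perfectly stable prefix, which agent sessions rarely maintain.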

  4. The Max plan subsidy is real. If you're a heavy user, the $200/month Max plan is dramatically cheaper than equivalent API usage ($200 per month vs. $30-50 per full-day session in API tokens). Just don't hit the cap.
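The context-accumulation effect behind takeaway 2 can be sketched with a toy model in which each turn's input grows by a fixed amount. The `base_tokens` and `growth_per_turn` values below are hypothetical, chosen only for illustration and not measured from real sessions. The resulting multiplier depends entirely on them.

```python
def total_input_tokens(turns, base_tokens=10_000, growth_per_turn=400):
    """Total input tokens over a session whose context grows linearly each turn.

    base_tokens and growth_per_turn are hypothetical illustration values.
    """
    return sum(base_tokens + growth_per_turn * t for t in range(turns))

one_long = total_input_tokens(200)       # one 200-turn session
four_short = 4 * total_input_tokens(50)  # four fresh 50-turn sessions

print(f"One 200-turn session:   {one_long:,} input tokens")
print(f"Four 50-turn sessions:  {four_short:,} input tokens")
print(f"Ratio: {one_long / four_short:.2f}x")
```

Because input grows with every turn, total tokens scale roughly with the square of session length, which is why splitting work across fresh sessions reduces the bill even when the total number of turns is the same.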

Tracking your actual Claude Code token costs? Synrouter gives you per-session cost visibility and reduces your API bill by up to 85% with session-lifetime caching — no agent-side changes required.