Synrouter Docs

Pricing & Usage

Synrouter uses simple per-token pricing in USD. All prices are per million tokens. Cache reads are heavily discounted, typically about 90% cheaper than regular input, and Synrouter's automatic cache_control injection increases how much of each prompt qualifies for that rate.

Pricing overview

Synrouter charges by token usage. You pay only for the tokens you consume, with no hidden fees or minimum commitments. All prices are in USD per million tokens (MTok), and each request is billed across four token types:

  • Input tokens — tokens in your prompt, including system messages, tools, and conversation history.
  • Output tokens — tokens generated by the model in response.
  • Cache read tokens — input tokens served from cache at a steep discount. Synrouter automatically injects cache_control breakpoints to maximize cache hit rates for agent workloads.
  • Cache write tokens — tokens written to cache, typically priced at 125% of the input rate for Anthropic models and 100% for others.
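
As a rough illustration of how the four token types combine into a request cost, here is a minimal sketch. The `Rates` class and `estimate_cost_usd` function are hypothetical names, not part of any Synrouter SDK, and the sketch assumes that `input_tokens` counts only uncached input (cached tokens are billed separately as reads or writes). The rates are the Claude Sonnet 4.6 figures from the pricing table.

```python
from dataclasses import dataclass

TOKENS_PER_MTOK = 1_000_000


@dataclass(frozen=True)
class Rates:
    """USD per million tokens for one model (hypothetical helper type)."""
    input: float
    output: float
    cache_read: float
    # Cache writes are typically 1.25x input for Anthropic models, 1.0x otherwise.
    cache_write_multiplier: float = 1.0


def estimate_cost_usd(rates, input_tokens, output_tokens,
                      cache_read_tokens=0, cache_write_tokens=0):
    """Sum the four billed token types.

    Assumption: input_tokens excludes tokens that were read from
    or written to the cache.
    """
    cache_write_rate = rates.input * rates.cache_write_multiplier
    total = (
        input_tokens * rates.input
        + output_tokens * rates.output
        + cache_read_tokens * rates.cache_read
        + cache_write_tokens * cache_write_rate
    )
    return total / TOKENS_PER_MTOK


# Claude Sonnet 4.6: $3.00 input, $15.00 output, $0.30 cache read per MTok.
sonnet = Rates(input=3.00, output=15.00, cache_read=0.30,
               cache_write_multiplier=1.25)

cost = estimate_cost_usd(sonnet, input_tokens=2_000, output_tokens=1_000,
                         cache_read_tokens=40_000, cache_write_tokens=8_000)
print(f"${cost:.4f}")  # prints $0.0630
```

In this example most of the spend comes from cache writes and output, which is why the split between token types matters more than the raw prompt size.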

Model pricing

The table below shows pricing for all models currently available through Synrouter. For live pricing, call GET /api/sr/pricing or visit the Models page.
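
A minimal sketch of calling the live pricing endpoint with the Python standard library follows. Only the path `/api/sr/pricing` comes from this page. The base URL, the Bearer authorization scheme, and the JSON response body are assumptions, and the function names are hypothetical, so check the API reference before relying on any of them.

```python
import json
import urllib.request


def build_pricing_request(base_url, api_key=None):
    """Build (but do not send) a GET request for the pricing endpoint."""
    url = base_url.rstrip("/") + "/api/sr/pricing"
    headers = {"Accept": "application/json"}
    if api_key:
        # Assumption: keys are sent as a Bearer token; this page does not say.
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(url, headers=headers, method="GET")


def fetch_pricing(base_url, api_key=None, timeout=10):
    """Send the request and decode the body as JSON.

    Assumption: the endpoint returns JSON; its schema is not documented here.
    """
    req = build_pricing_request(base_url, api_key)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


# Usage (placeholder base URL, replace with your Synrouter host):
# pricing = fetch_pricing("https://synrouter.example")
```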

Model                      Input / MTok   Output / MTok   Cache Read / MTok   Context

Anthropic
Claude Sonnet 4.6          $3.00          $15.00          $0.30               1M
Claude Opus 4.6            $5.00          $25.00          $0.50               1M
Claude Opus 4.7            $5.00          $25.00          $0.50               1M
Claude Opus 4.6 Fast       $30.00         $150.00         $3.00               1M
Claude Opus 4.7 Fast       $30.00         $150.00         $3.00               1M
Claude Haiku 4.5           $1.00          $5.00           $0.10               200K

DeepSeek
DeepSeek V4 Pro            $0.43          $0.87           $0.0036             1M
DeepSeek V4 Flash          $0.10          $0.20           $0.02               1M
DeepSeek V4 Flash (Free)   -              -               -                   1M

Google
Gemini 3.5 Flash           $1.50          $9.00           $0.15               1M
Gemini 3.1 Flash Lite      $0.25          $1.50           $0.03               1M
Gemini 3.1 Pro             $2.00          $12.00          $0.20               1M
Gemini 3.1 Flash Image     $0.50          $3.00           -                   131K

MiniMax
MiniMax M2.7               $0.28          $1.20           -                   205K

Moonshot AI
Kimi K2.6                  $0.73          $3.49           $0.25               262K

OpenAI
GPT-5.5 Pro                $30.00         $180.00         -                   1M
GPT-5.5                    $5.00          $30.00          $0.50               1M
GPT-5.4 Image 2            $8.00          $15.00          $2.00               272K

Qwen
Qwen 3.6 Max               $1.04          $6.24           -                   262K
Qwen 3.6 Flash             $0.19          $1.13           -                   1M

xAI
Grok 4.3                   $1.25          $2.50           $0.20               1M

Z.ai
GLM-5.1                    $0.98          $3.08           $0.18               203K

Cache savings

Synrouter is built for AI agents — workloads that repeat the same system prompts, tool definitions, and conversation prefixes across hundreds of turns. By automatically injecting cache_control breakpoints at strategic positions, Synrouter achieves cache hit rates of 85-95% for coding agents.

  • 4-breakpoint strategy — Synrouter inserts breakpoints at the system message, tool definitions, and two positions in the message history to maximize prefix reuse.
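  • Cache reads cost ~10% of regular input for Anthropic models. With Claude Sonnet 4.6, a 100K-token prompt that is 90% cached costs ~$0.057 in input instead of ~$0.30 (90K tokens at $0.30/MTok plus 10K tokens at $3.00/MTok), before any cache-write charges.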
  • Tool result trimming — Synrouter truncates large tool results in history, keeping the head and tail while removing redundant middle content. This reduces input tokens before they reach the upstream model.
  • Savings are reported per-request via the x-synrouter-cache-savings-usd and x-synrouter-trim-savings-usd response headers.
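
To make the trimming and reporting bullets concrete, here is a minimal sketch of head-and-tail truncation and of reading the two savings headers. It is not Synrouter's actual implementation: the function names, the character-based limits, and the trim marker are all assumptions, and the sketch assumes the header values are plain decimal strings.

```python
def trim_tool_result(text, head_chars=2000, tail_chars=1000):
    """Keep the head and tail of a long tool result, dropping the middle.

    Assumption: limits are in characters; the real service may count tokens.
    """
    if len(text) <= head_chars + tail_chars:
        return text
    omitted = len(text) - head_chars - tail_chars
    marker = f"\n[... {omitted} characters trimmed ...]\n"
    return text[:head_chars] + marker + text[len(text) - tail_chars:]


def read_savings_usd(headers):
    """Return (cache_savings, trim_savings) in USD from response headers.

    Header names are matched case-insensitively; a missing header counts as 0.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    cache = float(lowered.get("x-synrouter-cache-savings-usd", 0))
    trim = float(lowered.get("x-synrouter-trim-savings-usd", 0))
    return cache, trim


# A long tool result: the first 2000 and last 1000 characters survive.
log = "A" * 3000 + "B" * 5000 + "C" * 1500
trimmed = trim_tool_result(log)
print(len(log), "->", len(trimmed))

# Headers as they might appear on a response (values are made up).
print(read_savings_usd({"X-Synrouter-Cache-Savings-USD": "0.0123"}))
```

Keeping both ends of a tool result tends to preserve the command that was run and its final status, which is usually what a later turn needs.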

Usage & billing

Track your usage and manage billing from the Synrouter dashboard.

  • Usage dashboard — View per-model token consumption, session activity, and spending trends in real time from the Overview and Activity pages.
  • Prepaid credits — Top up your account with credit packs starting at $25 on the Credits page. Credits are consumed as you make API requests.
  • API keys — Create and manage sk-sr-* API keys from the dashboard. Each key tracks its own usage.
  • Billing history — All top-ups and charges are recorded in your billing history with detailed invoices.