Synrouter Docs

Pricing & Usage

Synrouter uses simple per-token pricing: all prices are in USD per million tokens (MTok). Cache reads are heavily discounted, typically 90% cheaper than regular input, and Synrouter's automatic cache_control injection increases how often your requests qualify for that rate.

Pricing overview

Synrouter bills by token usage: you pay only for the tokens you consume, with no hidden fees or minimum commitments. Each request is metered in four token categories, each priced per million tokens:

Input tokens — tokens in your prompt, including system messages, tools, and conversation history.
Output tokens — tokens generated by the model in response.
Cache read tokens — input tokens served from cache at a steep discount. Synrouter automatically injects cache_control breakpoints to maximize cache hit rates for agent workloads.
Cache write tokens — tokens written to cache, typically priced at 125% of the input rate for Anthropic models and 100% for others.
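
As a rough illustration of how the four categories combine into a per-request charge, here is a minimal sketch. The Rates class and request_cost_usd function are hypothetical names, not part of any Synrouter SDK. The sketch assumes input_tokens counts only uncached input (so cached tokens are not billed twice) and takes the Claude Sonnet 4.6 rates from the table on this page, with cache writes at 125% of the input rate.

```python
from dataclasses import dataclass

TOKENS_PER_MTOK = 1_000_000


@dataclass(frozen=True)
class Rates:
    """USD per million tokens for one model (hypothetical helper type)."""
    input: float
    output: float
    cache_read: float
    cache_write: float


def request_cost_usd(rates, input_tokens=0, output_tokens=0,
                     cache_read_tokens=0, cache_write_tokens=0):
    """Sum the four token categories at their per-MTok rates.

    Assumption: input_tokens excludes tokens already counted as
    cache reads or cache writes.
    """
    total = (
        input_tokens * rates.input
        + output_tokens * rates.output
        + cache_read_tokens * rates.cache_read
        + cache_write_tokens * rates.cache_write
    )
    return total / TOKENS_PER_MTOK


# Claude Sonnet 4.6 rates from the pricing table; cache write assumed
# to be 125% of the input rate, as stated for Anthropic models.
sonnet = Rates(input=3.00, output=15.00, cache_read=0.30,
               cache_write=3.00 * 1.25)

# 10K uncached input tokens, 90K cache-read tokens, 1K output tokens.
cost = request_cost_usd(sonnet, input_tokens=10_000,
                        cache_read_tokens=90_000, output_tokens=1_000)
print(f"${cost:.3f}")  # prints $0.072
```

The same function works for any row of the pricing table. For models that list no cache read price, pass zero cached tokens.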

Model pricing

The table below shows pricing for all models currently available through Synrouter. For live pricing, call GET /api/sr/pricing or visit the Models page.

Model	Input / MTok	Output / MTok	Cache Read / MTok	Context
Anthropic
Claude Sonnet 4.6	$3.00	$15.00	$0.30	1M
Claude Opus 4.6	$5.00	$25.00	$0.50	1M
Claude Opus 4.7	$5.00	$25.00	$0.50	1M
Claude Opus 4.6 Fast	$30.00	$150.00	$3.00	1M
Claude Opus 4.7 Fast	$30.00	$150.00	$3.00	1M
Claude Haiku 4.5	$1.00	$5.00	$0.10	200K
DeepSeek
DeepSeek V4 Pro	$0.43	$0.87	$0.0036	1M
DeepSeek V4 Flash	$0.10	$0.20	$0.02	1M
DeepSeek V4 Flash (Free)	—	—	—	1M
Google
Gemini 3.5 Flash	$1.50	$9.00	$0.15	1M
Gemini 3.1 Flash Lite	$0.25	$1.50	$0.03	1M
Gemini 3.1 Pro	$2.00	$12.00	$0.20	1M
Gemini 3.1 Flash Image	$0.50	$3.00	—	131K
MiniMax
MiniMax M2.7	$0.28	$1.20	—	205K
Moonshot AI
Kimi K2.6	$0.73	$3.49	$0.25	262K
OpenAI
GPT-5.5 Pro	$30.00	$180.00	—	1M
GPT-5.5	$5.00	$30.00	$0.50	1M
GPT-5.4 Image 2	$8.00	$15.00	$2.00	272K
Qwen
Qwen 3.6 Max	$1.04	$6.24	—	262K
Qwen 3.6 Flash	$0.19	$1.13	—	1M
xAI
Grok 4.3	$1.25	$2.50	$0.20	1M
Z.ai
GLM-5.1	$0.98	$3.08	$0.18	203K
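
To read live prices programmatically, a request to the GET /api/sr/pricing endpoint mentioned above might look like the sketch below. Only the path comes from this page. The base URL (https://synrouter.example is a placeholder), the Bearer authorization header, and a JSON response body are all assumptions, so check them against your account's API settings before relying on this.

```python
import json
import urllib.request

PRICING_PATH = "/api/sr/pricing"  # path quoted in this page


def build_pricing_request(base_url, api_key=None):
    """Build (but do not send) a GET request for the pricing endpoint.

    Assumptions: base_url is your Synrouter API host, and the endpoint
    accepts an optional Bearer token. Neither is confirmed by this page.
    """
    url = base_url.rstrip("/") + PRICING_PATH
    headers = {"Accept": "application/json"}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"  # assumed auth scheme
    return urllib.request.Request(url, headers=headers, method="GET")


def fetch_pricing(base_url, api_key=None, timeout=10.0):
    """Send the request and decode the body, assuming it is JSON."""
    request = build_pricing_request(base_url, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling fetch_pricing("https://synrouter.example", api_key="sk-sr-...") would return whatever structure the endpoint provides. The response schema is not documented here, so the sketch makes no attempt to interpret it.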

Cache savings

Synrouter is built for AI agents — workloads that repeat the same system prompts, tool definitions, and conversation prefixes across hundreds of turns. By automatically injecting cache_control breakpoints at strategic positions, Synrouter achieves cache hit rates of 85-95% for coding agents.

4-breakpoint strategy — Synrouter inserts breakpoints at the system message, tool definitions, and two positions in the message history to maximize prefix reuse.
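Cache reads cost ~10% of regular input for Anthropic models. With Claude Sonnet 4.6, a 100K-token prompt that is 90% cached costs about $0.057 in input charges instead of $0.30 (90K cached tokens at $0.30/MTok plus 10K uncached tokens at $3.00/MTok, ignoring cache writes).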
Tool result trimming — Synrouter truncates large tool results in history, keeping the head and tail while removing redundant middle content. This reduces input tokens before they reach the upstream model.
Savings are reported per-request via the x-synrouter-cache-savings-usd and x-synrouter-trim-savings-usd response headers.
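
Two of the mechanisms above can be sketched in a few lines. The first function illustrates head-and-tail trimming of a large tool result. It is not Synrouter's actual algorithm, and the character limits and the marker text are assumptions. The second reads the two savings headers from a response, assuming each value is a plain decimal number of USD. Both function names are hypothetical.

```python
def trim_tool_result(text, head_chars=2000, tail_chars=2000):
    """Keep the head and tail of a long tool result, drop the middle.

    Illustrative only: the limits and marker format are assumptions.
    """
    if len(text) <= head_chars + tail_chars:
        return text  # short enough, leave untouched
    omitted = len(text) - head_chars - tail_chars
    head = text[:head_chars]
    tail = text[len(text) - tail_chars:]  # safe even when tail_chars is 0
    return f"{head}\n[... {omitted} characters trimmed ...]\n{tail}"


def read_savings(headers):
    """Extract per-request savings from response headers.

    Header names come from this page; the decimal-USD value format is
    an assumption. Missing or unparseable values count as 0.0.
    """
    # HTTP header names are case-insensitive, so normalise first.
    lowered = {name.lower(): value for name, value in headers.items()}

    def as_usd(name):
        try:
            return float(lowered.get(name, "0"))
        except (TypeError, ValueError):
            return 0.0

    cache = as_usd("x-synrouter-cache-savings-usd")
    trim = as_usd("x-synrouter-trim-savings-usd")
    return {"cache_usd": cache, "trim_usd": trim, "total_usd": cache + trim}
```

Summing total_usd across the requests in a session gives a quick estimate of what caching and trimming saved, which can be compared against the figures on the usage dashboard.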

Usage & billing

Track your usage and manage billing from the Synrouter dashboard.

Usage dashboard — View per-model token consumption, session activity, and spending trends in real time from the Overview and Activity pages.
Prepaid credits — Top up your account with credit packs starting at $25 on the Credits page. Credits are consumed as you make API requests.
API keys — Create and manage sk-sr-* API keys from the dashboard. Each key tracks its own usage.
Billing history — All top-ups and charges are recorded in your billing history with detailed invoices.