OPEN_SOURCE
// BENCHMARK RESULT
OpenRouter finds GPT-5.5 costs surge
OpenRouter replayed GPT-5.4 usage against GPT-5.5 and found real costs rose 49–92%, even though the model often returns shorter answers on long prompts. The headline price increase is only partly offset by shorter completions, and mainly for longer-context workloads.
// ANALYSIS
This is the kind of pricing shift that changes default model choices fast: GPT-5.5 may be smarter, but for many teams it is materially more expensive in practice, not just on paper.
- OpenRouter’s switcher-cohort method is useful because it compares the same users before and after the model swap, which is closer to production reality than synthetic benchmarks
- The efficiency gain is workload-dependent: prompts above 10K tokens saw 19–34% fewer completion tokens, while shorter prompts often got longer outputs instead
- For short and medium chats, the cost penalty is harsh enough to make GPT-5.5 a bad default unless the quality lift is obvious
- For long-context workflows, the reduced verbosity softens the blow, but it does not erase the price increase
- The takeaway for developers is simple: route GPT-5.5 selectively, not blindly, and measure cost by prompt bucket instead of averaging everything together
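The per-bucket measurement suggested above can be sketched in a few lines. This is a minimal illustration, not OpenRouter's methodology: the per-token prices and the request records are assumed placeholder values, and the bucket boundaries (1K, 10K prompt tokens) are chosen only to mirror the long-context threshold discussed here.

```python
# Sketch: aggregate request cost by prompt-length bucket instead of averaging.
# PRICES and the sample requests are hypothetical, illustrative values.
from collections import defaultdict

PRICES = {  # $ per 1M tokens: (prompt, completion) -- assumed, not real rates
    "gpt-5.4": (1.25, 10.00),
    "gpt-5.5": (2.50, 15.00),
}

def bucket(prompt_tokens: int) -> str:
    """Assign a request to a prompt-length bucket."""
    if prompt_tokens < 1_000:
        return "<1K"
    if prompt_tokens < 10_000:
        return "1K-10K"
    return ">=10K"

def cost_usd(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Dollar cost of one request under the assumed price table."""
    prompt_price, completion_price = PRICES[model]
    return (prompt_tokens * prompt_price + completion_tokens * completion_price) / 1e6

def cost_by_bucket(requests):
    """Sum cost per prompt bucket over (model, prompt_tokens, completion_tokens) records."""
    totals = defaultdict(float)
    for model, p_tok, c_tok in requests:
        totals[bucket(p_tok)] += cost_usd(model, p_tok, c_tok)
    return dict(totals)

# Hypothetical traffic sample.
requests = [
    ("gpt-5.5", 500, 800),
    ("gpt-5.5", 12_000, 600),
    ("gpt-5.4", 500, 1_000),
]
print(cost_by_bucket(requests))
```

Running the same aggregation once per model over replayed traffic, then comparing buckets side by side, surfaces exactly the pattern the report describes: short-prompt buckets get the full price hike, long-prompt buckets get it partially offset by shorter completions.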
// TAGS
llm · benchmark · pricing · inference · long-context · gpt-5-5 · openrouter
DISCOVERED
2026-05-05
PUBLISHED
2026-05-05
RELEVANCE
9/10
AUTHOR
OpenRouter