Cheaper reasoning models cost more in practice
A new paper from Stanford, UC Berkeley, and CMU shows that listed API prices for reasoning models are misleading. Uneven consumption of thinking tokens means theoretically cheaper models often result in higher real-world inference costs.
Relying on sticker price for reasoning models is a trap that could blow up your inference budget.
- Evaluated eight frontier models across nine tasks, revealing significant cost mismatches
- Thinking tokens aren't consumed equally, leading to hidden cost overruns
- "Cost reversals" happen frequently enough to change which model is actually cheapest
- Developers need to actively monitor real usage rather than assuming a cheaper API tier saves money
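
To make the "cost reversal" idea concrete, here is a minimal sketch of how a per-request cost could be computed from measured token usage instead of the listed price alone. It assumes thinking tokens are billed at the output-token rate, which is common but should be checked against your provider's pricing page. The `Usage` class, the `request_cost` function, and all prices and token counts below are hypothetical illustrations, not figures from the paper.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    """Token counts for one request (hypothetical structure)."""
    input_tokens: int
    thinking_tokens: int  # hidden reasoning tokens
    output_tokens: int    # visible answer tokens


def request_cost(usage: Usage, input_price_per_m: float, output_price_per_m: float) -> float:
    """Return the cost in dollars for one request.

    Assumption: thinking tokens are billed at the output-token rate.
    Prices are given in dollars per million tokens.
    """
    billed_output = usage.thinking_tokens + usage.output_tokens
    total = usage.input_tokens * input_price_per_m + billed_output * output_price_per_m
    return total / 1_000_000


# Made-up numbers: the "cheap" model has a lower listed price
# but spends far more thinking tokens on the same task.
cheap_cost = request_cost(Usage(1_000, 12_000, 500), input_price_per_m=1.0, output_price_per_m=4.0)
pricey_cost = request_cost(Usage(1_000, 1_500, 500), input_price_per_m=3.0, output_price_per_m=12.0)

print(f"cheap model:  ${cheap_cost:.4f} per request")
print(f"pricey model: ${pricey_cost:.4f} per request")
print("cost reversal" if cheap_cost > pricey_cost else "no reversal")
```

With these illustrative inputs, the model with the lower listed price ends up costing more per request. Logging the usage fields returned by your own API calls and feeding them into a function like this is one way to compare models on actual spend.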
DISCOVERED
2026-03-26
PUBLISHED
2026-03-26
AUTHOR
Discover AI