Xiaomi slashes MiMo-V2.5 API prices
Xiaomi’s MiMo-V2.5 series is getting a permanent API price reset, with the company saying costs can drop by as much as 99% starting May 27, 2026. The update also simplifies billing by removing input-length pricing tiers and raises Token Plan usage by 5x to 8x. Xiaomi attributes the cut to infrastructure efficiency: less KV-cache movement, more cacheable tokens, and better cluster throughput, savings it says it is passing on to developers.
Hot take: this looks less like a headline-grabbing discount and more like an inference-cost reset that Xiaomi is using to squeeze the market.
- Xiaomi says the new pricing goes live globally on May 27, 2026, and can reduce some API costs by up to 99%.
- The company attributes the cut to engineering gains, including SWA on SGLang HiCache, roughly 5x more cacheable tokens, and better expert-parallel throughput.
- Pricing is now simpler: no more input-length tiering, which should make budgeting easier for agentic and high-volume workloads.
- The move strengthens MiMo’s position as a low-cost option for developers building coding, agent, and multimodal products.
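
To make the budgeting point concrete, here is a minimal Python sketch comparing input-length-tiered billing with a single flat rate. Every number in it is a made-up placeholder, not Xiaomi's published pricing, and the names (`Tier`, `tiered_cost`, `flat_cost`, `percent_reduction`) are hypothetical helpers written for this illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    max_input_tokens: int   # tier covers requests up to this input length
    usd_per_million: float  # input price per million tokens in this tier


def tiered_cost(input_tokens: int, tiers: list[Tier]) -> float:
    """Bill the whole request at the first tier whose limit covers it."""
    for tier in sorted(tiers, key=lambda t: t.max_input_tokens):
        if input_tokens <= tier.max_input_tokens:
            return input_tokens / 1_000_000 * tier.usd_per_million
    raise ValueError("input exceeds the largest tier")


def flat_cost(input_tokens: int, usd_per_million: float) -> float:
    """Bill every request at one rate, regardless of input length."""
    return input_tokens / 1_000_000 * usd_per_million


def percent_reduction(old: float, new: float) -> float:
    """Percentage saved when moving from the old cost to the new cost."""
    return (old - new) / old * 100


# Placeholder rates, assumed purely for illustration.
HYPOTHETICAL_TIERS = [Tier(128_000, 1.0), Tier(1_000_000, 2.0)]
HYPOTHETICAL_FLAT_RATE = 0.5

# Two example requests: one short, one long enough to hit the upper tier.
requests = [50_000, 200_000]
old_total = sum(tiered_cost(n, HYPOTHETICAL_TIERS) for n in requests)
new_total = sum(flat_cost(n, HYPOTHETICAL_FLAT_RATE) for n in requests)

print(f"tiered: ${old_total:.3f}  flat: ${new_total:.3f}")
print(f"reduction: {percent_reduction(old_total, new_total):.1f}%")
```

Under tiered billing the cost per token jumps when a request crosses a length threshold, so a budget depends on the distribution of input lengths. With a flat rate the estimate depends only on total tokens, which is why long-context agentic workloads become easier to forecast.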
DISCOVERED: 2026-05-26
PUBLISHED: 2026-05-26
AUTHOR: bridgemindai