DeepInfra cuts NVIDIA Nemotron 3 Ultra prices

// 4d ago · PRODUCT UPDATE

DeepInfra has reduced output and cached-token prices for the NVIDIA Nemotron 3 Ultra Mixture-of-Experts model. Output tokens now cost $2.20 per million, and cached reads cost $0.10 per million.

// ANALYSIS

DeepInfra's price cut intensifies API price competition and lowers the cost of large-scale agentic reasoning on open-weight Mixture-of-Experts models.

* By offering the 550B/55B MoE model at these rates, DeepInfra challenges proprietary frontier APIs on raw cost-to-performance metrics.

* The 33% discount on cached tokens is a direct play for developer workflows requiring long-context agentic reasoning and deep research.

* A high context limit (256K) combined with low caching costs keeps long multi-turn agent interactions affordable at scale.
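
The pricing arithmetic behind these points can be sketched as follows. This is a rough illustration, not DeepInfra's billing logic: the output and cached-read prices and the 33% figure come from the item above, while the uncached input price, the function names, and the simplified billing model (a fixed shared prefix that is read from cache on every turn after the first) are assumptions made only for this example.

```python
# Prices quoted in the item, in USD per million tokens.
OUTPUT_PRICE = 2.20
CACHED_READ_PRICE = 0.10
CACHED_DISCOUNT = 0.33  # the "33% discount" quoted in the analysis

# ASSUMPTION: the item does not state an uncached input price.
# This placeholder exists only so the example runs.
ASSUMED_UNCACHED_INPUT_PRICE = 0.50


def implied_previous_price(new_price: float, discount: float) -> float:
    """Price before the cut, if new_price reflects the given fractional discount."""
    return new_price / (1.0 - discount)


def session_cost(
    turns: int,
    prefix_tokens: int,
    new_input_tokens_per_turn: int,
    output_tokens_per_turn: int,
    uncached_input_price: float = ASSUMED_UNCACHED_INPUT_PRICE,
) -> float:
    """Estimated USD cost of a multi-turn session under a simplified model.

    The shared prefix is billed at the uncached rate on the first turn and
    at the cached-read rate on every later turn. New input tokens are always
    billed uncached. Growth of the conversation history is ignored.
    """
    if turns < 1:
        raise ValueError("turns must be at least 1")
    prefix_cost = prefix_tokens * uncached_input_price
    prefix_cost += (turns - 1) * prefix_tokens * CACHED_READ_PRICE
    new_input_cost = turns * new_input_tokens_per_turn * uncached_input_price
    output_cost = turns * output_tokens_per_turn * OUTPUT_PRICE
    return (prefix_cost + new_input_cost + output_cost) / 1_000_000


if __name__ == "__main__":
    old_cached = implied_previous_price(CACHED_READ_PRICE, CACHED_DISCOUNT)
    print(f"Implied pre-cut cached price: ${old_cached:.3f} per million tokens")

    # Hypothetical agent run: 200K-token shared context, 20 turns.
    cost = session_cost(
        turns=20,
        prefix_tokens=200_000,
        new_input_tokens_per_turn=1_000,
        output_tokens_per_turn=2_000,
    )
    print(f"Estimated session cost: ${cost:.2f}")
```

Under this model the shared prefix dominates input spend once a session runs past a few turns, which is why the cached-read rate matters more than the output rate for long-context agent loops. The pre-cut cached price is only implied by the quoted percentage and is not stated in the item.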

// TAGS
deepinfra, nvidia, nvidia-nemotron-3-ultra, moe, price-cut, ai-inference, llm, agentic-reasoning

DISCOVERED

4d ago

2026-06-16

PUBLISHED

4d ago

2026-06-16

RELEVANCE

6/10

AUTHOR

DeepInfra