Mercury 2 enters production on Baseten

// 46d ago MODEL RELEASE

Mercury 2 is a proprietary diffusion large language model (LLM) developed by Inception Labs that generates tokens in parallel to achieve inference speeds exceeding 1,000 tokens per second on NVIDIA GPUs. Designed for high-speed reasoning and agentic workflows, the model is now available in production on Baseten, with early adopters like Augment Code reporting a 90% reduction in inference costs.

// ANALYSIS

Diffusion-based text generation sidesteps the sequential, token-by-token bottleneck of autoregressive models, which could make real-time agentic reasoning practical at scale.

* Generating more than 1,000 tokens per second on NVIDIA GPUs lowers the latency floor for multi-agent reasoning loops, where each step must wait for the previous one to finish generating.

* The 90% inference cost reduction reported by Augment Code, if it holds for other workloads, makes high-frequency model calls economically viable for enterprise coding and reasoning applications.

* Deployment on Baseten simplifies the production serving and scaling process for developer integrations.
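
To make the latency and cost points above concrete, here is a back-of-envelope sketch in Python. It is not taken from Inception Labs or Baseten. The function names, the 10-step loop, the 500 tokens per step, the 100 tokens-per-second comparison figure, and the baseline cost are all illustrative assumptions. Only the 1,000 tokens-per-second and 90% figures come from the announcement.

```python
def loop_generation_seconds(steps: int, tokens_per_step: int,
                            tokens_per_second: float) -> float:
    """Generation time for a sequential agent loop.

    Ignores network, queueing, and tool-call latency (a simplifying assumption).
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return steps * tokens_per_step / tokens_per_second


def cost_after_reduction(baseline_cost: float, reduction: float) -> float:
    """Cost remaining after a fractional reduction (0.9 means 90% cheaper)."""
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    return baseline_cost * (1.0 - reduction)


if __name__ == "__main__":
    # Hypothetical workload: 10 sequential agent steps, 500 generated tokens each.
    fast = loop_generation_seconds(10, 500, 1000.0)  # throughput from the announcement
    slow = loop_generation_seconds(10, 500, 100.0)   # assumed comparison throughput
    print(f"at 1,000 tok/s: {fast:.1f} s; at an assumed 100 tok/s: {slow:.1f} s")

    # Hypothetical baseline spend of 100 units, with the reported 90% reduction.
    print(f"cost after 90% reduction: {cost_after_reduction(100.0, 0.9):.2f}")
```

Under these assumptions, generation time scales inversely with throughput, so any speedup applies to every step of a sequential loop. Real end-to-end latency also depends on overheads that this sketch leaves out.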

// TAGS

llm, diffusion-model, reasoning, artificial-intelligence, baseten, inception-ai

DISCOVERED

46d ago

2026-06-11

PUBLISHED

46d ago

2026-06-11

RELEVANCE

8/10

AUTHOR

_inception_ai
