MichiAI hits sub-100ms with promptable ASR

// 50d ago | OPENSOURCE RELEASE

MichiAI is a 530M-parameter full-duplex speech LLM that brings natural-language prompt engineering to ASR. It combines a modified Whisper encoder with a SmolLM backbone, so real-time transcription can be primed with semantic categories and conversation history. Reported latency is ~75ms, aimed at natural-sounding voice agents.

// ANALYSIS

MichiAI’s "listen-head" architecture replaces the serial ASR-LLM-TTS pipeline with a unified multimodal model that understands input audio. Instead of static word boosting, it accepts natural-language prompts that prime semantic categories and structural expectations. A real-time feedback loop between audio embeddings and text tokens enables context-based error correction, and a self-prompting mechanism uses conversation history to prioritize expected phonemes. Reported latency is ~75ms.
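
To make the idea of priming ASR with natural language more concrete, here is a minimal sketch of how a caller might assemble such a prompt from semantic categories, structural expectations, and recent conversation turns. Everything in it is an assumption for illustration: the `AsrPrompt` class, its field names, and the rendered text layout are hypothetical and are not taken from MichiAI's code or documentation.

```python
from dataclasses import dataclass, field


@dataclass
class AsrPrompt:
    """Hypothetical container for text used to prime a promptable ASR model."""

    categories: list[str] = field(default_factory=list)    # semantic topics
    expectations: list[str] = field(default_factory=list)  # structural hints
    history: list[str] = field(default_factory=list)       # prior turns
    max_history_turns: int = 4                              # illustrative cap

    def render(self) -> str:
        """Join the non-empty parts into one natural-language prompt string."""
        parts = []
        if self.categories:
            parts.append("Topics: " + ", ".join(self.categories) + ".")
        if self.expectations:
            parts.append("Expect: " + "; ".join(self.expectations) + ".")
        # Keep only the most recent turns; a cap of 0 keeps none.
        if self.max_history_turns > 0:
            recent = self.history[-self.max_history_turns:]
        else:
            recent = []
        if recent:
            parts.append("Recent turns: " + " | ".join(recent))
        return "\n".join(parts)


prompt = AsrPrompt(
    categories=["pharmacy", "dosage"],
    expectations=["a 10-digit order number"],
    history=["Hi, I need a refill.", "Sure, what is your order number?"],
)
print(prompt.render())
```

Unlike a static boost list of individual words, a prompt like this describes what kind of utterance is expected next, which is the contrast the analysis draws. How the model actually consumes such text is not described in this summary.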

// TAGS
michiai, speech, llm, prompt-engineering, audio-gen, open-source

DISCOVERED

50d ago

2026-04-24

PUBLISHED

50d ago

2026-04-24

RELEVANCE

8/10

AUTHOR

kwazar90