OPEN_SOURCE
YT · YOUTUBE // 12d ago // PRODUCT LAUNCH
Mistral drops Voxtral Transcribe 2 for high-precision STT
Mistral AI releases Voxtral Transcribe 2, a speech-to-text family that pairs real-time streaming at sub-200ms latency with high-accuracy batch processing. The realtime model ships with open weights under Apache 2.0, and the release supports 13 languages and speaker diarization.
// ANALYSIS
Mistral is effectively commoditizing the voice agent stack by offering open-weight, low-latency alternatives to proprietary giants like ElevenLabs and OpenAI.
- Voxtral Realtime's sub-200ms latency is a direct shot at ElevenLabs Scribe, aiming for the ultra-responsive voice agent market.
- Apache 2.0 licensing for the realtime model allows sovereign, on-premise deployment, a massive win for privacy-conscious enterprises.
- Integration with Voxtral TTS and Mistral's LLMs provides a complete, high-performance voice-to-voice pipeline without vendor lock-in.
- A cost of one-fifth that of ElevenLabs and competitive accuracy (4% word error rate, WER) make it a formidable challenger in the STT space.
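For readers unfamiliar with the metric, a 4% WER means roughly one word-level error per 25 reference words. The sketch below shows the standard definition (substitutions, deletions, and insertions divided by the reference word count) using a word-level edit distance. The function name `word_error_rate` and the lowercase, whitespace-split normalization are illustrative assumptions; the card does not say which benchmark or normalization produced the 4% figure.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by reference length.

    Assumption: simple lowercase + whitespace tokenization. Published
    benchmarks usually apply heavier text normalization before scoring.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word in a 25-word reference gives 1/25.
    reference = " ".join(f"w{i}" for i in range(25))
    hypothesis = reference.replace("w3 ", "x ", 1)
    print(f"{word_error_rate(reference, hypothesis):.2%}")
```

Because insertions count as errors, WER can exceed 100% on very noisy output, so it is best read as an error ratio and not as a bounded percentage.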
// TAGS
voxtral-transcribe-2 · mistral-ai · speech · stt · open-weights · api · real-time · audio-gen · devtool
DISCOVERED
12d ago
2026-03-30
PUBLISHED
12d ago
2026-03-30
RELEVANCE
9/10
AUTHOR
Mistral AI