Mistral adds diarization, timestamps to Voxtral Mini
Mistral’s latest audio release brings a more capable batch transcription model plus a better way to try it. Voxtral Mini Transcribe V2 adds speaker diarization, context biasing for names and jargon, word-level timestamps, multilingual support across 13 languages, and the ability to handle up to 3 hours of audio in one request. Mistral is also shipping a new audio playground in Mistral Studio, where developers can upload files, tweak transcription settings, and test outputs directly in the browser.
This is a strong release, especially if you care about real-world transcription rather than demo hype. The model looks tuned for practical production workflows such as meetings, interviews, subtitles, and compliance archives.
- The biggest win is the combo of diarization and word-level timestamps, which makes the output much more usable downstream.
- Context biasing is a nice quality-of-life feature for names, acronyms, and domain terms that usually get mangled.
- Support for long audio files up to 3 hours makes it feel aimed at serious batch jobs, not just short clips.
- The Mistral Studio playground lowers friction for evaluation, which is smart for dev adoption.
- If Mistral’s pricing and accuracy claims hold up in practice, this is a very competitive transcription option.
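To make the "usable downstream" point concrete, here is a minimal sketch of how speaker labels and word-level timestamps could be combined into subtitle cues. It does not call Mistral's API: the `Word` record, its field names, and the `max_gap` threshold are assumptions made for illustration, and the real response format may differ.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """Hypothetical word record; not Mistral's actual response schema."""
    text: str
    start: float   # seconds from the start of the audio
    end: float     # seconds from the start of the audio
    speaker: str   # diarization label, e.g. "A" or "speaker_1"


def format_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words: list[Word], max_gap: float = 1.0) -> str:
    """Group consecutive words into cues, splitting on a speaker change
    or on a pause longer than max_gap seconds (an assumed threshold)."""
    cues: list[dict] = []
    for word in words:
        last = cues[-1] if cues else None
        same_turn = (
            last is not None
            and last["speaker"] == word.speaker
            and word.start - last["end"] <= max_gap
        )
        if same_turn:
            last["text"].append(word.text)
            last["end"] = word.end
        else:
            cues.append({
                "speaker": word.speaker,
                "start": word.start,
                "end": word.end,
                "text": [word.text],
            })

    blocks = []
    for index, cue in enumerate(cues, start=1):
        start = format_timestamp(cue["start"])
        end = format_timestamp(cue["end"])
        line = f"{cue['speaker']}: {' '.join(cue['text'])}"
        blocks.append(f"{index}\n{start} --> {end}\n{line}")
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    sample = [
        Word("Hello", 0.0, 0.4, "A"),
        Word("there", 0.5, 0.9, "A"),
        Word("Hi", 1.2, 1.5, "B"),
    ]
    print(words_to_srt(sample))
```

Without speaker labels the cues could only be split on pauses, and without word-level timing a cue could not be cut exactly where the speaker changes, which is why the two features are more useful together than apart.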
DISCOVERED: 2026-03-18
PUBLISHED: 2026-03-18
AUTHOR: Mistral AI