OPEN_SOURCE
REDDIT · 4d ago · BENCHMARK RESULT
Medical STT benchmark v4 reshuffles rankings
Omi-Health's updated benchmark evaluates 43 speech-to-text models using a new Medical-WER metric that prioritizes clinically relevant terms. Gemini 3 Pro Preview tops the board, while Microsoft's open-source VibeVoice-ASR 9B outperforms MAI-Transcribe-1.
// ANALYSIS
Standard Word Error Rate (WER) is a dangerous metric for medical AI when it weights filler words equally with life-critical drug names.
- Medical-WER (M-WER) reveals that top general models often butcher drug names, with error rates 2-5x higher than other categories.
- Microsoft's VibeVoice-ASR 9B (#3) beating its closed MAI-Transcribe-1 (#11) by 1.7 points highlights the power of LLM-backed transcription.
- Qwen3-ASR 1.7B emerges as the best small open-source model, delivering near-Gemini performance at 14x the speed of larger models.
- Deepgram Nova-3 Medical holds its ground as the fastest cloud API, completing files in just 13 seconds without compromising accuracy.
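To make the WER vs. M-WER contrast concrete, here is a minimal sketch of a term-weighted word error rate. This is not Omi-Health's actual M-WER definition, which the post summary does not spell out; the `weighted_wer` function, the `CRITICAL_TERMS` set, and the 5x weight are all illustrative assumptions.

```python
from typing import Callable, Sequence

# Hypothetical lexicon of clinically critical terms (assumption for illustration).
CRITICAL_TERMS = {"metoprolol"}


def uniform(word: str) -> float:
    """Standard WER weighting: every word counts the same."""
    return 1.0


def medical(word: str) -> float:
    """Assumed weighting: critical terms count 5x as much as other words."""
    return 5.0 if word.lower() in CRITICAL_TERMS else 1.0


def weighted_wer(ref: Sequence[str], hyp: Sequence[str],
                 weight: Callable[[str], float]) -> float:
    """Weighted edit distance between ref and hyp, normalized by total ref weight.

    Substituting or deleting a reference word costs that word's weight;
    inserting a hypothesis word costs the inserted word's weight.
    With uniform weights this reduces to ordinary WER.
    """
    n, m = len(ref), len(hyp)
    # dp[i][j] = minimum weighted cost to turn ref[:i] into hyp[:j]
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = dp[i - 1][0] + weight(ref[i - 1])
    for j in range(1, m + 1):
        dp[0][j] = dp[0][j - 1] + weight(hyp[j - 1])
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0.0 if ref[i - 1] == hyp[j - 1] else weight(ref[i - 1])
            dp[i][j] = min(
                dp[i - 1][j - 1] + sub,                # match or substitution
                dp[i - 1][j] + weight(ref[i - 1]),     # deletion
                dp[i][j - 1] + weight(hyp[j - 1]),     # insertion
            )
    total = sum(weight(w) for w in ref)
    return dp[n][m] / total if total else 0.0


ref = "um the patient takes metoprolol daily".split()
dropped_filler = "the patient takes metoprolol daily".split()
wrong_drug = "um the patient takes metropolol daily".split()

# Plain WER cannot tell these two transcripts apart: one error in six words each.
wer_filler = weighted_wer(ref, dropped_filler, uniform)
wer_drug = weighted_wer(ref, wrong_drug, uniform)

# The weighted score penalizes the mangled drug name far more than the lost filler.
mwer_filler = weighted_wer(ref, dropped_filler, medical)
mwer_drug = weighted_wer(ref, wrong_drug, medical)
```

In this toy case both transcripts score the same under uniform weights, while the weighted variant rates the drug-name error five times worse than the dropped filler word, which is the kind of gap the analysis above is pointing at.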
// TAGS
medical-stt-benchmark, benchmark, stt, speech, medical, llm, open-source, gemini, qwen3, vibevoice
DISCOVERED
4d ago
2026-04-08
PUBLISHED
4d ago
2026-04-08
RELEVANCE
8/10
AUTHOR
MajesticAd2862