Medical STT benchmark v4 reshuffles rankings
OPEN_SOURCE
REDDIT · 4d ago · BENCHMARK RESULT


Omi-Health's updated benchmark evaluates 43 speech-to-text models using a new Medical-WER metric that prioritizes clinically relevant terms. Gemini 3 Pro Preview tops the board, while Microsoft's open-source VibeVoice-ASR 9B outperforms MAI-Transcribe-1.

// ANALYSIS

Standard Word Error Rate (WER) is a dangerous metric for medical AI because it weights a dropped filler word the same as a garbled, life-critical drug name.

  • Medical-WER (M-WER) reveals that top general models often butcher drug names, with error rates 2-5x higher than other categories.
  • Microsoft's open-source VibeVoice-ASR 9B (#3) beats the company's own closed MAI-Transcribe-1 (#11) by 1.7 points, highlighting the strength of LLM-backed transcription.
  • Qwen3-ASR 1.7B emerges as the best small open-source model, delivering near-Gemini performance at 14x the speed of larger models.
  • Deepgram Nova-3 Medical holds its ground as the fastest cloud API, completing files in just 13 seconds without compromising accuracy.
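The post doesn't spell out how Medical-WER is computed, but the core idea — a WER variant in which errors on clinically relevant terms cost more than errors on ordinary words — can be sketched as a weighted Levenshtein alignment. The function name, weight value, and term list below are illustrative assumptions, not the benchmark's actual implementation:

```python
def weighted_wer(reference, hypothesis, critical_terms, critical_weight=5.0):
    """Weighted WER: Levenshtein alignment where errors on critical
    terms (e.g. drug names) carry extra cost. Illustrative sketch only."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()

    def cost(token):
        # Errors on critical terms are penalized more heavily.
        return critical_weight if token in critical_terms else 1.0

    # DP table: d[i][j] = min weighted edit cost aligning ref[:i] to hyp[:j]
    d = [[0.0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(1, len(ref) + 1):
        d[i][0] = d[i - 1][0] + cost(ref[i - 1])        # deletion
    for j in range(1, len(hyp) + 1):
        d[0][j] = d[0][j - 1] + cost(hyp[j - 1])        # insertion
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = 0.0 if ref[i - 1] == hyp[j - 1] else max(cost(ref[i - 1]), cost(hyp[j - 1]))
            d[i][j] = min(
                d[i - 1][j] + cost(ref[i - 1]),         # delete ref word
                d[i][j - 1] + cost(hyp[j - 1]),         # insert hyp word
                d[i - 1][j - 1] + sub,                  # substitute / match
            )

    # Normalize by the total weight of the reference, not its raw length.
    total_weight = sum(cost(t) for t in ref) or 1.0
    return d[len(ref)][len(hyp)] / total_weight
```

Under this scheme, swapping "metformin" for "metronidazole" in a four-word sentence scores 0.625 (5/8) rather than the plain-WER 0.25 — exactly the kind of gap that reshuffles rankings when drug-name errors run 2-5x higher than other categories.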
// TAGS
medical-stt-benchmark, benchmark, stt, speech, medical, llm, open-source, gemini, qwen3, vibevoice

DISCOVERED

2026-04-08

PUBLISHED

2026-04-08

RELEVANCE

8/10

AUTHOR

MajesticAd2862