Medical STT benchmark v4 reshuffles rankings

// 50d ago | BENCHMARK RESULT

Omi-Health's updated benchmark evaluates 43 speech-to-text models using a new Medical-WER metric that prioritizes clinically relevant terms. Gemini 3 Pro Preview tops the board, while Microsoft's open-source VibeVoice-ASR 9B outperforms MAI-Transcribe-1.

// ANALYSIS

Standard Word Error Rate (WER) is a dangerous metric for medical AI because it weights filler words the same as life-critical drug names.

  • Medical-WER (M-WER) reveals that top general-purpose models often butcher drug names, with drug-name error rates 2-5x higher than for other categories.
  • Microsoft's VibeVoice-ASR 9B (#3) beating its closed MAI-Transcribe-1 (#11) by 1.7 points highlights the power of LLM-backed transcription.
  • Qwen3-ASR 1.7B emerges as the best small open-source model, delivering near-Gemini performance at 14x the speed of larger models.
  • Deepgram Nova-3 Medical holds its ground as the fastest cloud API, completing files in just 13 seconds without compromising accuracy.
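
To make the WER critique above concrete, here is a minimal sketch of a term-weighted word error rate. The summary does not give the actual M-WER formula, so the weighting scheme, the function name `weighted_wer`, the weight values, and the sample sentences are all illustrative assumptions, not Omi-Health's definition.

```python
from typing import Dict, List


def weighted_wer(reference: List[str], hypothesis: List[str],
                 weights: Dict[str, float], default: float = 1.0) -> float:
    """Word error rate where each token carries its own cost (hypothetical scheme).

    With an empty `weights` dict every token costs `default`, which
    reduces to standard WER.
    """
    def w(token: str) -> float:
        return weights.get(token, default)

    rows, cols = len(reference) + 1, len(hypothesis) + 1
    dist = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):  # delete every reference token
        dist[i][0] = dist[i - 1][0] + w(reference[i - 1])
    for j in range(1, cols):  # insert every hypothesis token
        dist[0][j] = dist[0][j - 1] + w(hypothesis[j - 1])
    for i in range(1, rows):
        for j in range(1, cols):
            ref_tok, hyp_tok = reference[i - 1], hypothesis[j - 1]
            # A substitution costs the heavier of the two tokens involved.
            sub = 0.0 if ref_tok == hyp_tok else max(w(ref_tok), w(hyp_tok))
            dist[i][j] = min(
                dist[i - 1][j - 1] + sub,     # match or substitution
                dist[i - 1][j] + w(ref_tok),  # deletion
                dist[i][j - 1] + w(hyp_tok),  # insertion
            )
    total = sum(w(t) for t in reference)
    return dist[-1][-1] / total if total else 0.0


# Illustrative data only: one transcript drops a filler word,
# the other swaps one drug name for another.
reference = "patient um takes metoprolol daily".split()
dropped_filler = "patient takes metoprolol daily".split()
wrong_drug = "patient um takes metformin daily".split()

# Assumed weights: drug names count five times as much as other words.
drug_weights = {"metoprolol": 5.0, "metformin": 5.0}

plain_filler = weighted_wer(reference, dropped_filler, {})
plain_drug = weighted_wer(reference, wrong_drug, {})
medical_filler = weighted_wer(reference, dropped_filler, drug_weights)
medical_drug = weighted_wer(reference, wrong_drug, drug_weights)
```

Under these assumed weights, both transcripts score 0.2 on plain WER, while the weighted score is 1/9 for the dropped filler and 5/9 for the wrong drug name. That gap is the kind of difference a clinically weighted metric is meant to surface.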
// TAGS
medical-stt-benchmark, benchmark, stt, speech, medical, llm, open-source, gemini, qwen3, vibevoice

DISCOVERED

50d ago

2026-04-08

PUBLISHED

50d ago

2026-04-08

RELEVANCE

8/10

AUTHOR

MajesticAd2862