Parakeet TDT v3 adds 25 languages, 3000x speed
REDDIT // 8d ago · MODEL RELEASE

NVIDIA's latest Token-and-Duration Transducer (TDT) models achieve 3000x real-time throughput, with v3 expanding support to 25 languages while v2 remains the precision choice for English-only tasks.

// ANALYSIS

Parakeet TDT is the speed king of ASR, effectively rendering Whisper-based pipelines obsolete for high-volume tasks, but it prioritizes "clean" readability over verbatim audio records. On English, v2 is still slightly more accurate, with a 6.05% WER against v3's 6.32%. v3's primary value lies in its 25-language coverage and its improved robustness against hallucinated text on non-speech audio. Both models aggressively filter filler words such as "um" and "uh," making them unsuitable for legal or clinical work that requires verbatim transcripts. At 3000x RTFx (the ratio of audio duration to processing time), one hour of audio is processed in about 1.2 seconds, enabling massive-scale transcription at negligible cost.
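
The throughput and accuracy figures above reduce to simple arithmetic. The sketch below is only an illustration, assuming RTFx is defined as audio duration divided by processing time; the function names are hypothetical and are not part of any NVIDIA tooling.

```python
def processing_seconds(audio_seconds: float, rtfx: float) -> float:
    """Wall-clock seconds to transcribe `audio_seconds` of audio.

    Assumes RTFx = audio duration / processing time (hypothetical helper).
    """
    if rtfx <= 0:
        raise ValueError("rtfx must be positive")
    return audio_seconds / rtfx


def relative_wer_gap(wer_a: float, wer_b: float) -> float:
    """Relative increase of wer_b over wer_a, as a fraction of wer_a."""
    if wer_a <= 0:
        raise ValueError("wer_a must be positive")
    return (wer_b - wer_a) / wer_a


if __name__ == "__main__":
    # One hour of audio at the quoted 3000x RTFx.
    print(processing_seconds(3600, 3000))  # 1.2

    # v2 (6.05% WER) versus v3 (6.32% WER) on English, as quoted above.
    print(f"{relative_wer_gap(6.05, 6.32):.1%}")
```

By the same definition, a full 24-hour day of audio would take 86400 / 3000 = 28.8 seconds at that rate, though real throughput depends on hardware and batch size.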

// TAGS
speech open-source llm nvidia-parakeet-tdt

DISCOVERED: 2026-04-04 (8d ago)

PUBLISHED: 2026-04-03 (8d ago)

RELEVANCE: 8/10

AUTHOR: walleynguyen