Hume open-sources TADA speech model
Hume AI has open-sourced TADA, a speech-language model that aligns one text token to one acoustic vector to speed up text-to-speech generation and eliminate skipped or hallucinated words by design. The release includes 1B and 3B Llama-based models, a GitHub repo, Hugging Face weights, and a paper reporting generation at a real-time factor of 0.09 with zero hallucinations on 1,000+ LibriTTS-R test samples.
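For readers unfamiliar with the metric, real-time factor (RTF) is generation time divided by the duration of the audio produced, so values below 1 mean faster than real time. The sketch below only illustrates that arithmetic; the function names are made up for this example and are not part of the TADA codebase.

```python
def real_time_factor(generation_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent generating / duration of the generated audio."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return generation_seconds / audio_seconds


def generation_time(rtf: float, audio_seconds: float) -> float:
    """Time needed to synthesize `audio_seconds` of speech at a given RTF."""
    return rtf * audio_seconds


# At the reported RTF of 0.09, ten seconds of speech would take
# roughly 0.9 seconds to generate.
ten_second_clip = generation_time(0.09, 10.0)
```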
This is the kind of voice-model release developers should pay attention to: not just higher quality, but a smarter architecture that attacks latency and reliability at the tokenization level.
- TADA’s core trick is 1:1 text-acoustic alignment, which sidesteps the token explosion that slows most LLM-based TTS systems
- Hume claims more than 5x faster generation than comparable systems, plus zero hallucinations in its test setup, which is a big deal for production voice agents
- The open release looks unusually usable for developers, with MIT-licensed code, pip install support, Hugging Face checkpoints, and multilingual examples
- The on-device angle matters: a lighter speech stack could make private, low-latency voice interfaces much more practical on phones and edge hardware
- The caveat is that TADA is still pretrained mainly for speech continuation, so assistant-style use cases will likely need extra fine-tuning before this becomes a drop-in voice agent backbone
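To make the "token explosion" point concrete, here is a rough sketch comparing how many autoregressive steps a model needs when it emits acoustic tokens at a fixed frame rate versus one acoustic vector per text token. The frame rate, clip length, and text-token count are illustrative assumptions, not figures from the TADA paper, and the helper names are hypothetical.

```python
import math


def frame_based_steps(audio_seconds: float, frames_per_second: int,
                      codebooks_per_frame: int = 1) -> int:
    """Steps when the model emits acoustic tokens at a fixed frame rate."""
    frames = math.ceil(audio_seconds * frames_per_second)
    return frames * codebooks_per_frame


def aligned_steps(num_text_tokens: int) -> int:
    """Steps under 1:1 alignment: one acoustic vector per text token."""
    return num_text_tokens


# Assumed numbers for illustration only: a 10-second clip,
# a codec running at 50 frames per second, and 30 text tokens.
fixed_rate = frame_based_steps(audio_seconds=10.0, frames_per_second=50)
one_to_one = aligned_steps(num_text_tokens=30)

print(f"fixed frame rate: {fixed_rate} steps")
print(f"1:1 alignment:    {one_to_one} steps")
print(f"reduction:        {fixed_rate / one_to_one:.1f}x")
```

Under these assumed numbers the sequence shrinks from 500 steps to 30. The actual speedup depends on the codec, the tokenizer, and the cost of each step, so this ratio should not be read as a prediction of the 5x figure Hume reports.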
Discovered: 2026-03-11
Published: 2026-03-11
Author: smusamashah