OmniVoice TTS model hits ComfyUI for voice cloning

// 113d ago · OPENSOURCE RELEASE


A community-built wrapper integrates k2-fsa's OmniVoice, a zero-shot TTS model supporting over 600 languages, directly into ComfyUI. The node enables high-quality voice cloning from a three-second audio seed, provided users supply an accompanying transcription of the reference audio.
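As a rough illustration of the input the cloning step expects, the sketch below pairs a reference clip with its transcript and rejects obviously unusable inputs. The `ReferenceSeed` class, the `MIN_SEED_SECONDS` constant, and the helper names are hypothetical and are not part of the OmniVoice or ComfyUI APIs. The three-second figure comes from the description above and is treated as a minimum only for this example.

```python
import io
import wave
from dataclasses import dataclass

# Assumption: treat the three-second seed mentioned above as a lower bound.
MIN_SEED_SECONDS = 3.0


@dataclass(frozen=True)
class ReferenceSeed:
    """Hypothetical container: a reference clip plus its transcript."""
    wav_bytes: bytes
    transcript: str


def wav_duration_seconds(wav_bytes: bytes) -> float:
    """Return the duration of an in-memory PCM WAV file in seconds."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def validate_seed(seed: ReferenceSeed) -> float:
    """Check that the seed has a transcript and is long enough.

    Returns the clip duration in seconds, or raises ValueError.
    """
    if not seed.transcript.strip():
        raise ValueError("reference transcript is empty")
    duration = wav_duration_seconds(seed.wav_bytes)
    if duration < MIN_SEED_SECONDS:
        raise ValueError(
            f"reference clip is {duration:.2f}s, expected at least "
            f"{MIN_SEED_SECONDS:.1f}s"
        )
    return duration


def make_silent_wav(seconds: float, sample_rate: int = 16000) -> bytes:
    """Build a mono 16-bit silent WAV in memory, for demonstration only."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * int(seconds * sample_rate))
    return buffer.getvalue()
```

A real workflow would load the clip from disk instead of generating silence; `make_silent_wav` exists only so the check can be exercised without an audio file.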

// ANALYSIS

This wrapper highlights ComfyUI's rapid evolution from an image generation tool into a unified hub for complex, multimodal AI workflows.

– The underlying OmniVoice model supports zero-shot cloning across 600+ languages from just a 3-second audio seed.
– While the cloning process requires manual transcriptions of the seed audio, users are automating this by chaining the node with ComfyUI-Whisper.
– With a VRAM footprint of roughly 6.5 GB, the model remains highly accessible for local deployment on standard consumer hardware like an RTX 3060.
– The wrapper abstracts away complex dependencies from the base repository, handling automatic downloads for models and transcription pipelines.
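The chaining described in the list above, where a transcription step feeds the cloning step, can be sketched in plain Python. This is not how ComfyUI nodes are wired internally. `transcribe` and `synthesize` are hypothetical callables standing in for the Whisper node and the OmniVoice node, and the one-gigabyte headroom is an assumption rather than a measured value. Only the 6.5 GB footprint comes from the text.

```python
from typing import Callable

# Approximate footprint quoted in the list above.
MODEL_VRAM_GB = 6.5

# Hypothetical stand-ins for the two nodes:
#   Transcriber: reference audio -> transcript
#   Synthesizer: (reference audio, transcript, target text) -> cloned audio
Transcriber = Callable[[bytes], str]
Synthesizer = Callable[[bytes, str, str], bytes]


def fits_in_vram(total_vram_gb: float, headroom_gb: float = 1.0) -> bool:
    """Return True if the card can hold the model plus some headroom.

    The headroom value is an assumption for illustration.
    """
    return total_vram_gb >= MODEL_VRAM_GB + headroom_gb


def clone_voice(
    reference_audio: bytes,
    target_text: str,
    transcribe: Transcriber,
    synthesize: Synthesizer,
) -> bytes:
    """Transcribe the reference clip, then pass both to the cloning step."""
    transcript = transcribe(reference_audio).strip()
    if not transcript:
        # The cloning step needs a transcript of the seed audio.
        raise ValueError("transcription of the reference audio is empty")
    return synthesize(reference_audio, transcript, target_text)
```

With these assumptions, `fits_in_vram(12.0)` returns `True` and `fits_in_vram(6.0)` returns `False`. Passing the two steps in as callables keeps the sketch independent of any particular node or library.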

// TAGS

omnivoice · comfyui · speech · audio-gen · open-source

DISCOVERED

113d ago

2026-04-02

PUBLISHED

113d ago

2026-04-02

RELEVANCE

7/10

AUTHOR

Altruistic_Heat_9531
