JarvisQ runs a multilingual voice assistant entirely on-device
JarvisQ is an open-source, on-device voice assistant built on the Tether QVAC SDK that keeps the full STT → LLM → TTS pipeline local, with a translation mode layered on top. The project targets Android first, with a desktop build path for Windows, macOS, and Linux through a shared hexagonal core. Its stack leans on Parakeet TDT v3 for multilingual speech recognition, Qwen3 1.7B/4B GGUF via llama.cpp for inference, Supertonic ONNX or system TTS for output, and Bergamot for real-time translation, with P2P model distribution (and an HTTPS fallback) handling delivery of the model files.
This reads like a serious feasibility demo rather than a polished consumer app, and that’s the interesting part: it proves an all-local speech loop can stay within conversational latency on decent Android hardware.
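The local loop described above can be sketched as three stages wired together. This is an illustrative outline, not the JarvisQ API: the class and function names are hypothetical stand-ins for the Parakeet, llama.cpp, and Supertonic components the project actually uses.

```python
# Hypothetical sketch of the on-device STT -> LLM -> TTS loop.
# All names are illustrative; none come from the JarvisQ codebase.

class SpeechToText:
    """Stand-in for a Parakeet-style multilingual recognizer."""
    def transcribe(self, audio: bytes) -> str:
        # Toy implementation: treat the audio bytes as pre-transcribed text.
        return audio.decode("utf-8")

class LocalLLM:
    """Stand-in for a llama.cpp-backed Qwen3 GGUF model."""
    def generate(self, prompt: str) -> str:
        return f"Reply to: {prompt}"

class TextToSpeech:
    """Stand-in for Supertonic ONNX or the platform's system TTS."""
    def synthesize(self, text: str) -> bytes:
        return text.encode("utf-8")

def voice_loop(audio: bytes, stt: SpeechToText,
               llm: LocalLLM, tts: TextToSpeech) -> bytes:
    transcript = stt.transcribe(audio)   # STT stage: audio -> text
    reply = llm.generate(transcript)     # LLM stage: text -> response
    return tts.synthesize(reply)         # TTS stage: response -> audio

out = voice_loop(b"what time is it", SpeechToText(), LocalLLM(), TextToSpeech())
print(out.decode("utf-8"))  # -> Reply to: what time is it
```

The point of the sketch is that every stage runs as a local call; no stage requires a network round trip, which is what keeps the loop inside conversational latency on capable hardware.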
- The technical stack is coherent and intentionally optimized for mobile constraints, especially the switch away from Whisper-large-v3.
- The translation mode is a strong differentiator because it uses a real CPU-friendly multilingual pipeline instead of a cloud detour.
- The QVAC SDK appears to be doing real work here by unifying Android and desktop execution behind one shared core and adapter model.
- Biggest caveat: the experience is still constrained by phone-class compute, so quality, latency, and voice naturalness will vary by device and model choice.
- As an experiment, it is compelling evidence for privacy-first, offline speech agents, but not yet a turnkey product.
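The shared-core-plus-adapters idea mentioned above follows the standard ports-and-adapters (hexagonal) pattern: the core speaks only to an abstract port, and each platform supplies its own adapter. A minimal sketch, assuming hypothetical names (`TtsPort`, `AssistantCore`, etc. are not from the JarvisQ source):

```python
from typing import Protocol

class TtsPort(Protocol):
    """Port: the only TTS interface the platform-agnostic core sees."""
    def speak(self, text: str) -> str: ...

class AndroidSystemTts:
    """Adapter: would delegate to Android's system TTS in a real build."""
    def speak(self, text: str) -> str:
        return f"[android] {text}"

class DesktopOnnxTts:
    """Adapter: would delegate to a Supertonic-style ONNX model on desktop."""
    def speak(self, text: str) -> str:
        return f"[onnx] {text}"

class AssistantCore:
    """Shared core: identical on every platform; specifics live in adapters."""
    def __init__(self, tts: TtsPort) -> None:
        self.tts = tts

    def respond(self, text: str) -> str:
        return self.tts.speak(text)

print(AssistantCore(AndroidSystemTts()).respond("hello"))  # [android] hello
print(AssistantCore(DesktopOnnxTts()).respond("hello"))    # [onnx] hello
```

The payoff of the pattern is that porting to a new OS means writing new adapters only; the core pipeline logic is compiled once and reused.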
DISCOVERED: 2026-05-06
PUBLISHED: 2026-05-06
AUTHOR: dai_app