qwen3-asr-swift ships full local speech stack

// 129d ago · OPENSOURCE RELEASE

qwen3-asr-swift is an open-source Swift toolkit for Apple Silicon that bundles ASR, TTS, speech-to-speech, VAD, diarization, alignment, and enhancement into one fully local stack. Its main pitch is practical on-device orchestration: large models run through MLX on the GPU while lighter components use CoreML on the Neural Engine, so several speech pipelines can run concurrently with no cloud dependency.

// ANALYSIS

This is the kind of project that makes Apple Silicon look like a serious edge-AI speech platform rather than just a good Whisper laptop.

– The project goes beyond single-model demos by exposing 11 models behind shared Swift protocols, which makes pipeline composition a real engineering feature instead of a README promise
– Splitting workloads between MLX and CoreML is the sharpest idea here, because it targets the actual bottleneck in local speech apps: resource contention between always-on audio tasks and larger generative models
– The inclusion of diarization, enhancement, alignment, CLI tools, and an HTTP server makes this feel closer to a deployable speech stack than a narrow model wrapper
– Benchmarks like sub-real-time ASR and low-latency streaming TTS matter because they make the repo useful for product builders, not just ML hobbyists
– If the maintainer lands the roadmap items around meeting transcription, streaming diarization, and OpenAI-compatible audio APIs, this could become a strong alternative to fragmented Apple-side speech tooling
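
To make the "shared Swift protocols" point concrete, here is a minimal sketch of what protocol-based pipeline composition can look like. Every name in it (`VoiceActivityDetector`, `SpeechRecognizer`, `EnergyVAD`, `PlaceholderRecognizer`, `SpeechPipeline`) is hypothetical and invented for illustration; none of it is taken from the qwen3-asr-swift API, and the toy components stand in for real models.

```swift
// Hypothetical protocols for illustration only.
// These are NOT the actual qwen3-asr-swift interfaces.
protocol VoiceActivityDetector {
    /// Returns the sample ranges that are judged to contain speech.
    func speechRanges(in samples: [Float]) -> [Range<Int>]
}

protocol SpeechRecognizer {
    /// Returns a transcript for one speech segment.
    func transcribe(_ samples: [Float]) -> String
}

/// Toy VAD: a sample counts as speech when its magnitude reaches the threshold.
struct EnergyVAD: VoiceActivityDetector {
    let threshold: Float

    func speechRanges(in samples: [Float]) -> [Range<Int>] {
        var ranges: [Range<Int>] = []
        var start: Int? = nil
        for (index, sample) in samples.enumerated() {
            let isActive = abs(sample) >= threshold
            if isActive, start == nil {
                start = index                      // a speech run begins
            }
            if !isActive, let begin = start {
                ranges.append(begin..<index)       // a speech run ends
                start = nil
            }
        }
        if let begin = start {
            ranges.append(begin..<samples.count)   // run lasts until the end
        }
        return ranges
    }
}

/// Placeholder recognizer: reports the segment length instead of running a model.
struct PlaceholderRecognizer: SpeechRecognizer {
    func transcribe(_ samples: [Float]) -> String {
        "<\(samples.count) samples>"
    }
}

/// The pipeline depends only on the protocols, so any conforming
/// detector or recognizer can be swapped in without changing this type.
struct SpeechPipeline<V: VoiceActivityDetector, R: SpeechRecognizer> {
    let vad: V
    let recognizer: R

    func run(on samples: [Float]) -> [String] {
        vad.speechRanges(in: samples).map { range in
            recognizer.transcribe(Array(samples[range]))
        }
    }
}

let pipeline = SpeechPipeline(vad: EnergyVAD(threshold: 0.5),
                              recognizer: PlaceholderRecognizer())
let segments = pipeline.run(on: [0, 0.9, 0.8, 0, 0, 0.7])
print(segments)
```

The design choice this illustrates is that the pipeline is generic over its stages. A lightweight always-on component and a heavier model-backed one could each conform to the same protocol, which is the kind of composition the project description suggests, though the real interfaces may differ.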

// TAGS

qwen3-asr-swift, speech, open-source, devtool, inference, api

DISCOVERED

129d ago

2026-03-06

PUBLISHED

129d ago

2026-03-06

RELEVANCE

8/10

AUTHOR

ivan_digital
