OPEN_SOURCE
REDDIT · 17d ago · BENCHMARK RESULT
Qwen3.5-35B-A3B powers flawless 27-step local agent chain
A Reddit user reports that Qwen3.5-35B-A3B completed a 27-call local video workflow end to end, from Whisper transcription to subtitle burning, without a single error or manual intervention. The entire run stayed on a Lenovo P53 using llama.cpp and whisper.cpp, with no cloud APIs, making it a strong real-world demo for a sparse MoE model on mid-range hardware.
// ANALYSIS
MoE is starting to look like a real advantage, not just an architecture footnote. The interesting part here is less that Qwen answered well and more that it held state across a long, messy tool chain and finished the job locally.
- 27 sequential tool calls with verification is a better agent test than a single prompt-response benchmark.
- The official model card says 35B total parameters and 3B activated, which is exactly the kind of sparsity that makes local deployment plausible.
- Fully local execution with llama.cpp and whisper.cpp removes cloud latency, cost, and privacy friction.
- Video-to-subtitles is a good stress test because it mixes planning, file I/O, transcription, and post-processing.
- Ten minutes end to end is slow, but if it stays reliable, that's a tradeoff many local workflows will happily take.
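The "sequential tool calls with verification" pattern can be sketched as a plain loop: run each tool, verify its output, and halt the chain on the first failure instead of passing bad state downstream. This is a minimal illustration, not the poster's code; the step names mirror the video-to-subtitles workflow, but the bodies are stubs standing in for real tool invocations (ffmpeg, whisper.cpp).

```python
# Hypothetical sketch of a sequential tool chain with per-step verification.
# Each stub stands in for a real local tool call (ffmpeg, whisper.cpp, etc.).

def extract_audio(state):
    state["audio"] = "talk.wav"          # stub: would shell out to ffmpeg
    return state

def transcribe(state):
    state["transcript"] = "hello world"  # stub: would call whisper.cpp
    return state

def burn_subtitles(state):
    state["output"] = "talk.subbed.mp4"  # stub: would call ffmpeg again
    return state

# Pair every tool with a verifier so a silent failure cannot propagate.
CHAIN = [
    (extract_audio,  lambda s: s.get("audio", "").endswith(".wav")),
    (transcribe,     lambda s: bool(s.get("transcript"))),
    (burn_subtitles, lambda s: s.get("output", "").endswith(".mp4")),
]

def run_chain(state):
    for step, verify in CHAIN:
        state = step(state)
        if not verify(state):
            raise RuntimeError(f"verification failed after {step.__name__}")
    return state

result = run_chain({"video": "talk.mp4"})
print(result["output"])
```

A real 27-step agent run replaces the stubs with LLM-planned tool invocations, but the control structure, executing one verified step at a time, is what makes a long chain finish reliably.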
// TAGS
qwen3.5-35b-a3b · llm · agent · self-hosted · open-weights · automation · inference · benchmark
DISCOVERED
2026-03-25 (17d ago)
PUBLISHED
2026-03-25 (18d ago)
RELEVANCE
9/10
AUTHOR
cride20