OPEN_SOURCE ↗
YT · YOUTUBE // 41d ago · VIDEO
Qwen 3.5 Medium claims face mixed coding reality.
A Better Stack benchmark video on YouTube compared Qwen 3.5 Medium variants (including the 35B model) against Claude Sonnet 4.5 and found that the headline claims only partially hold up in practical coding workflows. The models look impressive on efficiency and benchmark positioning, but performance on real tasks appears uneven.
// ANALYSIS
Qwen’s medium line looks like a serious open-weight efficiency jump, but this test is a reminder that benchmark wins do not automatically translate into smoother day-to-day coding output.
- The release messaging emphasizes high capability per active parameter, especially for Qwen3.5-35B-A3B on local or lower-cost setups.
- In hands-on coding tasks, results were mixed rather than decisively Sonnet-level, with noticeable variance by task type.
- This is still meaningful for developers who prioritize self-hosting, open licensing, and cost control over absolute top reliability.
- The key adoption question is consistency under real agentic workflows, not just headline benchmark deltas.
// TAGS
qwen-3-5 · llm · ai-coding · open-weights · benchmark
DISCOVERED
41d ago
2026-03-02
PUBLISHED
41d ago
2026-03-02
RELEVANCE
9/10
AUTHOR
Better Stack