OPEN_SOURCE
REDDIT // 36d ago // NEWS
Qwen 3.5 users push back on verbosity
A LocalLLaMA thread argues Qwen 3.5 often over-explains simple prompts and makes “thinking” hard to disable reliably, especially when compared with Gemini 2.5 Flash’s terse answers. The complaint is practical rather than academic: extra reasoning is less useful when it inflates latency and token cost for routine questions.
// ANALYSIS
This is a UX complaint about model defaults, not just a matter of taste in writing style.
- The post frames Qwen 3.5 as capable but inefficient for everyday chat because its answers feel benchmark-shaped instead of user-shaped.
- Qwen’s own model docs emphasize separate thinking and non-thinking modes, which makes the thread notable: it highlights how wrappers and serving setups can still produce verbose behavior in practice.
- For AI developers, this is a reminder that inference UX now matters almost as much as raw model quality: concise answers, controllable reasoning, and predictable output length are product features.
- The comparison to Gemini 2.5 Flash shows why “short by default, detailed on request” is becoming the preferred interaction pattern for fast consumer and developer assistants.
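When a serving stack cannot reliably disable thinking at the template level (where supported, Qwen's chat template accepts an `enable_thinking` flag), a common fallback is to post-process the output. Below is a minimal sketch, assuming the model emits its reasoning inside `<think>…</think>` tags as Qwen's thinking mode does; the function name is illustrative, not from any library.

```python
import re

# Matches a reasoning block plus any trailing whitespace.
# Assumption: the model wraps chain-of-thought in <think>...</think>,
# which is Qwen's convention for thinking mode.
THINK_BLOCK = re.compile(r"<think>.*?</think>\s*", flags=re.DOTALL)

def strip_thinking(raw_output: str) -> str:
    """Return only the user-facing answer, dropping reasoning blocks."""
    return THINK_BLOCK.sub("", raw_output).strip()

raw = "<think>\nThe user asked 2+2. Trivial.\n</think>\n4"
print(strip_thinking(raw))  # -> 4
```

Note this only trims the visible text; the reasoning tokens were still generated, so it fixes verbosity but not the latency or token-cost complaint.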
// TAGS
qwen-3.5 · llm · reasoning · open-source · prompt-engineering
DISCOVERED
2026-03-06
PUBLISHED
2026-03-06
RELEVANCE
6/10
AUTHOR
ashirviskas