OPEN_SOURCE
REDDIT // 32d ago · BENCHMARK RESULT
Qwen3-Coder-Next tops Qwen3.5 in Claude Code
A LocalLLaMA user reports that Qwen3-Coder-Next 80A3B and Qwen3.5 35B both ran at roughly 132K context inside 36GB of combined VRAM, but Coder-Next was far more dependable inside Claude Code. In this side-by-side local test, Qwen3.5 repeatedly stalled mid-job and needed workarounds, while Qwen3-Coder-Next handled tool calls cleanly and felt much closer to Sonnet-level reliability.
// ANALYSIS
Community evals like this are messy, but they matter because agentic coding lives or dies on tool-call stability, not just raw model size or benchmark bragging rights.
- The key result is reliability, not speed: the poster says Qwen3-Coder-Next stayed stable through Claude Code jobs while Qwen3.5 35B often stopped in the middle.
- Both models reportedly fit long context on a dual-GPU 36GB setup, which makes the stability gap more important than the raw 80B-versus-35B comparison.
- That fits Qwen’s broader positioning around coding and agentic workflows, where tool use and long-horizon execution matter more than one-shot code generation.
- It is still anecdotal and hardware-specific, but it is exactly the kind of field report local-first Claude Code users want before burning time on quant and template experiments.
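For readers who want to reproduce a setup like the one reported, a minimal sketch of pointing Claude Code at a locally served model using its environment-variable overrides. The URL, port, token, and model name below are illustrative placeholders, and it assumes the local server (or a proxy in front of it) exposes an Anthropic-compatible API:

```shell
# Sketch: route Claude Code to a local model server.
# Assumption: something at this address speaks the Anthropic API
# (e.g. a translation proxy in front of a local inference server).
export ANTHROPIC_BASE_URL="http://localhost:8080"  # placeholder local endpoint
export ANTHROPIC_AUTH_TOKEN="local-dummy-key"      # local servers often accept any token
export ANTHROPIC_MODEL="qwen3-coder-next"          # model name as exposed by the server

claude  # launch Claude Code against the local backend
```

This is a configuration sketch under stated assumptions, not a verified recipe; the original post does not describe the exact serving stack used.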
// TAGS
qwen3-coder-next · claude-code · llm · ai-coding · benchmark · open-weights
DISCOVERED
2026-03-10 (32d ago)
PUBLISHED
2026-03-08 (34d ago)
RELEVANCE
8/10
AUTHOR
ikaganacar