OPEN_SOURCE
REDDIT · 23d ago · BENCHMARK RESULT
Unsloth Qwen3.5-35B IQ4_XS tops 100 t/s
A Reddit user says Unsloth’s Qwen3.5-35B-A3B-UD-IQ4_XS now runs cleanly in the latest Ooba build, hitting around 100 tokens/sec on a 3090 with a huge context window. Their 3D Snake demo suggests the bigger win is not flash, but a local model that can stay on task and actually finish a bounded coding job.
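For readers who want to try the same quant outside the Ooba UI, here is a minimal sketch using llama.cpp's `llama-server` directly. The model filename, context size, and port are assumptions for illustration; the post itself used text-generation-webui.

```shell
# Hypothetical setup, not taken from the post: serve the IQ4_XS GGUF with
# llama.cpp. -ngl 99 offloads all layers to the GPU (e.g. a 3090);
# -c sets the context window, here an assumed 32K.
llama-server \
  -m Qwen3.5-35B-A3B-UD-IQ4_XS.gguf \
  -ngl 99 \
  -c 32768 \
  --port 8080
```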
// ANALYSIS
This is the kind of local-model result that matters: not leaderboard bragging rights, but a fast enough, obedient enough model that feels usable in real workflows.
- Roughly 100 t/s on a 3090 changes the experience from “offline batch run” to “interactive assistant,” which is a big deal for coding and agent loops
- The key signal is persistence under iteration: the model reportedly fixed its own mistakes and delivered a working Three.js demo after other models kept breaking the app
- That makes it a strong candidate for agentic tooling like Cline, where multi-step follow-through matters more than one-shot cleverness
- Unsloth’s own Qwen3.5 docs position the family around 256K context, so this quant sits in a sweet spot of speed, memory efficiency, and long-context practicality
- The caveat is scope: this is a strong anecdote, not a broad eval suite, so repo-scale refactors and tool-use reliability still need wider testing
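To make the “interactive assistant” point concrete, a back-of-envelope latency check. The ~100 t/s figure is from the post; the response length and the slower comparison rate are assumed for illustration.

```python
# Rough arithmetic: how long a reply takes to stream at a given decode rate.
def response_latency_s(tokens: int, tok_per_s: float) -> float:
    """Seconds to generate a response of `tokens` length at `tok_per_s`."""
    return tokens / tok_per_s

# A typical 500-token coding reply at the reported ~100 t/s:
print(response_latency_s(500, 100.0))  # 5.0 seconds -- feels interactive

# The same reply at an assumed 15 t/s, common for larger dense models
# squeezed onto a single consumer GPU:
print(round(response_latency_s(500, 15.0), 1))  # ~33 seconds -- a batch run
```

The gap between a 5-second and a 33-second turnaround is what separates an assistant you iterate with from one you queue jobs to.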
// TAGS
llm · inference · gpu · benchmark · self-hosted · unsloth · qwen3.5-35b-a3b
DISCOVERED
2026-03-20
PUBLISHED
2026-03-20
RELEVANCE
9/10
AUTHOR
EuphoricPenguin22