REDDIT // 3h ago // MODEL RELEASE
Gemma 4, Qwen3.6 beat old 70Bs
Local roleplay users say the newest Gemma and Qwen models are the first small open weights that can genuinely stand in for older 70B-class models. The big wins are speed, lower memory use, and better prompt following, with a few quirks around style consistency and duplication.
// ANALYSIS
The 70B floor is losing relevance fast: for local RP and general use, newer 26B-31B-class models are starting to win on practical quality for the hardware they need, rather than on raw parameter count.
- Gemma 4 is getting the strongest praise here, especially for prompt adherence, longer-session coherence, and cleaner prose than older 70B baselines.
- Qwen’s newer releases are still competitive, but the thread skews toward Gemma for roleplay style, while Qwen looks stronger as a general-purpose or agentic model family.
- Memory efficiency matters as much as quality: lower KV-cache demand and better context handling make these models much easier to run on consumer hardware.
- The tradeoff is still real: some users note token duplication, occasional slop, and the fact that finetunes are still catching up.
- Net: if you can’t run 70B, that’s less of a handicap than it used to be.
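To put the KV-cache point in rough numbers, here is a minimal sketch of the usual back-of-envelope estimate for a standard transformer with full attention: two tensors (keys and values) per layer, each sized by KV heads, head dimension, and context length. The function names and both configurations are hypothetical and chosen only to show the arithmetic; they are not the published specs of any Gemma, Qwen, or 70B model, and architectures that use sliding-window or other attention variants will come out differently.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len, bytes_per_value=2):
    """Approximate KV-cache size in bytes for one sequence.

    Assumes full attention in every layer. The leading factor of 2
    accounts for storing both keys and values. bytes_per_value=2
    corresponds to a 16-bit cache (fp16/bf16).
    """
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_value


def to_gib(n_bytes):
    """Convert bytes to GiB."""
    return n_bytes / 2**30


# Hypothetical configurations for illustration only (not real model specs).
older_large = kv_cache_bytes(n_layers=80, n_kv_heads=8, head_dim=128, context_len=32768)
newer_small = kv_cache_bytes(n_layers=48, n_kv_heads=4, head_dim=128, context_len=32768)

print(f"hypothetical larger model:  {to_gib(older_large):.1f} GiB of KV cache")
print(f"hypothetical smaller model: {to_gib(newer_small):.1f} GiB of KV cache")
```

Under these assumed numbers the cache shrinks from 10 GiB to 3 GiB at a 32K context, before counting the weights themselves. That is the kind of gap that decides whether a long session fits on a single consumer GPU.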
// TAGS
gemma-4 · qwen3-6 · llm · reasoning · multimodal · self-hosted · open-source
DISCOVERED
3h ago
2026-05-01
PUBLISHED
6h ago
2026-05-01
RELEVANCE
8/10
AUTHOR
Borkato