OPEN_SOURCE ↗
REDDIT // 3d ago // BENCHMARK RESULT
LocalAI Qwen 3.5 35B benchmark: Vulkan wins
LocalAI benchmarked Qwen 3.5 35B MoE variants on Strix Halo and found a clean split: Vulkan led token generation, while ROCm won prompt processing. The tests stretched from zero context to 200K tokens with prefix caching enabled, which makes this a useful read for anyone tuning local inference on AMD hardware.
// ANALYSIS
This is less a universal backend verdict than a workload split, and that matters. If your app is chatty and decode-heavy, Vulkan looks like the better default; if your pipeline is prompt-heavy, ROCm still has an edge.
- The result held across two different Qwen3.5-35B variants, so the backend pattern looks real rather than quant-specific noise.
- Vulkan’s roughly 10-15% generation lead is the number to watch, because token streaming is what users feel in interactive sessions.
- ROCm’s prompt-processing advantage is still meaningful for ingestion, long-context preprocessing, and batch-style local workflows.
- Prefix caching and 200K-context tests make this relevant for agentic use cases, not just short chat prompts.
- For Strix Halo and similar AMD APUs, backend choice should now be workload-specific instead of assumed.
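The workload split above can be turned into a simple rule: estimate end-to-end latency from each backend's measured prefill and decode throughput, then pick the faster one per job. A minimal sketch follows; the tok/s figures are illustrative placeholders, not the thread's actual Strix Halo numbers, so substitute your own llama-bench (or equivalent) measurements.

```python
# Sketch: pick a backend per workload from measured throughputs.
# The tok/s numbers below are ASSUMED placeholders that only encode the
# qualitative split reported in the post (Vulkan faster at decode,
# ROCm faster at prompt processing). Replace with your own measurements.

BACKENDS = {
    "vulkan": {"pp": 900.0, "tg": 46.0},   # assumed: faster token generation
    "rocm":   {"pp": 1100.0, "tg": 40.0},  # assumed: faster prompt processing
}

def estimated_seconds(backend: str, prompt_tokens: int, gen_tokens: int) -> float:
    """End-to-end latency estimate: prefill time plus decode time."""
    b = BACKENDS[backend]
    return prompt_tokens / b["pp"] + gen_tokens / b["tg"]

def pick_backend(prompt_tokens: int, gen_tokens: int) -> str:
    """Choose the backend with the lower estimated total time."""
    return min(BACKENDS, key=lambda n: estimated_seconds(n, prompt_tokens, gen_tokens))

# Chatty session: short prompt, long reply -> decode throughput dominates.
print(pick_backend(prompt_tokens=500, gen_tokens=1500))      # -> vulkan
# Ingestion job: huge prompt, short summary -> prefill dominates.
print(pick_backend(prompt_tokens=150_000, gen_tokens=300))   # -> rocm
```

With the assumed numbers, the chatty workload selects Vulkan and the long-context ingestion job selects ROCm, matching the post's "workload-specific, not assumed" conclusion.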
// TAGS
localai · llm · gpu · inference · benchmark · self-hosted
DISCOVERED
3d ago
2026-04-08
PUBLISHED
4d ago
2026-04-08
RELEVANCE
8/10
AUTHOR
pipould