oMLX DFlash update shows mixed Qwen3 results

// 108d ago · BENCHMARK RESULT

Performance tests of DFlash block-diffusion speculative decoding in oMLX v0.3.5-rc1 show inconsistent results on M2 Max hardware. While Qwen3-Coder-30B-A3B achieved a 21% speedup, the smaller Qwen3.5-9B model saw a 44% slowdown due to draft model overhead.

// ANALYSIS

DFlash's block-diffusion approach is a niche optimization requiring precise model-draft alignment to be effective. Code generation remains the primary use case where block-based predictions justify the overhead, whereas smaller models lack the computational headroom to benefit from the complex verification step. Additionally, compatibility issues with DeltaNet-based architectures currently lead to system crashes.
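The overhead argument above can be made concrete with a rough cost model. This is a minimal sketch, not oMLX's or DFlash's actual accounting: the function names and every input value are hypothetical, and costs are measured in units of one ordinary decode step of the target model. Under these assumptions, each speculative cycle pays for one draft pass plus one verification pass and yields some average number of accepted tokens.

```python
def speculative_speedup(tokens_per_cycle: float,
                        draft_cost: float,
                        verify_cost: float = 1.0) -> float:
    """Estimated throughput relative to plain autoregressive decoding.

    Assumed model (hypothetical, for illustration only):
      - plain decoding produces 1 token per unit of cost
      - one speculative cycle costs draft_cost + verify_cost
      - one speculative cycle yields tokens_per_cycle accepted tokens
    A result above 1.0 is a speedup; below 1.0 is a slowdown.
    """
    if tokens_per_cycle <= 0 or draft_cost < 0 or verify_cost <= 0:
        raise ValueError("costs and token counts must be positive")
    return tokens_per_cycle / (draft_cost + verify_cost)


def break_even_tokens(draft_cost: float, verify_cost: float = 1.0) -> float:
    """Accepted tokens per cycle needed just to match plain decoding."""
    return draft_cost + verify_cost


# Hypothetical inputs, NOT measurements from the benchmark.
# Large target model: the draft pass is cheap relative to a target step.
large = speculative_speedup(tokens_per_cycle=2.0, draft_cost=0.5, verify_cost=1.1)

# Small target model: the draft pass costs about as much as a target step,
# so the same kind of acceptance rate no longer covers the overhead.
small = speculative_speedup(tokens_per_cycle=1.5, draft_cost=1.2, verify_cost=1.3)

print(f"large-model estimate: {large:.2f}x")
print(f"small-model estimate: {small:.2f}x")
```

In this model the break-even point is simply the combined draft and verify cost: if a draft pass is nearly as expensive as a target step, the drafter must land well over two tokens per cycle before any gain appears. That is consistent with the pattern reported here, where the larger model gains and the smaller one loses. Note also that "44% slowdown" can mean throughput fell to 0.56x or that runtime grew to 1.44x (about 0.69x throughput); the source does not say which.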

// TAGS

omlx, dflash, mlx, llm, speculative-decoding, qwen3, apple-silicon, benchmarks

DISCOVERED: 2026-04-15 (108d ago)

PUBLISHED: 2026-04-15 (108d ago)

RELEVANCE: 7/10

AUTHOR: CrushingLoss
