OPEN_SOURCE
REDDIT // 1h ago // MODEL RELEASE
Qwen3.6-35B-A3B JANG lands on Apple Silicon
bearzi shipped a full 15-profile JANG quantization sweep for Qwen3.6-35B-A3B, spanning extreme compression to near-lossless quality. The suite is tuned for Apple Silicon and already loads in vmlx and MLX Studio, with an oMLX patch pending.
// ANALYSIS
This is more than another quant pack: it’s a practical argument for layer-aware compression on MoE models, where preserving attention precision matters a lot more than squeezing every last weight uniformly.
- The 15 profiles give Mac users a real memory/performance ladder instead of a single compromise build
- Activation-aware calibration plus MSE-all optimization points to quality-first quantization, not just size-chasing (a scale-search sketch follows this list)
- MoE models are especially sensitive to naive quantization, so JANG's higher-precision attention treatment is the key technical bet (see the layer-aware policy sketch below)
- Native support in vmlx and MLX Studio lowers the friction for local deployment; oMLX support would broaden the ecosystem if the patch lands
- The release also sets up a clear follow-on: if Qwen3-Coder-Next gets the same treatment, local coding workflows stand to gain significantly
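The "key technical bet" above, layer-aware precision on a MoE model, is easiest to picture as a per-tensor bit-width policy. The sketch below is a hypothetical illustration, not bearzi's actual JANG recipe: the layer-name patterns, the 6/4/3-bit split, and the helper names are assumptions chosen to show how attention, router, and embedding weights could be held at higher precision while the sparse expert FFNs, which hold most of the parameters, absorb the aggressive compression. A 15-profile sweep would vary exactly these knobs.

```python
# Hypothetical sketch of a layer-aware quantization policy for a MoE model.
# ATTN_PATTERNS / EXPERT_PATTERNS and the bit-widths are assumptions,
# not the actual JANG recipe shipped in this release.
import re
from typing import Dict, Iterable

# Keep attention, routing, and embedding weights at higher precision;
# compress the sparse expert FFNs hard, since they dominate the parameter count.
ATTN_PATTERNS = (r"\.self_attn\.(q|k|v|o)_proj", r"\.router\.", r"embed_tokens", r"lm_head")
EXPERT_PATTERNS = (r"\.experts\.\d+\.(gate|up|down)_proj",)

def bits_for_layer(name: str, attn_bits: int = 6, expert_bits: int = 3, default_bits: int = 4) -> int:
    """Return the bit-width for one named weight tensor."""
    if any(re.search(p, name) for p in ATTN_PATTERNS):
        return attn_bits      # precision-sensitive: attention, router, embeddings
    if any(re.search(p, name) for p in EXPERT_PATTERNS):
        return expert_bits    # bulk of the weights: expert FFNs
    return default_bits

def build_quant_plan(param_names: Iterable[str]) -> Dict[str, int]:
    """Map every weight name to a bit-width; a profile sweep would vary the three knobs above."""
    return {name: bits_for_layer(name) for name in param_names}

if __name__ == "__main__":
    demo = [
        "model.layers.0.self_attn.q_proj.weight",
        "model.layers.0.mlp.experts.17.up_proj.weight",
        "model.layers.0.mlp.shared_expert.gate_proj.weight",
    ]
    for name, bits in build_quant_plan(demo).items():
        print(f"{bits}-bit  {name}")
```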
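The activation-aware, MSE-driven calibration mentioned in the second bullet generally amounts to choosing each weight group's quantization scale by minimizing reconstruction error, optionally weighted by calibration activations so that weights which meet large activations are protected. The following is a minimal sketch under that assumption; the function name, grid search, and symmetric integer scheme are illustrative, and the release's actual "MSE-all" procedure may differ.

```python
# Minimal sketch of MSE-driven scale selection for one weight group, with an
# optional activation-aware weighting term. Assumes symmetric integer quantization.
from __future__ import annotations
import numpy as np

def quantize_group(w: np.ndarray, bits: int, act_scale: np.ndarray | None = None,
                   n_grid: int = 64) -> tuple[np.ndarray, float]:
    """Search a grid of clipping scales and keep the one with the lowest (weighted) MSE."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = float(np.abs(w).max()) + 1e-12
    # Activation-aware weighting: errors on weights that meet large activations count more.
    err_weight = act_scale if act_scale is not None else np.ones_like(w)

    best_scale, best_err = max_abs / qmax, np.inf
    for frac in np.linspace(0.3, 1.0, n_grid):          # progressively tighter clipping
        scale = (frac * max_abs) / qmax
        q = np.clip(np.round(w / scale), -qmax - 1, qmax)
        err = float(np.mean(err_weight * (w - q * scale) ** 2))
        if err < best_err:
            best_scale, best_err = scale, err
    return np.clip(np.round(w / best_scale), -qmax - 1, qmax), best_scale

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(size=128).astype(np.float32)
    act = np.abs(rng.normal(size=128)).astype(np.float32)   # stand-in calibration statistics
    q, s = quantize_group(w, bits=4, act_scale=act)
    print("scale:", s, "reconstruction MSE:", float(np.mean((w - q * s) ** 2)))
```

The point of the weighting term is that plain round-to-nearest treats every weight equally, while a calibration-aware objective spends its limited precision where the forward pass is most sensitive, which is the quality-first behavior the bullet describes.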
// TAGS
llm · inference · open-source · self-hosted · qwen3.6-35b-a3b-jang
DISCOVERED
1h ago
2026-04-17
PUBLISHED
3h ago
2026-04-17
RELEVANCE
8/10
AUTHOR
PiccoloAcceptable922