Multimodal MoE models fail visual reasoning via routing divergence

// 109d ago RESEARCH PAPER

Researchers from Zhejiang University and Alibaba Group reveal that multimodal Mixture-of-Experts models suffer from catastrophic routing divergence in middle layers. The paper demonstrates that while these models correctly perceive images, perceptual signals preemptively hijack cognitive experts, causing reasoning failures.

// ANALYSIS

This paper highlights a fundamental architectural flaw in current multimodal MoE designs: perception overrides cognition instead of collaborating with it.

– Routing divergence occurs in middle layers, preventing deeper cognitive processing
– Models are "Seeing but Not Thinking" because perceptual signals hijack cognitive experts early
– Findings suggest MoE architectures need explicit separation or staging between perception and reasoning layers
– A critical read for AI researchers building next-gen multimodal foundation models
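
To make the idea of per-layer routing divergence concrete, here is a minimal sketch of one way it could be measured. It is not the paper's method: it assumes divergence means the Jensen-Shannon divergence between the expert-usage distribution for text inputs and the one for image inputs at each layer. The function names and the toy routing data are hypothetical.

```python
import math
from collections import Counter


def expert_distribution(expert_ids, num_experts):
    """Convert a list of selected expert indices into a usage distribution."""
    if not expert_ids:
        raise ValueError("expert_ids must not be empty")
    counts = Counter(expert_ids)
    total = len(expert_ids)
    return [counts.get(e, 0) / total for e in range(num_experts)]


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits; 0 means identical, 1 means disjoint."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with x == 0 contribute nothing to the KL sum.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def per_layer_divergence(text_routes, image_routes, num_experts):
    """Compare text-input and image-input expert usage layer by layer."""
    return [
        js_divergence(
            expert_distribution(t, num_experts),
            expert_distribution(i, num_experts),
        )
        for t, i in zip(text_routes, image_routes)
    ]


# Hypothetical toy data: one list of chosen expert indices per layer.
text_routes = [[0, 1, 0, 1], [0, 1, 0, 1], [2, 3, 2, 3]]
image_routes = [[0, 1, 0, 1], [2, 3, 2, 3], [2, 3, 2, 3]]

scores = per_layer_divergence(text_routes, image_routes, num_experts=4)
for layer, score in enumerate(scores):
    print(f"layer {layer}: JS divergence = {score:.3f}")
```

In this toy data the two modalities use the same experts in the first and last layers and disjoint experts in the middle layer, so only the middle layer shows a nonzero score. A real analysis would need routing traces logged from an actual model.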

// TAGS

multimodal, moe, reasoning, research, foundation-models

DISCOVERED

109d ago

2026-04-12

PUBLISHED

109d ago

2026-04-12

RELEVANCE

8/10

AUTHOR

Discover AI
