Kwai-Keye drops 30B multimodal MoE with DSA attention

// 45d ago · MODEL RELEASE

Kuaishou's Keye team released Keye-VL-2.0-30B-A3B, a 30B-parameter multimodal mixture-of-experts (MoE) model with about 3B active parameters that integrates DeepSeek Sparse Attention (DSA). According to the post, the architecture bounds KV cache growth, enabling 256K-token context windows for multi-hour video analysis on consumer hardware.

// ANALYSIS

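Bringing DeepSeek Sparse Attention into a multimodal architecture targets the memory blow-up that makes long-video reasoning prohibitively expensive: every sampled frame adds visual tokens, and with dense attention both the KV cache and the per-token attention cost grow with each one.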
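To make the memory pressure concrete, here is a rough back-of-the-envelope sketch of how a dense-attention KV cache grows with context length. The layer count, KV-head count, head dimension, and 16-bit storage below are hypothetical placeholders chosen for illustration; they are not Keye-VL-2.0's published configuration.

```python
def kv_cache_bytes(tokens, layers, kv_heads, head_dim, bytes_per_value=2):
    """Bytes needed to cache keys and values for `tokens` positions
    in a dense-attention transformer."""
    # Factor 2: one key vector and one value vector per position,
    # per layer, per KV head.
    return 2 * tokens * layers * kv_heads * head_dim * bytes_per_value


# Hypothetical configuration for illustration only (NOT the model's real one).
LAYERS, KV_HEADS, HEAD_DIM = 48, 4, 128

for tokens in (8_192, 65_536, 262_144):
    gib = kv_cache_bytes(tokens, LAYERS, KV_HEADS, HEAD_DIM) / 2**30
    print(f"{tokens:>7} tokens -> {gib:.2f} GiB")
```

Under these assumed numbers the cache costs 96 KiB per token, so a 262,144-token context needs 24 GiB for the cache alone, before any model weights are loaded. The growth is strictly linear in the token count, which is why long video inputs are so costly with dense attention.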

– DSA limits each query to a selected subset of earlier tokens, so per-token attention cost stays roughly fixed instead of growing with context length; the post credits this with avoiding the linear KV cache scaling that normally plagues long-context vision models
– The MoE architecture activates only about 3B of the 30B parameters per token, which cuts compute for local inference, although all 30B weights still have to fit in memory
– Early benchmarks suggest it matches Gemini 1.5 Flash on temporal grounding and outperforms larger open-weight models like Qwen3-VL-235B
– The model introduces the first agent capabilities in the Keye series, supporting visual self-correction and tool use
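The sparse-attention idea in the first bullet can be sketched as generic top-k attention for a single query. This is an illustrative toy, not Keye's or DeepSeek's implementation: `sparse_attention` and its parameters are hypothetical names, and `index_scores` stands in for whatever cheap scoring pass ranks the cached positions, which this post does not describe.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sparse_attention(query, keys, values, index_scores, k):
    """Attend over only the k cached positions with the highest index scores.

    `index_scores` is an assumed input: one relevance score per cached
    position, produced by some cheaper scoring step.
    """
    # Select the k best-scoring positions (ties go to the earlier position),
    # then restore positional order.
    chosen = sorted(range(len(keys)), key=lambda i: (-index_scores[i], i))[:k]
    chosen.sort()

    # Ordinary scaled dot-product attention, restricted to the chosen positions.
    scale = 1.0 / math.sqrt(len(query))
    logits = [dot(query, keys[i]) * scale for i in chosen]
    peak = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(logit - peak) for logit in logits]
    total = sum(weights)

    out = [0.0] * len(values[0])
    for w, i in zip(weights, chosen):
        for d, v in enumerate(values[i]):
            out[d] += (w / total) * v
    return chosen, out


keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
values = [[1.0], [2.0], [3.0], [4.0]]
chosen, out = sparse_attention(
    [1.0, 0.0], keys, values, index_scores=[0.9, 0.1, 0.8, 0.2], k=2
)
print(chosen, out)
```

The softmax runs over `k` positions no matter how many are cached, so the attention step costs the same at 1,000 tokens as at 256K. Note that this toy still stores every key and value; it only restricts which ones each query reads.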

// TAGS

keye-vl-2.0-30b-a3b, llm, multimodal, moe, long-context, open-weights, vision

DISCOVERED

45d ago

2026-05-26

PUBLISHED

45d ago

2026-05-26

RELEVANCE

9/10

AUTHOR

External_Mood4719
