DeepSeek-V4 hits million-token context with MoE efficiency
DeepSeek-AI’s latest MoE release features V4-Pro (1.6T) and V4-Flash (284B) models supporting a 1M-token context length. The architecture uses Hybrid Attention to cut KV cache by 90% and inference FLOPs by 73% compared to V3.2, while setting new open-source records on coding and reasoning benchmarks.
DeepSeek-V4 is a masterclass in efficiency, proving that million-token context can be economically viable through architectural innovation rather than just brute-force compute. Its Hybrid Attention makes long-context inference 10x more memory-efficient than previous generations, while coding performance on LiveCodeBench rivals closed-source giants like Gemini-3.1-Pro. New reasoning modes allow developers to optimize for speed or depth, and the use of the Muon optimizer enables stable training at the 1.6T parameter scale.
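The claimed 90% KV-cache reduction is plausible with back-of-envelope arithmetic if, as in other hybrid designs, most layers use a small sliding window and only a few attend over the full sequence. The sketch below is illustrative only; the layer counts, head dimensions, and window size are assumptions, not published DeepSeek-V4 specs.

```python
# Back-of-envelope sketch of how hybrid attention shrinks the KV cache.
# All architectural numbers are illustrative assumptions, not V4 specs.

def kv_cache_gib(seq_len, n_layers, full_layers, window,
                 n_kv_heads=8, head_dim=128, bytes_per=2):
    """KV-cache size (GiB) when `full_layers` layers cache the whole
    sequence and the rest cache only a sliding window of `window` tokens."""
    per_token = 2 * n_kv_heads * head_dim * bytes_per  # K and V, fp16
    full = full_layers * seq_len * per_token
    local = (n_layers - full_layers) * min(seq_len, window) * per_token
    return (full + local) / 2**30

seq = 1_000_000
dense = kv_cache_gib(seq, n_layers=60, full_layers=60, window=0)
hybrid = kv_cache_gib(seq, n_layers=60, full_layers=4, window=8192)
print(f"dense:  {dense:.1f} GiB")
print(f"hybrid: {hybrid:.1f} GiB  ({1 - hybrid / dense:.0%} smaller)")
```

Under these assumed settings the hybrid layout lands a bit above a 90% reduction at 1M tokens, since the sliding-window layers' cache cost stops growing with sequence length.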
DISCOVERED: 2026-04-24
PUBLISHED: 2026-04-24
AUTHOR: cmrdporcupine