OPEN_SOURCE
REDDIT // 4h ago · MODEL RELEASE
DeepSeek V4 drops 1.6T parameter MoE
DeepSeek releases V4-Pro and V4-Flash models featuring up to 1.6 trillion parameters and a 1M context window. The open-weights release introduces Sparse Attention for 90% memory reduction and undercuts frontier pricing by 7x.
// ANALYSIS
DeepSeek V4 is a major escalation in the open-weights arms race, signaling that 1T+-parameter models are becoming the baseline for frontier performance.
- V4-Pro (1.6T) and V4-Flash (284B) directly challenge GPT-5 and Claude 4 with state-of-the-art agentic capabilities.
- DeepSeek Sparse Attention (DSA) and KV-cache compression enable massive memory savings, making 1T+ models viable for broader inference.
- Pricing at $3.48/M output tokens is roughly 1/7th of competitor costs, likely triggering a price war among major providers.
- Training on Huawei Ascend 950 hardware marks a significant shift toward domestic Chinese hardware for frontier AI.
- SOTA agentic performance on SWE-bench suggests a primary focus on deep coding and complex reasoning tasks.
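To give a sense of why KV-cache memory dominates at a 1M-token context, the sketch below estimates per-request KV-cache size for a dense-attention transformer and applies the claimed ~90% reduction. All model dimensions here (layer count, KV heads, head dim) are illustrative placeholders, not published V4 specs.

```python
# Illustrative KV-cache sizing. The config below is a hypothetical
# GQA transformer, NOT DeepSeek V4's actual architecture.
def kv_cache_bytes(seq_len, n_layers, n_kv_heads, head_dim, bytes_per_elem=2):
    # Keys and values: 2 tensors per layer, each [seq_len, n_kv_heads, head_dim]
    return 2 * n_layers * seq_len * n_kv_heads * head_dim * bytes_per_elem

# Assumed config: 64 layers, 8 KV heads, 128-dim heads, fp16, 1M-token context
dense = kv_cache_bytes(seq_len=1_000_000, n_layers=64, n_kv_heads=8, head_dim=128)
sparse = dense * 0.10  # release claims ~90% memory reduction

print(f"dense : {dense / 2**30:.1f} GiB")   # ~244 GiB per request
print(f"sparse: {sparse / 2**30:.1f} GiB")  # ~24 GiB per request
```

Even with generous assumptions, a dense 1M-token cache runs to hundreds of GiB per request; a 90% cut is the difference between "impractical" and "fits on a node", which is the point of the DSA claim.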
// TAGS
deepseek · llm · open-source · reasoning · ai-coding · inference
DISCOVERED
4h ago
2026-04-25
PUBLISHED
6h ago
2026-04-25
RELEVANCE
10/10
AUTHOR
crowtain