OPEN_SOURCE
REDDIT // 4h ago · MODEL RELEASE
DeepSeek V4 FP4 Demands Datacenter Hardware
This Reddit thread asks the right kind of impractical question: if DeepSeek V4 really is the newly released preview series on Hugging Face, what would it take to run it locally at FP4? Based on the model card, V4-Pro is a 1.6T-parameter MoE model with 49B active params, and V4-Flash is 284B with 13B active; both offer 1M context. That means “local” is technically possible only as a serious hybrid deployment, because the raw FP4 weight footprint alone is enormous before you account for KV cache, runtime overhead, or any offload strategy.
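A quick sanity check on those numbers (a rough sketch only: the 0.5 bytes/param figure assumes pure FP4, so the card’s FP4+FP8 mix will land somewhat above this floor):

```python
# Back-of-envelope FP4 weight footprint. Parameter counts are the
# ones quoted from the Hugging Face model card; everything else is
# rough arithmetic in decimal GB, not a measured number.

FP4_BYTES_PER_PARAM = 0.5  # 4 bits = half a byte

models = {
    "V4-Pro": 1.6e12,   # 1.6T total params (MoE, 49B active)
    "V4-Flash": 284e9,  # 284B total params (13B active)
}

for name, total_params in models.items():
    weights_gb = total_params * FP4_BYTES_PER_PARAM / 1e9
    print(f"{name}: ~{weights_gb:,.0f} GB raw weights before KV cache and overhead")

# V4-Pro: ~800 GB raw weights before KV cache and overhead
# V4-Flash: ~142 GB raw weights before KV cache and overhead
```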
// ANALYSIS
Hot take: if you want DeepSeek V4 on your own hardware, you are shopping for rack-scale infrastructure, not a gaming PC.
- DeepSeek’s Hugging Face model card lists V4-Pro at 1.6T total parameters and V4-Flash at 284B, both as preview releases with FP4+FP8 mixed precision.
- Pure FP4 weight storage is still huge: roughly 800GB of raw weights for V4-Pro and roughly 142GB for V4-Flash before overhead, which already pushes this out of normal workstation territory.
- The only realistic “local” path for V4-Pro is a hybrid setup with multiple high-memory GPUs, very large host RAM, and NVMe spillover; expect performance to fall off fast once you lean on offload (see the budget sketch after this list).
- V4-Flash is the more plausible candidate for enthusiasts, but it still wants enterprise-grade GPU memory if you want anything close to responsive inference.
- If you are paying retail for hardware, the practical budget lands in the five-figure to low six-figure range, depending on whether you want “it runs” or “it runs well.”
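To make the offload cliff concrete, here is a minimal budget sketch for V4-Pro. The GPU count, the 141GB-per-card figure, the host RAM size, and the 15% reserve for KV cache and runtime are all illustrative assumptions, not numbers from the thread or the model card:

```python
# Minimal hybrid-deployment budget sketch for V4-Pro at FP4.
# All hardware figures below are illustrative placeholders.

WEIGHTS_GB = 800      # raw FP4 weights (from the arithmetic above)
GPU_GB_EACH = 141     # assumed HBM per card (H200-class); placeholder
NUM_GPUS = 4          # placeholder
HOST_RAM_GB = 512     # placeholder

gpu_total_gb = GPU_GB_EACH * NUM_GPUS
reserve_gb = 0.15 * gpu_total_gb       # crude KV-cache + runtime reserve
gpu_weight_budget = gpu_total_gb - reserve_gb

# Fill the fastest tier first, spill the rest downward.
on_gpu = min(WEIGHTS_GB, gpu_weight_budget)
on_host = min(WEIGHTS_GB - on_gpu, HOST_RAM_GB)
on_nvme = WEIGHTS_GB - on_gpu - on_host

print(f"GPU-resident weights: {on_gpu:,.0f} GB")
print(f"host RAM offload:     {on_host:,.0f} GB")
print(f"NVMe spillover:       {on_nvme:,.0f} GB")
```

The reason offload hurts so much is bandwidth: HBM moves weights at terabytes per second, host DRAM at a few hundred GB/s, and NVMe at single-digit GB/s, so any expert layers served from the spillover tiers end up gating token throughput.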
// TAGS
deepseek · deepseek-v4 · fp4 · llm · moe · local-llm · gpu · inference · hardware · hybrid-offload
DISCOVERED
4h ago
2026-04-24
PUBLISHED
7h ago
2026-04-24
RELEVANCE
8/10
AUTHOR
DanielusGamer26