OPEN_SOURCE
REDDIT // 4h ago · MODEL RELEASE
DeepSeek V4 FP4 Demands Datacenter Hardware
This Reddit thread asks the right kind of impractical question: if DeepSeek V4 really is the newly released preview series on Hugging Face, what would it take to run it locally at FP4? Based on the model card, V4-Pro is a 1.6T-parameter MoE model with 49B active params, and V4-Flash is 284B with 13B active; both offer 1M context. That means “local” is technically possible only as a serious hybrid deployment, because the raw FP4 weight footprint alone is enormous before you account for KV cache, runtime overhead, or any offload strategy.
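A quick sanity check on those numbers (a rough sketch only: the 0.5 bytes/param figure assumes pure FP4, so the card’s FP4+FP8 mix will land somewhat above this floor):

```python
# Back-of-envelope FP4 weight footprint. Parameter counts are the
# ones quoted from the Hugging Face model card; everything else is
# rough arithmetic in decimal GB, not a measured number.

FP4_BYTES_PER_PARAM = 0.5  # 4 bits = half a byte

models = {
    "V4-Pro": 1.6e12,   # 1.6T total params (MoE, 49B active)
    "V4-Flash": 284e9,  # 284B total params (13B active)
}

for name, total_params in models.items():
    weights_gb = total_params * FP4_BYTES_PER_PARAM / 1e9
    print(f"{name}: ~{weights_gb:,.0f} GB raw weights before KV cache and overhead")

# V4-Pro: ~800 GB raw weights before KV cache and overhead
# V4-Flash: ~142 GB raw weights before KV cache and overhead
```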
// ANALYSIS
Hot take: if you want DeepSeek V4 on your own hardware, you are shopping for rack-scale infrastructure, not a gaming PC.
- DeepSeek’s Hugging Face model card lists V4-Pro at 1.6T total parameters and V4-Flash at 284B, both as preview releases with FP4+FP8 mixed precision.
- Pure FP4 weight storage is still huge: roughly 800GB of raw weights for V4-Pro and roughly 142GB for V4-Flash before overhead, which already pushes this out of normal workstation territory.
- The only realistic “local” path for V4-Pro is a hybrid setup with multiple high-memory GPUs, very large host RAM, and NVMe spillover; expect performance to fall off fast once you lean on offload (see the budget sketch after this list).
- V4-Flash is the more plausible candidate for enthusiasts, but it still wants enterprise-grade GPU memory if you want anything close to responsive inference.
- If you are paying retail for hardware, the practical budget lands in the five-figure to low six-figure range, depending on whether you want “it runs” or “it runs well.”
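To make the offload cliff concrete, here is a minimal budget sketch for V4-Pro. The GPU count, the 141GB-per-card figure, the host RAM size, and the 15% reserve for KV cache and runtime are all illustrative assumptions, not numbers from the thread or the model card:

```python
# Minimal hybrid-deployment budget sketch for V4-Pro at FP4.
# All hardware figures below are illustrative placeholders.

WEIGHTS_GB = 800      # raw FP4 weights (from the arithmetic above)
GPU_GB_EACH = 141     # assumed HBM per card (H200-class); placeholder
NUM_GPUS = 4          # placeholder
HOST_RAM_GB = 512     # placeholder

gpu_total_gb = GPU_GB_EACH * NUM_GPUS
reserve_gb = 0.15 * gpu_total_gb       # crude KV-cache + runtime reserve
gpu_weight_budget = gpu_total_gb - reserve_gb

# Fill the fastest tier first, spill the rest downward.
on_gpu = min(WEIGHTS_GB, gpu_weight_budget)
on_host = min(WEIGHTS_GB - on_gpu, HOST_RAM_GB)
on_nvme = WEIGHTS_GB - on_gpu - on_host

print(f"GPU-resident weights: {on_gpu:,.0f} GB")
print(f"host RAM offload:     {on_host:,.0f} GB")
print(f"NVMe spillover:       {on_nvme:,.0f} GB")
```

The reason offload hurts so much is bandwidth: HBM moves weights at terabytes per second, host DRAM at a few hundred GB/s, and NVMe at single-digit GB/s, so any expert layers served from the spillover tiers end up gating token throughput.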
// TAGS
deepseek · deepseek-v4 · fp4 · llm · moe · local-llm · gpu · inference · hardware · hybrid-offload
DISCOVERED
4h ago
2026-04-24
PUBLISHED
7h ago
2026-04-24
RELEVANCE
8/10
AUTHOR
DanielusGamer26