DenseVault dedupes training checkpoints over WebDAV
REDDIT // 18d ago // OPEN_SOURCE RELEASE

DenseVault is a single-file, zero-dependency Python tool: a write-once-read-many (WORM) archive that uses content-defined chunking, delta encoding, and entropy-aware compression to store versioned files efficiently over WebDAV. The author built it for AI training checkpoints and other large binaries, and reports that one checkpoint set shrank from 9.1 GB to 5.1 GB.

// ANALYSIS

This feels like a genuinely useful MLOps storage layer, not just a compression demo: the big win is keeping checkpoint sprawl mounted and reusable instead of turning it into dead cold storage. Its sweet spot is versioned, partially redundant artifacts; once data is already compressed or needs random access, the gains narrow fast.

  • The 9.1 GB to 5.1 GB checkpoint result is the right benchmark because it matches the exact workload DenseVault targets.
  • WebDAV plus range reads is the killer workflow win: existing tools can mount the vault, and even `llamafile` can stream GGUF models straight from it.
  • Entropy-aware compression is a smart guardrail, and the Arch ISO test shows why: already-compressed blobs barely benefit.
  • Delta mode fits model checkpoints that are read whole, but it is a bad fit for live inference files because reconstruction gets in the way of range reads.
  • The single-file SQLite/WORM design is portable and low-friction, but it will need serious durability and concurrency testing if it grows beyond a thesis project.
// TAGS
densevault · llm · mlops · data-tools · open-source · self-hosted

DISCOVERED

18d ago (2026-03-25)

PUBLISHED

18d ago (2026-03-25)

RELEVANCE

7 / 10

AUTHOR

FiddleSmol