Hugging Face adds Xet-backed Storage Buckets
REDDIT · 31d ago · INFRASTRUCTURE

Hugging Face has introduced Storage Buckets, a new S3-like object storage layer on the Hub for checkpoints, raw data, logs, and other mutable ML artifacts. The feature is backed by Xet deduplication, exposed through the `hf` CLI plus Python and JavaScript SDKs, and is aimed at training and data-heavy workflows that do not belong in Git-based repos.

// ANALYSIS

This is Hugging Face filling a real infrastructure gap in its platform stack, not just publishing another docs page. If Buckets work as advertised, they make the Hub more credible as an end-to-end home for active ML pipelines instead of just final model and dataset artifacts.

  • The core pitch is practical: mutable storage for checkpoints, processed shards, traces, and logs that change too often for versioned repos
  • Xet-backed chunk deduplication is a strong fit for ML workloads where successive checkpoints and dataset variants share lots of bytes
  • Native support in the CLI, Python, JavaScript, and `fsspec` means teams can drop this into existing training and data workflows without much glue code
  • Pre-warming data near AWS and GCP compute hints that Hugging Face wants to compete on performance and workflow convenience, not just storage semantics
  • The bigger strategic move is platform expansion: Hugging Face is turning the Hub into operational infrastructure for AI development, not only a publishing surface
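
To make the deduplication point concrete, here is a minimal sketch of chunk-level deduplication across two successive checkpoints. It is a toy illustration, not Xet's implementation: it uses fixed-size chunks for simplicity, whereas production systems generally pick chunk boundaries from the content itself. `ChunkStore`, `CHUNK_SIZE`, and the checkpoint variables are hypothetical names chosen for this example.

```python
import hashlib
import os

# Illustrative chunk size only; not a documented Xet parameter.
CHUNK_SIZE = 64 * 1024


class ChunkStore:
    """Toy content-addressed store that keeps each unique chunk once."""

    def __init__(self):
        self.chunks = {}  # sha256 hex digest -> chunk bytes

    def put(self, data: bytes) -> tuple[list[str], int]:
        """Store data and return (manifest of chunk hashes, bytes newly stored)."""
        manifest, new_bytes = [], 0
        for start in range(0, len(data), CHUNK_SIZE):
            chunk = data[start:start + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.chunks:
                # Only chunks never seen before consume storage.
                self.chunks[digest] = chunk
                new_bytes += len(chunk)
            manifest.append(digest)
        return manifest, new_bytes

    def get(self, manifest: list[str]) -> bytes:
        """Rebuild the original bytes from a manifest."""
        return b"".join(self.chunks[d] for d in manifest)


store = ChunkStore()

# Hypothetical checkpoint: 16 chunks of random bytes.
checkpoint_1 = os.urandom(16 * CHUNK_SIZE)

# A later checkpoint that differs by a single byte inside chunk 5.
mutated = bytearray(checkpoint_1)
mutated[5 * CHUNK_SIZE] ^= 0xFF
checkpoint_2 = bytes(mutated)

manifest_1, new_1 = store.put(checkpoint_1)
manifest_2, new_2 = store.put(checkpoint_2)

print(f"first upload stored {new_1} bytes, second stored {new_2} bytes")
```

In this sketch the second checkpoint costs one chunk of new storage instead of a full copy, which is the property the bullet above relies on. Fixed-size chunking stops working well once bytes are inserted or removed rather than overwritten, which is why content-defined boundaries are the usual choice in real systems.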
// TAGS
hugging-face-storage-buckets · cloud · data-tools · api · mlops

DISCOVERED

31d ago

2026-03-11

PUBLISHED

31d ago

2026-03-11

RELEVANCE

8/10

AUTHOR

qlhoest