Get started

Quick Start

Run HugBucket with Docker in two commands and verify with the AWS CLI.

Docker Deployment

Full deployment guide with port mapping, Docker Compose, and auth configuration.

S3 Protocol

Complete S3 REST API reference with AWS CLI and boto3 examples.

FTP Protocol

FTP gateway reference with path mapping and passive port configuration.

What is HugBucket?

HugBucket is a self-hosted gateway that exposes Hugging Face Storage Buckets over the standard S3 and FTP protocols. Any tool that speaks S3 (AWS CLI, boto3, rclone, S3 Browser) or FTP can read and write your Hugging Face data directly, without code changes.

Under the hood, HugBucket implements the full Xet content-addressable storage protocol in pure Python: Gearhash content-defined chunking, BLAKE3 hashing, LZ4 compression, and parallel xorb uploads. Data written through HugBucket is fully compatible with the huggingface_hub Python library and the HF web UI.

Key features

S3-compatible API

Full S3 REST API on port 9000. Works with AWS CLI, boto3, rclone, and any S3-compatible client.

FTP gateway

Standard FTP on port 2121 with /<bucket>/<key> path mapping. Useful for legacy tools.
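The /<bucket>/<key> mapping can be pictured with a small sketch. The function name `split_ftp_path`, the `S3Location` type, and the handling of repeated or trailing slashes are assumptions made for illustration; HugBucket's own normalization rules may differ.

```python
from typing import NamedTuple


class S3Location(NamedTuple):
    """Hypothetical result type: a bucket name plus an object key."""
    bucket: str
    key: str  # empty string means "the bucket root"


def split_ftp_path(path: str) -> S3Location:
    """Split an FTP path of the form /<bucket>/<key> into bucket and key."""
    # Drop empty segments so "//a//b" and "/a/b" are treated the same
    # (an assumption of this sketch, not documented HugBucket behavior).
    parts = [p for p in path.split("/") if p]
    if not parts:
        raise ValueError("path does not name a bucket")
    bucket, key_parts = parts[0], parts[1:]
    return S3Location(bucket, "/".join(key_parts))


print(split_ftp_path("/models/llama/config.json"))
```

Under this reading, the first path segment always selects the bucket and everything after it is the object key, so nested FTP directories become key prefixes.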

Native Xet storage

Pure-Python Xet content-addressable storage (CAS): content-defined chunking, BLAKE3 hashing, and LZ4 compression, compatible with huggingface_hub.
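To make content-defined chunking concrete, here is a minimal gear-hash chunker. It is a sketch only: the gear table, the 12-bit mask, and the minimum and maximum chunk sizes are illustrative choices, not the constants used by Xet or HugBucket.

```python
import random

# Illustrative gear table: 256 pseudo-random 64-bit values from a fixed seed.
# The real Xet table and parameters are NOT reproduced here.
_rng = random.Random(0)
GEAR = [_rng.getrandbits(64) for _ in range(256)]
MASK64 = (1 << 64) - 1


def chunk_boundaries(data: bytes, mask_bits: int = 12,
                     min_size: int = 1024, max_size: int = 16384) -> list[int]:
    """Return the end offset of every chunk in `data`."""
    # Test the top bits of the hash: they mix in the most recent 64 bytes.
    mask = ((1 << mask_bits) - 1) << (64 - mask_bits)
    ends, start, h = [], 0, 0
    for i, byte in enumerate(data):
        h = ((h << 1) + GEAR[byte]) & MASK64  # rolling gear hash
        size = i + 1 - start
        if size < min_size:
            continue  # never cut before the minimum chunk size
        if (h & mask) == 0 or size >= max_size:
            ends.append(i + 1)
            start, h = i + 1, 0
    if start < len(data):
        ends.append(len(data))  # final partial chunk
    return ends


def split_chunks(data: bytes) -> list[bytes]:
    """Split `data` into chunks at the boundaries found above."""
    chunks, start = [], 0
    for end in chunk_boundaries(data):
        chunks.append(data[start:end])
        start = end
    return chunks
```

Because a cut depends on the bytes just before it rather than on a fixed offset, an insertion near the start of a file tends to change only the nearby chunks, which is what makes chunk-level deduplication possible.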

Docker-first

Single image. Switch between S3 and FTP by changing the MODE environment variable.

Streaming + range requests

HTTP range requests for GetObject. Xorb chunks that fall outside the requested range are skipped at the CDN level.
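The idea of skipping chunks that a range request does not touch can be sketched as follows. The function `plan_range` and its flat list of chunk sizes are hypothetical; the actual reconstruction plans in HugBucket are not described on this page and are likely richer.

```python
from bisect import bisect_right
from itertools import accumulate


def plan_range(chunk_sizes: list[int], start: int, end: int) -> list[tuple[int, int, int]]:
    """Map an inclusive byte range onto chunks.

    Returns (chunk_index, offset_in_chunk, length) for each chunk that
    overlaps [start, end]; every other chunk can be skipped entirely.
    Assumes all chunk sizes are positive.
    """
    total = sum(chunk_sizes)
    if not 0 <= start <= end < total:
        raise ValueError("range not satisfiable")
    # ends[i] is the file offset one past the last byte of chunk i.
    ends = list(accumulate(chunk_sizes))
    first = bisect_right(ends, start)
    last = bisect_right(ends, end)
    plan = []
    for i in range(first, last + 1):
        chunk_start = ends[i] - chunk_sizes[i]
        lo = max(start, chunk_start) - chunk_start
        hi = min(end, ends[i] - 1) - chunk_start
        plan.append((i, lo, hi - lo + 1))
    return plan


# A request for bytes 120-160 of a file stored as chunks of 100, 50 and 200
# bytes never touches chunk 0.
print(plan_range([100, 50, 200], 120, 160))  # [(1, 20, 30), (2, 0, 11)]
```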

Multipart uploads

S3 multipart upload with idempotent completion and automatic stale-upload cleanup.
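One way to picture idempotent completion and stale-upload cleanup is the in-memory model below. The class `MultipartStore`, its method names, and the one-hour default are all assumptions made for this sketch; it does not describe how HugBucket actually tracks uploads.

```python
import time


class MultipartStore:
    """Toy in-memory model of multipart uploads (hypothetical, for illustration)."""

    def __init__(self, stale_after: float = 3600.0, clock=time.monotonic):
        self._uploads = {}    # upload_id -> {"parts": {number: bytes}, "touched": float}
        self._completed = {}  # upload_id -> assembled object bytes
        self._stale_after = stale_after
        self._clock = clock

    def create(self, upload_id: str) -> None:
        self._uploads[upload_id] = {"parts": {}, "touched": self._clock()}

    def put_part(self, upload_id: str, number: int, data: bytes) -> None:
        upload = self._uploads[upload_id]
        upload["parts"][number] = data  # re-sending a part overwrites it
        upload["touched"] = self._clock()

    def complete(self, upload_id: str) -> bytes:
        # Idempotent: a retried Complete call returns the same result.
        if upload_id in self._completed:
            return self._completed[upload_id]
        upload = self._uploads.pop(upload_id)  # KeyError for unknown uploads
        body = b"".join(upload["parts"][n] for n in sorted(upload["parts"]))
        self._completed[upload_id] = body
        return body

    def sweep(self) -> list[str]:
        """Drop uploads that have been idle longer than `stale_after` seconds."""
        now = self._clock()
        stale = [uid for uid, u in self._uploads.items()
                 if now - u["touched"] > self._stale_after]
        for uid in stale:
            del self._uploads[uid]
        return stale
```

Idempotent completion matters because S3 clients commonly retry the final call after a timeout; without it, the retry would fail even though the object was already assembled.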

Server-side copy

CopyObject reuses the existing content hash — no bytes are re-downloaded or re-uploaded.
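Why a copy needs no data transfer follows from content addressing, as this toy store shows. SHA-256 from the standard library stands in for BLAKE3 here, and the `ContentStore` class is hypothetical; it only illustrates the principle.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store: object keys point at content hashes."""

    def __init__(self):
        self.blobs = {}         # content hash -> bytes
        self.index = {}         # object key -> content hash
        self.bytes_written = 0  # counts bytes actually stored

    def put(self, key: str, data: bytes) -> str:
        # SHA-256 is a stand-in; Xet uses BLAKE3.
        digest = hashlib.sha256(data).hexdigest()
        if digest not in self.blobs:
            self.blobs[digest] = data
            self.bytes_written += len(data)
        self.index[key] = digest
        return digest

    def copy(self, src: str, dst: str) -> str:
        # Only the pointer is duplicated; no blob is read or written.
        digest = self.index[src]
        self.index[dst] = digest
        return digest

    def get(self, key: str) -> bytes:
        return self.blobs[self.index[key]]


store = ContentStore()
store.put("a.bin", b"x" * 1000)
store.copy("a.bin", "b.bin")
print(store.bytes_written)  # 1000: the copy stored no new bytes
```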

Built-in caching

LRU caches for xorb chunks (512 MiB), file metadata, and reconstruction plans.
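A cache bounded by total bytes rather than entry count, such as the 512 MiB chunk cache, could look roughly like this. `ByteBudgetLRU` is a hypothetical name and a simplified design; HugBucket's real caches may be structured differently.

```python
from collections import OrderedDict
from typing import Optional


class ByteBudgetLRU:
    """LRU cache that evicts least-recently-used entries once a byte budget is exceeded."""

    def __init__(self, max_bytes: int):
        self.max_bytes = max_bytes
        self.used = 0
        self._items = OrderedDict()  # key -> bytes, oldest first

    def get(self, key: str) -> Optional[bytes]:
        value = self._items.get(key)
        if value is not None:
            self._items.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key: str, value: bytes) -> None:
        if key in self._items:
            self.used -= len(self._items.pop(key))
        if len(value) > self.max_bytes:
            return  # larger than the whole budget: do not cache
        self._items[key] = value
        self.used += len(value)
        while self.used > self.max_bytes:
            _, evicted = self._items.popitem(last=False)  # drop the oldest entry
            self.used -= len(evicted)


cache = ByteBudgetLRU(max_bytes=512 * 1024 * 1024)  # 512 MiB, as in the feature list
cache.put("chunk-1", b"\x00" * 4096)
print(cache.used)  # 4096
```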