The bridge layer is the concrete implementation of the StorageBackend interface. It translates protocol-agnostic backend calls (like put_object and get_object_stream) into the multi-step sequence of HF Hub API calls and Xet CAS operations required by the Hugging Face storage protocol.
HFStorageBackend (Bridge)
HFStorageBackend in hugbucket/bridge.py is the main class. It is aliased as Bridge for backward-compatible imports:
hugbucket/bridge.py
__post_init__ creates a HubClient and a CASClient and initialises the xorb LRU cache with the configured byte limit.
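The construction pattern can be sketched as follows. This is a minimal illustration, not the project's code: the field names, the default byte limit, and the ByteLimitedLRU helper are hypothetical, and the HubClient and CASClient construction is left out.

```python
from collections import OrderedDict
from dataclasses import dataclass, field
from typing import Optional


class ByteLimitedLRU:
    """Hypothetical LRU cache that evicts old entries once a byte budget is exceeded."""

    def __init__(self, max_bytes: int) -> None:
        self.max_bytes = max_bytes
        self.size = 0
        self._items = OrderedDict()

    def get(self, key: str) -> Optional[bytes]:
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key: str, value: bytes) -> None:
        if key in self._items:
            self.size -= len(self._items.pop(key))
        self._items[key] = value
        self.size += len(value)
        # Evict least recently used entries until the budget is respected.
        while self.size > self.max_bytes and self._items:
            _, evicted = self._items.popitem(last=False)
            self.size -= len(evicted)


@dataclass
class HFStorageBackend:
    token: str  # assumed field
    xorb_cache_max_bytes: int = 64 * 1024 * 1024  # assumed default
    xorb_cache: ByteLimitedLRU = field(init=False)

    def __post_init__(self) -> None:
        # The real class also creates a HubClient and a CASClient here.
        self.xorb_cache = ByteLimitedLRU(self.xorb_cache_max_bytes)


Bridge = HFStorageBackend  # alias kept for backward-compatible imports
```

Because Bridge is a plain name binding, older code that imports Bridge gets the same class object as code that imports HFStorageBackend.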
HubClient
HubClient in hugbucket/hub/client.py is an async HTTP client for the HF Hub Bucket API. It manages a single aiohttp.ClientSession with a configurable connection pool.
XetConnectionInfo carries the CAS URL and short-lived access token returned by the token endpoints:
hugbucket/hub/client.py
The corresponding response headers are X-Xet-Cas-Url, X-Xet-Access-Token, and X-Xet-Token-Expiration.
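As an illustration of how the three X-Xet-* headers could map onto a connection-info object, here is a hedged sketch. The field names, the helper connection_info_from_headers, and the assumption that the expiration is a Unix timestamp in seconds are not taken from the project.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class XetConnectionInfo:
    cas_url: str
    access_token: str
    expiration: int  # assumed: Unix timestamp in seconds


def connection_info_from_headers(headers: Mapping[str, str]) -> XetConnectionInfo:
    """Build connection info from the X-Xet-* response headers."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return XetConnectionInfo(
        cas_url=lowered["x-xet-cas-url"],
        access_token=lowered["x-xet-access-token"],
        expiration=int(lowered["x-xet-token-expiration"]),
    )
```

Since the token is short-lived, a caller would typically compare the expiration against the current time and request a fresh token before it lapses.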
list_buckets and list_bucket_tree follow pagination automatically using the Link: <url>; rel="next" header pattern.
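Link-header pagination of this kind can be sketched as below. The fetch callable is a hypothetical stand-in for an HTTP GET that returns the page items together with the raw Link header, and the regular expression is deliberately simplified (it expects rel="next" to be the first parameter after the URL).

```python
import re
from typing import Callable, Iterator, List, Optional, Tuple

# Matches `<url>; rel="next"` inside a Link header value.
_NEXT_LINK = re.compile(r'<([^>]+)>\s*;\s*rel="next"')


def next_page_url(link_header: Optional[str]) -> Optional[str]:
    """Return the URL of the next page, or None when there is no next link."""
    if not link_header:
        return None
    match = _NEXT_LINK.search(link_header)
    return match.group(1) if match else None


def iter_pages(
    first_url: str,
    fetch: Callable[[str], Tuple[List[dict], Optional[str]]],
) -> Iterator[dict]:
    """Yield items from every page, following rel="next" until it disappears."""
    url: Optional[str] = first_url
    while url:
        items, link_header = fetch(url)
        yield from items
        url = next_page_url(link_header)
```

Callers iterate over a single stream of items and never see page boundaries.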
CASClient
CASClient in hugbucket/xet/cas_client.py handles all communication with the Xet CAS endpoint.
upload_xorb and upload_shard retry on transient errors (HTTP 5xx, connection errors, timeouts) with exponential backoff. The default is 3 retries with a 1-second base delay (doubling each attempt):
hugbucket/config.py
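The retry behaviour described above can be sketched as follows. This is not the contents of hugbucket/config.py or of CASClient: the names with_retries and TransientError are hypothetical, and "3 retries" is assumed to mean three further attempts after the initial one.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Stand-in for HTTP 5xx responses, connection errors, and timeouts."""


async def with_retries(
    op: Callable[[], Awaitable[T]],
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> T:
    """Run op, retrying transient failures with exponentially growing delays."""
    attempt = 0
    while True:
        try:
            return await op()
        except TransientError:
            if attempt >= max_retries:
                raise  # retries exhausted, surface the last error
            # Delays double each attempt: 1 s, 2 s, 4 s with the defaults.
            await sleep(base_delay * 2 ** attempt)
            attempt += 1
```

Passing sleep as a parameter lets tests substitute a no-op coroutine so the backoff schedule can be checked without waiting.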
Namespace resolution
resolve_namespace() is called once at server startup:
hugbucket/bridge.py
whoami() calls GET /api/whoami-v2 and returns data["name"]. The result is stored in Config.hf_namespace and used by _bucket_id for every subsequent operation.
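A rough sketch of the startup flow, under stated assumptions: FakeHubClient stands in for HubClient so the example runs offline, Config is reduced to a single field, and skipping the lookup when a namespace is already configured is a guess rather than documented behaviour.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional


@dataclass
class Config:
    hf_namespace: Optional[str] = None


class FakeHubClient:
    """Stand-in for HubClient; the real whoami() issues GET /api/whoami-v2."""

    async def whoami(self) -> str:
        data = {"name": "alice"}  # shape of the relevant part of the response
        return data["name"]


async def resolve_namespace(config: Config, hub: FakeHubClient) -> str:
    """Fill in Config.hf_namespace once, from the authenticated user's name."""
    if not config.hf_namespace:  # assumption: an explicit setting wins
        config.hf_namespace = await hub.whoami()
    return config.hf_namespace
```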
Bucket ID format
_bucket_id converts a bare bucket name into the {namespace}/{name} format expected by the HF Hub API:
hugbucket/bridge.py
If the name already contains a / (e.g. when targeting an org namespace directly), it is used as-is.
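The rule is small enough to express as a standalone function. This sketch takes the namespace as an argument, whereas the real _bucket_id is described as reading it from Config.hf_namespace.

```python
def bucket_id(name: str, namespace: str) -> str:
    """Return the {namespace}/{name} identifier expected by the HF Hub API."""
    if "/" in name:
        # Already qualified, e.g. "my-org/assets": use it unchanged.
        return name
    return f"{namespace}/{name}"
```

For example, bucket_id("photos", "alice") gives "alice/photos", while bucket_id("my-org/assets", "alice") is returned untouched.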
Directory markers
S3 clients create folders by PUTting a zero-byte object with a trailing slash (e.g. my-folder/). HF Storage Buckets use virtual directories inferred from file paths and reject addFile calls for paths ending with /. HugBucket materialises empty folders by storing a hidden sentinel file inside them:
hugbucket/bridge.py
When put_object receives a trailing-slash key with zero bytes, it rewrites the key to {key}{DIR_MARKER_FILENAME} and replaces the content with DIR_MARKER_CONTENT before running the normal upload path. Delete operations expand the same way, and list_objects filters marker files from the returned contents while still counting them toward common_prefixes so empty folders show up in listings.
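The key rewrite and the listing filter can be sketched over a plain list of paths. The two marker values below are placeholders chosen for illustration (the actual constants are not shown here), and rewrite_put and list_objects are simplified, hypothetical helpers.

```python
from typing import Iterable, List, Tuple

DIR_MARKER_FILENAME = ".hugbucket_keep"  # hypothetical value
DIR_MARKER_CONTENT = b"\n"  # hypothetical value


def rewrite_put(key: str, body: bytes) -> Tuple[str, bytes]:
    """Turn a zero-byte "folder/" PUT into an upload of the sentinel file."""
    if key.endswith("/") and len(body) == 0:
        return key + DIR_MARKER_FILENAME, DIR_MARKER_CONTENT
    return key, body


def list_objects(
    paths: Iterable[str], prefix: str = "", delimiter: str = "/"
) -> Tuple[List[str], List[str]]:
    """Return (contents, common_prefixes), hiding markers from contents."""
    contents: List[str] = []
    common_prefixes = set()
    for path in sorted(paths):
        if not path.startswith(prefix):
            continue
        rest = path[len(prefix):]
        if delimiter in rest:
            # Anything deeper than this level, markers included, counts as a folder.
            common_prefixes.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        elif rest != DIR_MARKER_FILENAME:
            contents.append(path)
    return contents, sorted(common_prefixes)
```

With this approach an empty folder still appears in common_prefixes of its parent, yet listing the folder itself returns no objects.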
head_directory checks for the marker first (fast path), then falls back to listing objects under the prefix if no marker exists:
hugbucket/bridge.py
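The two-step check can be illustrated over an in-memory set of paths. This is a sketch, not the code from hugbucket/bridge.py: the marker name is a placeholder, and the real method queries the Hub rather than a local set.

```python
from typing import Set

DIR_MARKER_FILENAME = ".hugbucket_keep"  # hypothetical value


def head_directory(paths: Set[str], prefix: str) -> bool:
    """Report whether a folder exists, either explicitly or implicitly."""
    if not prefix.endswith("/"):
        prefix += "/"
    # Fast path: a single lookup for the sentinel file.
    if prefix + DIR_MARKER_FILENAME in paths:
        return True
    # Fallback: the folder exists implicitly if any object lives under it.
    return any(path.startswith(prefix) for path in paths)
```

The fallback matters for folders that were never created explicitly and exist only because files were uploaded beneath them.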
Server-side copy
Because Xet uses content-addressable storage, copy_object does not re-download or re-upload any data. It reads the source file’s xet_hash from the file info cache and registers the destination path with the same hash via the Hub batch API:
hugbucket/bridge.py
A server-side copy between two buckets owned by the same namespace is a metadata-only operation. No bytes are transferred from or to the CAS.
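A toy model makes the metadata-only nature of the copy concrete. FakeBucket and FileInfo are hypothetical stand-ins for the Hub batch API and the file info cache, and the size field is an assumption.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class FileInfo:
    path: str
    xet_hash: str
    size: int  # assumed field


class FakeBucket:
    """In-memory stand-in: a bucket is just a mapping of path to file metadata."""

    def __init__(self) -> None:
        self.files: Dict[str, FileInfo] = {}

    def add_file(self, path: str, xet_hash: str, size: int) -> None:
        # Registers metadata only; content stays in the CAS under xet_hash.
        self.files[path] = FileInfo(path, xet_hash, size)


def copy_object(src: FakeBucket, src_key: str, dst: FakeBucket, dst_key: str) -> FileInfo:
    """Copy by pointing the destination path at the source's content hash."""
    info = src.files[src_key]
    dst.add_file(dst_key, info.xet_hash, info.size)
    return dst.files[dst_key]
```

Both paths end up referring to the same xet_hash, so the cost of the copy does not depend on the object's size.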
Cache invalidation
After any mutation (put_object, delete_object, delete_objects, or copy_object), the file info cache entry for the affected key is immediately evicted:
hugbucket/bridge.py
This makes the change visible to the next head_object or get_object_stream call while keeping the 30-second TTL in place for read-heavy workloads.
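The interplay between the TTL and explicit eviction can be sketched with a small cache class. FileInfoCache and its method names are hypothetical, and the injectable clock exists only so expiry can be exercised without sleeping.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class FileInfoCache:
    """TTL cache whose entries can also be evicted explicitly after a write."""

    def __init__(self, ttl: float = 30.0, clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at >= self.ttl:
            del self._entries[key]  # expired, drop and report a miss
            return None
        return value

    def set(self, key: str, value: Any) -> None:
        self._entries[key] = (self._clock(), value)

    def invalidate(self, key: str) -> None:
        # Called after put/delete/copy so the next read refetches fresh metadata.
        self._entries.pop(key, None)
```

Reads that repeat within 30 seconds are served from memory, while a write to the same key forces the following read to go back to the Hub.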