[v0.27.0] DDUF tooling, torch model loading helpers & multiple quality of life improvements and bug fixes
📦 Introducing DDUF tooling
DDUF (DDUF's Diffusion Unified Format) is a single-file format for diffusion models that aims to unify the different model distribution methods and weight-saving formats by packaging all model components into a single file. Detailed documentation for the format is coming soon.
The huggingface_hub library now provides tooling to handle DDUF files in Python. It includes helpers to read and export DDUF files, and built-in rules to validate file integrity.
How to write a DDUF file?
>>> from huggingface_hub import export_folder_as_dduf
# Export "path/to/FLUX.1-dev" folder as a DDUF file
>>> export_folder_as_dduf("FLUX.1-dev.dduf", folder_path="path/to/FLUX.1-dev")
How to read a DDUF file?
>>> import json
>>> import safetensors.torch
>>> from huggingface_hub import read_dduf_file
# Read DDUF metadata (only metadata is loaded, lightweight operation)
>>> dduf_entries = read_dduf_file("FLUX.1-dev.dduf")
# Returns a mapping filename <> DDUFEntry
>>> dduf_entries["model_index.json"]
DDUFEntry(filename='model_index.json', offset=66, length=587)
# Load the `model_index.json` content
>>> json.loads(dduf_entries["model_index.json"].read_text())
{'_class_name': 'FluxPipeline', '_diffusers_version': '0.32.0.dev0', '_name_or_path': 'black-forest-labs/FLUX.1-dev', 'scheduler': ['diffusers', 'FlowMatchEulerDiscreteScheduler'], 'text_encoder': ['transformers', 'CLIPTextModel'], 'text_encoder_2': ['transformers', 'T5EncoderModel'], 'tokenizer': ['transformers', 'CLIPTokenizer'], 'tokenizer_2': ['transformers', 'T5TokenizerFast'], 'transformer': ['diffusers', 'FluxTransformer2DModel'], 'vae': ['diffusers', 'AutoencoderKL']}
# Load VAE weights using safetensors
>>> with dduf_entries["vae/diffusion_pytorch_model.safetensors"].as_mmap() as mm:
...     state_dict = safetensors.torch.load(mm)
👉 More details about the API in the documentation here.
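The offset and length fields shown in DDUFEntry suggest that each entry is stored as a plain byte range inside the single file. The sketch below illustrates that idea with the standard library only. It assumes the container is an uncompressed ZIP archive (as the DDUF documentation describes); the function find_stored_entries and the toy file contents are hypothetical and are not part of huggingface_hub.

```python
import io
import json
import struct
import zipfile


def find_stored_entries(archive_bytes: bytes) -> dict:
    """Map each member name to the (offset, length) of its raw bytes."""
    entries = {}
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        for info in zf.infolist():
            if info.compress_type != zipfile.ZIP_STORED:
                raise ValueError(f"{info.filename} is compressed")
            # A ZIP local file header is 30 bytes, followed by the file name
            # and an optional extra field; the payload starts right after.
            header = archive_bytes[info.header_offset : info.header_offset + 30]
            name_len, extra_len = struct.unpack("<HH", header[26:30])
            offset = info.header_offset + 30 + name_len + extra_len
            entries[info.filename] = (offset, info.file_size)
    return entries


# Build a tiny uncompressed archive in memory (toy content, not a real pipeline).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_STORED) as zf:
    zf.writestr("model_index.json", json.dumps({"_class_name": "ToyPipeline"}))
    zf.writestr("vae/config.json", json.dumps({"latent_channels": 4}))
archive = buffer.getvalue()

# Read one entry back by slicing the archive, without extracting anything.
entries = find_stored_entries(archive)
offset, length = entries["model_index.json"]
model_index = json.loads(archive[offset : offset + length].decode("utf-8"))
print(entries["model_index.json"], model_index)
```

Because nothing is compressed, an entry can be read by slicing (or memory-mapping) the file directly, which is why listing entries can stay lightweight even for large weight files.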
💾 Serialization
Following the introduction of the torch serialization module in 0.22.* and support for saving torch state dicts to disk in 0.24.*, we now provide helpers to load torch state dicts from disk.
By centralizing these functionalities in huggingface_hub, we ensure a consistent implementation across the HF ecosystem while allowing external libraries to benefit from standardized weight handling.
>>> from huggingface_hub import load_torch_model, load_state_dict_from_file
# load state dict from a single file
>>> state_dict = load_state_dict_from_file("path/to/weights.safetensors")
# Directly load weights into a PyTorch model
>>> model = ... # A PyTorch model
>>> load_torch_model(model, "path/to/checkpoint")
More details in the serialization package reference.
[Serialization] support loading torch state dict from disk by @hanouticelina in #2687
We added a flag to the save_torch_state_dict() helper to properly handle model saving in distributed environments, aligning with existing implementations across the Hugging Face ecosystem:
[Serialization] Add is_main_process argument to save_torch_state_dict() by @hanouticelina in #2648
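As a rough illustration of how the flag might be wired into a distributed script, the sketch below derives is_main_process from the RANK environment variable. It assumes a launcher such as torchrun sets RANK; the helpers is_rank_zero and save_from_every_rank are hypothetical names, not part of huggingface_hub.

```python
import os


def is_rank_zero() -> bool:
    """Return True on the process whose RANK is 0 (or when RANK is unset)."""
    return int(os.environ.get("RANK", "0")) == 0


def save_from_every_rank(state_dict, save_directory: str) -> None:
    """Meant to be called from all ranks; only rank 0 is flagged as main."""
    # Imported lazily so the rank helper above can be used (and tested)
    # without huggingface_hub and torch installed.
    from huggingface_hub import save_torch_state_dict

    save_torch_state_dict(
        state_dict,
        save_directory,
        is_main_process=is_rank_zero(),
    )
```

In a single-process run RANK is normally unset, so is_rank_zero() returns True, which matches passing is_main_process=True.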
A bug with shared tensor handling reported in transformers#35080 has been fixed:
add argument to pass shared tensors keys to discard by @hanouticelina in #2696
✨ HfApi
The following changes align the client with server-side updates in how security metadata is handled and exposed in the API responses. In particular, the repository security status returned by HfApi().model_info() is now available in the security_repo_status field:
from huggingface_hub import HfApi
api = HfApi()
model = api.model_info("your_model_id", securityStatus=True)
# get security status info of your model
- security_info = model.securityStatus
+ security_info = model.security_repo_status
- Update how file's security metadata is retrieved following changes in the API response by @hanouticelina in #2621
- Expose repo security status field in ModelInfo by @hanouticelina in #2639
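Code that has to run against both older and newer versions of the library could read the field defensively. The helper below is a minimal sketch of that idea: read_security_status is a hypothetical name, and the SimpleNamespace objects with placeholder payloads merely stand in for the object returned by model_info().

```python
from types import SimpleNamespace
from typing import Any, Optional


def read_security_status(model_info: Any) -> Optional[Any]:
    """Prefer the new field, fall back to the legacy attribute if present."""
    status = getattr(model_info, "security_repo_status", None)
    if status is not None:
        return status
    return getattr(model_info, "securityStatus", None)


# Stand-ins for model_info() results; the payloads are placeholders only.
new_style = SimpleNamespace(security_repo_status={"source": "new field"})
old_style = SimpleNamespace(securityStatus={"source": "legacy field"})
no_status = SimpleNamespace()

print(read_security_status(new_style))
print(read_security_status(old_style))
print(read_security_status(no_status))
```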
🌐 📚 Documentation
Thanks to @miaowumiaomiaowu, more documentation is now available in Chinese! And thanks @13579606 for reviewing these PRs. Check out the result here.
📝 Translating docs to Simplified Chinese by @miaowumiaomiaowu in #2689, #2704 and #2705.
💔 Breaking changes
A few breaking changes have been introduced:
- RepoCardData serialization now preserves None values in nested structures.
- InferenceClient.image_to_image() now takes a target_size argument instead of height and width arguments. This has been reflected in the InferenceClient async equivalent as well.
- InferenceClient.table_question_answering() no longer accepts a parameter argument. This has been reflected in the InferenceClient async equivalent as well.
- Due to low usage, list_metrics() has been removed from HfApi.
⏳ Deprecations
Some deprecations have been introduced as well:
- Legacy token permission checks are deprecated as they are no longer relevant with fine-grained tokens. This includes is_write_action in build_hf_headers() and write_permission=True in login methods. get_token_permission has been deprecated as well.
- The labels argument is deprecated in InferenceClient.zero_shot_classification() and InferenceClient.zero_shot_image_classification(). This has been reflected in the InferenceClient async equivalent as well.
🛠️ Small fixes and maintenance
😌 QoL improvements
- Add utf8 encoding to read_text to avoid Windows charmap crash by @tomaarsen in #2627
- Add user CLI unit tests by @hanouticelina in #2628
- Update consistent error message (we can't do much about it) by @Wauplin in #2641
- Warn about upload_large_folder if really large folder by @Wauplin in #2656
- Support context manager in commit scheduler by @Wauplin in #2670
- Fix autocompletion not working with ModelHubMixin by @Wauplin in #2695
- Enable tqdm progress in cloud environments by @cbensimon in #2698
🐛 Bug and typo fixes
- bugfix huggingface-cli command execution in python3.8 by @PineApple777 in #2620
- Fix documentation link formatting in README_cn by @BrickYo in #2615
- Update hf_file_system.md by @SwayStar123 in #2616
- Fix download local dir edge case (remove lru_cache) by @Wauplin in #2629
- Fix typos by @omahs in #2634
- Fix ModelCardData's datasets typing by @hanouticelina in #2644
- Fix HfFileSystem.exists() for deleted repos and update documentation by @hanouticelina in #2643
- Fix max tokens default value in text generation and chat completion by @hanouticelina in #2653
- Fix sorting properties by @hanouticelina in #2655
- Don't write the ref file unless necessary by @d8ahazard in #2657
- update attribute used in delete_collection_item docstring by @davanstrien in #2659
- 🐛: Fix bug by ignoring specific files in cache manager by @johnmai-dev in #2660
- Bug in model_card_consistency_reminder.yml by @deanwampler in #2661
- [Inference Client] fix zero_shot_image_classification's parameters by @hanouticelina in #2665
- Use asyncio.sleep in AsyncInferenceClient (not time.sleep) by @Wauplin in #2674
- Make sure create_repo respect organization privacy settings by @Wauplin in #2679
- Fix timestamp parsing to always include milliseconds by @hanouticelina in #2683
- will be used by @julien-c in #2701
- remove context manager when loading shards and handle mlx weights by @hanouticelina in #2709
🏗️ internal
- prepare for release v0.27 by @hanouticelina in #2622
- Support python 3.13 by @hanouticelina in #2636
- Add CI to auto-generate inference types by @Wauplin in #2600
- [InferenceClient] Automatically handle outdated task parameters by @hanouticelina in #2633
- Fix logo in README when dark mode is on by @hanouticelina in #2669
- Fix lint after ruff update by @Wauplin in #2680
- Fix test_list_spaces_linked by @Wauplin in #2707