[Upload models] Add docs for mixin (huggingface#1196)

NielsRogge · pcuenca · julien-c · web-flow · commit c72daf0ca9ba · 2024-02-06T16:35:49.000+01:00
* Add docs

* Add formatting

* Apply suggestions from code review

Co-authored-by: Pedro Cuenca &lt;pedro@huggingface.co&gt;

* More improvements

* Update docs

* Address comments

* Add link to upload files

* Apply suggestions from code review

Co-authored-by: Julien Chaumond &lt;julien@huggingface.co&gt;

* web interface first

---------

Co-authored-by: Pedro Cuenca &lt;pedro@huggingface.co&gt;
Co-authored-by: Julien Chaumond &lt;julien@huggingface.co&gt;
Co-authored-by: Wauplin &lt;lucainp@gmail.com&gt;
diff --git a/docs/hub/models-uploading.md b/docs/hub/models-uploading.md
@@ -4,7 +4,13 @@ To upload models to the Hub, you'll need to create an account at [Hugging Face](
 
 You can link repositories with an individual user, such as [osanseviero/fashion_brands_patterns](https://huggingface.co/osanseviero/fashion_brands_patterns), or with an organization, such as [facebook/bart-large-xsum](https://huggingface.co/facebook/bart-large-xsum). Organizations can collect models related to a company, community, or library! If you choose an organization, the model will be featured on the organization’s page, and every member of the organization will have the ability to contribute to the repository. You can create a new organization [here](https://huggingface.co/organizations/new).
 
-There are several ways to upload models to the Hub, described below. We suggest adding a [Model Card](./model-cards) to your repo to document your model.
+There are several ways to upload models to the Hub, described below.
+
+- In case your model comes from a library that has [built-in support](#upload-from-a-library-with-built-in-support), one can use the existing methods.
+- In case your model is a custom PyTorch model, the recommended way is to leverage the [huggingface_hub](#using-the-huggingface_hub-client-library) Python library as it allows to add `from_pretrained`, `push_to_hub` and [automated download metrics](https://huggingface.co/docs/hub/models-download-stats) capabilities to your models, just like models in the Transformers, Diffusers and Timm libraries.
+- In addition to programmatic uploads, you can always use the [web interface](#using-the-web-interface).
+
+Once your model is uploaded, we suggest adding a [Model Card](./model-cards) to your repo to document your model.
 
 ## Using the web interface
 
@@ -66,10 +72,76 @@ Any repository that contains TensorBoard traces (filenames that contain `tfevent
 
 Models trained with 🤗 Transformers will generate [TensorBoard traces](https://huggingface.co/docs/transformers/main_classes/callback#transformers.integrations.TensorBoardCallback) by default if [`tensorboard`](https://pypi.org/project/tensorboard/) is installed.
 
-## Using Git
 
-Since model repos are just Git repositories, you can use Git to push your model files to the Hub. Follow the guide on [Getting Started with Repositories](repositories-getting-started) to learn about using the `git` CLI to commit and push your models.
+## Upload from a library with built-in support
+
+First check if your model is from a library that has built-in support to push to/load from the Hub, like Transformers, Diffusers, Timm, Asteroid, etc.: https://huggingface.co/docs/hub/models-libraries. Below we'll show how easy this is for a library like Transformers:
+
+```python
+from transformers import BertConfig, BertModel
+
+config = BertConfig()
+model = BertModel(config)
+
+model.push_to_hub("nielsr/my-awesome-bert-model")
+
+# reload
+model = BertModel.from_pretrained("nielsr/my-awesome-bert-model")
+```
+
+## Upload a PyTorch model using `huggingface_hub`
+
+In case your model is a (custom) PyTorch model, you can leverage the `PyTorchModelHubMixin` [class](https://huggingface.co/docs/huggingface_hub/package_reference/mixins#huggingface_hub.PyTorchModelHubMixin) available in the [huggingface_hub](https://github.com/huggingface/huggingface_hub) Python library. It is a minimal class which adds `from_pretrained` and `push_to_hub` capabilities to any `nn.Module`, along with download metrics.
+
+Here is how to use it (assuming you have run `pip install huggingface_hub`):
+
+```python
+import torch
+import torch.nn as nn
+from huggingface_hub import PyTorchModelHubMixin
+
+
+class MyModel(nn.Module, PyTorchModelHubMixin):
+    def __init__(self, config: dict):
+        super().__init__()
+        self.param = nn.Parameter(torch.rand(config["num_channels"], config["hidden_size"]))
+        self.linear = nn.Linear(config["hidden_size"], config["num_classes"])
 
-## Using the `huggingface_hub` client library
+    def forward(self, x):
+        return self.linear(x + self.param)
+
+# create model
+config = {"num_channels": 3, "hidden_size": 32, "num_classes": 10}
+model = MyModel(config=config)
+
+# save locally
+model.save_pretrained("my-awesome-model", config=config)
+
+# push to the hub
+model.push_to_hub("my-awesome-model", config=config)
+
+# reload
+model = MyModel.from_pretrained("username/my-awesome-model")
+```
+
+As can be seen, the only thing required is to define all hyperparameters regarding the model architecture (such as hidden size, number of classes, dropout probability, etc.) in a Python dictionary often called the `config`. Next, you can define a class which takes the `config` as keyword argument in its init.
+
+This comes with automated download metrics, meaning that you'll be able to see how many times the model is downloaded, the same way they are available for models integrated natively in the Transformers, Diffusers or Timm libraries. With this mixin class, each separate checkpoint is stored on the Hub in a single repository consisting of 2 files:
+
+- a `pytorch_model.bin` or `model.safetensors` file containing the weights
+- a `config.json` file which is a serialized version of the model configuration. This class is used for counting download metrics: everytime a user calls `from_pretrained` to load a `config.json`, the count goes up by one. See [this guide](https://huggingface.co/docs/hub/models-download-stats) regarding automated download metrics.
+
+It's recommended to add a model card to each checkpoint so that people can read what the model is about, have a link to the paper, etc.
+
+<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/huggingface_hub/mixin_example_bis.png"
+alt="drawing" width="600"/>
+
+<small> Example [repository](https://huggingface.co/LiheYoung/depth_anything_vitl14) that leverages PyTorchModelHubMixin. Downloads are shown on the right.</small>
+
+Visit [the huggingface_hub's documentation](https://huggingface.co/docs/huggingface_hub/guides/integrations) to learn more.
+
+Alternatively, one can also simply programmatically upload files or folders to the hub: https://huggingface.co/docs/huggingface_hub/guides/upload.
+
+## Using Git
 
-The rich feature set in the `huggingface_hub` library allows you to manage repositories, including creating repos and uploading models to the Model Hub. Visit [the client library's documentation](https://huggingface.co/docs/huggingface_hub/index) to learn more.
+Finally, since model repos are just Git repositories, you can also use Git to push your model files to the Hub. Follow the guide on [Getting Started with Repositories](repositories-getting-started#adding-files-to-a-repository-terminalterminal) to learn about using the `git` CLI to commit and push your models.