Description
🚀 The feature, motivation and pitch
There are many models on Hugging Face that are published as `safetensors` checkpoints rather than `model.pth` checkpoints. The request here is to support converting and loading those checkpoints into a format that is usable with torchchat.
There are several places where this limitation is currently enforced:

- `_download_hf_snapshot` explicitly ignores `safetensors` files.
- `convert_hf_checkpoint` explicitly looks for `pytorch_model.bin.index.json`, which would be named differently for models that use `safetensors` (e.g. `model.safetensors.index.json`).
- `convert_hf_checkpoint` only supports `torch.load` to load the `state_dict`, rather than `safetensors.torch.load`.
Alternatives
Currently, this `safetensors` -> `model.pth` conversion can be accomplished manually after downloading a model locally, so this could be solved with documentation instead of code.
Additional context
This issue is a piece of the puzzle for adding support for Granite Code 3b/8b, which use the `llama` architecture in `transformers` but take advantage of several pieces of the architecture that are not currently supported by torchchat. The work-in-progress for Granite Code can be found on my fork: https://github.com/gabe-l-hart/torchchat/tree/GraniteCodeSupport
RFC (Optional)
I have a working implementation to support `safetensors` during download and conversion that I plan to submit as a PR. The changes address the three points in code referenced above:

- Allow the download of `safetensors` files in `_download_hf_snapshot`
  - I'm not yet sure how to avoid double-downloading weights for models that have both `safetensors` and `model.pth`, so I will look to solve this before concluding the work
- When looking for the tensor index file, search for all files ending in `.index.json`, and if a single file is found, use that one
- When loading the `state_dict`, use the correct method based on the type of file (`torch.load` or `safetensors.torch.load`)