Description
Today, users have to do manual conversions between `.pth` and `.safetensors` formats before/after fine-tuning with torchtune.
Example 1: torchtitan -> torchtune -> HF transformers. torchtitan outputs `.dcp`, which can be converted to `.pt`, but downstream consumers like HF transformers may expect or prefer `.safetensors`.
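For reference, the `.dcp` -> `.pt` step in Example 1 can already be done with PyTorch's distributed checkpoint (DCP) format utilities. A minimal sketch, with placeholder paths (this is standalone PyTorch tooling, not part of any torchtune recipe):

```python
# Minimal sketch: consolidate a DCP checkpoint directory into a single
# torch.save file. Paths are placeholders.
from torch.distributed.checkpoint.format_utils import dcp_to_torch_save

dcp_to_torch_save("path/to/dcp_checkpoint_dir", "path/to/checkpoint.pt")
```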
Example 2: torchtune -> executorch. Models on the HF hub are often in `.safetensors` format, but executorch expects `.pth`.
For Example 2, @jainapurva had to do the following conversion manually:
```python
import torch

from torchtune.training import FullModelHFCheckpointer
from torchtune.models import convert_weights

# Load the fine-tuned checkpoint from HF (.safetensors) format
checkpointer = FullModelHFCheckpointer(
    checkpoint_dir='/home/appy/checkpoints/Llama3.1-8B_oasst1_qat/epoch_0',
    checkpoint_files=[...],
    output_dir='/home/appy/checkpoints/Llama3.1-8B_oasst1_qat/epoch_0',
    model_type='LLAMA3',
)
sd = checkpointer.load_checkpoint()

# Convert the state dict from torchtune to Meta format and save it as .pth
sd = convert_weights.tune_to_meta(sd['model'])
torch.save(sd, "/home/appy/checkpoints/Llama3.1-8B_oasst1_qat/epoch_0/checkpoint.pth")
```
Proposal: torchtune recipes should support different input and output formats in the same recipe. This would improve end-to-end UX for users who want to interoperate with other frameworks: the recipe would perform the conversion for them, so they would no longer need the extra manual step shown above. Providing better support for these end-to-end flows is also the general direction torchao is moving in.
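As a rough illustration of what format-aware checkpoint I/O inside a recipe could look like, here is a sketch. The function names and the `fmt` parameter are assumptions for illustration, not an existing torchtune API:

```python
# Hypothetical sketch of format-aware state-dict I/O inside a recipe.
# `load_state_dict`/`save_state_dict` and the `fmt` parameter are
# illustrative assumptions, not part of the current torchtune API.
import torch
from safetensors.torch import load_file, save_file


def load_state_dict(path: str, fmt: str) -> dict:
    # Load weights from either .safetensors or .pth/.pt files.
    if fmt == "safetensors":
        return load_file(path)
    return torch.load(path, map_location="cpu", weights_only=True)


def save_state_dict(sd: dict, path: str, fmt: str) -> None:
    # Save weights in the format the downstream consumer expects.
    if fmt == "safetensors":
        save_file(sd, path)
    else:
        torch.save(sd, path)
```

With something like this wired into the checkpointer config, a recipe could read `.safetensors` in and write `.pth` out (or vice versa) without a separate conversion script.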
Related torchtitan issue: pytorch/torchtitan#1177
Related executorch issue: pytorch/executorch#3303