[Bugfix] for saving quantized models trained using fsdp #2183

rahul-tuli · 2024-03-15T01:27:20Z

Bugfix for Saving Quantized Models Trained Using FSDP

This description details a bugfix for an issue encountered when loading quantized models that were trained using Fully Sharded Data Parallel (FSDP).

The issue originated in the quantization process, during which we updated layer names and saved the model with our custom implementation instead of accelerate. This produced an incorrectly saved state_dict, because each tensor carried device information. Consequently, the model could not be loaded with the transformers AutoModel.from_pretrained(...) method.

The modifications in this update address these complications, ensuring that quantized models trained using FSDP can now be saved and loaded correctly.

Changes

Asana Ticket
Relevant code modifications have been made to fix the saved state dicts for quantized models trained with FSDP.

The solution adds a post-processing step in which we explicitly iterate through the state_dict and move each tensor to the CPU. The corrected state_dict then overwrites the previous, faulty one.
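
The post-processing step can be sketched roughly as follows. This is an illustration only, not the code merged in this PR: the helper name `state_dict_to_cpu` is hypothetical, and it assumes each tensor in the state_dict exposes a `.cpu()` method, as `torch.Tensor` does. Values without that method are passed through unchanged.

```python
from collections import OrderedDict
from typing import Any, Mapping


def state_dict_to_cpu(state_dict: Mapping[str, Any]) -> "OrderedDict[str, Any]":
    """Return a copy of state_dict with every tensor-like value moved to CPU.

    Hypothetical helper for illustration, not the implementation from this PR.
    Values that expose a ``.cpu()`` method (as torch.Tensor does) are moved;
    anything else is kept as-is.
    """
    cpu_state_dict = OrderedDict()
    for name, value in state_dict.items():
        # Moving to CPU drops the device placement the tensor carried
        # after FSDP training.
        if hasattr(value, "cpu"):
            value = value.cpu()
        cpu_state_dict[name] = value
    return cpu_state_dict
```

Under these assumptions, the returned dictionary would replace the faulty state_dict before it is written to disk, keeping the key order and the updated layer names intact.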

Testing

The saved quantized models can now be loaded by SparseAutoModel.from_pretrained(...). This has been verified manually. The test commands in the ticket work as expected.

Benefits

This fix improves compatibility when saving quantized models trained using FSDP.

rahul-tuli force-pushed the potential-fsdp-quantized-model-save-fix branch from 8fe1701 to 65a4c07 Compare March 15, 2024 15:36

Potential fix for saving quantized models trained using fsdp

cc05f07

rahul-tuli force-pushed the potential-fsdp-quantized-model-save-fix branch from 65a4c07 to cc05f07 Compare March 18, 2024 15:05

rahul-tuli mentioned this pull request Mar 18, 2024

[Cherry pick] quantized fsdp model loading #2186

Merged

rahul-tuli marked this pull request as ready for review March 18, 2024 15:14

rahul-tuli requested review from Satrat, bfineran, dsikka, horheynm and dbogunowicz March 18, 2024 15:14

rahul-tuli self-assigned this Mar 18, 2024

rahul-tuli added the bug Something isn't working label Mar 18, 2024

rahul-tuli changed the title ~~Potential fix for saving quantized models trained using fsdp~~ [Bugfix] for saving quantized models trained using fsdp Mar 18, 2024

Satrat approved these changes Mar 18, 2024

bfineran approved these changes Mar 18, 2024

Satrat merged commit 1fd86c2 into main Mar 18, 2024
13 of 14 checks passed

Satrat deleted the potential-fsdp-quantized-model-save-fix branch March 18, 2024 18:00
