Commit
Inconsistency in PreTrainedModel.resize_token_embeddings When ZeRO3 Is Enabled (#25394)

* Inconsistency in PreTrainedModel.resize_token_embeddings

This PR addresses #25241. In the previous implementation, when ZeRO stage 3 was enabled, resize_token_embeddings would create independent PyTorch weights on each device. Here we ensure that new embeddings are created with DeepSpeed init and are properly partitioned across devices.

* formatting with black

* adding the removed comments back in

---------

Co-authored-by: Sina Moeini <smoeini@amazon.com>
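
The approach described in the message can be sketched roughly as follows. This is an illustration only, not the code from this commit: `resize_embedding`, its `zero3` flag, and its `ds_config` argument are hypothetical names. It assumes `deepspeed.zero.Init` is used to create the new embedding already partitioned, and `deepspeed.zero.GatheredParameters` to temporarily gather both weights while the old rows are copied over. Without ZeRO stage 3 both contexts fall back to no-ops.

```python
import contextlib

import torch
from torch import nn


def resize_embedding(old, new_num_tokens, zero3=False, ds_config=None):
    """Hypothetical helper: build a resized copy of `old` (an nn.Embedding).

    `zero3` and `ds_config` are illustrative parameters, not the library's API.
    """
    if zero3:
        # Imported lazily so the non-DeepSpeed path has no extra dependency.
        import deepspeed

        # Parameters created inside zero.Init are partitioned across devices.
        init_ctx = deepspeed.zero.Init(config_dict_or_path=ds_config)
    else:
        init_ctx = contextlib.nullcontext()

    with init_ctx:
        new = nn.Embedding(new_num_tokens, old.embedding_dim)

    if zero3:
        # Gather the full weights for the copy; rank 0's changes are kept
        # and the parameters are re-partitioned when the context exits.
        gather_ctx = deepspeed.zero.GatheredParameters(
            [old.weight, new.weight], modifier_rank=0
        )
    else:
        gather_ctx = contextlib.nullcontext()

    # Copy as many rows as both tables share (handles growing and shrinking).
    n = min(old.num_embeddings, new_num_tokens)
    with gather_ctx:
        new.weight.data[:n, :] = old.weight.data[:n, :]
    return new
```

On a single process without DeepSpeed, `resize_embedding(nn.Embedding(10, 4), 12)` returns a 12-row embedding whose first 10 rows match the original. The ZeRO stage 3 path needs an initialized DeepSpeed environment and a valid config, which this sketch does not set up.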
1 parent b4d5548, commit 9264fc9
Showing 1 changed file with 51 additions and 30 deletions.