FIX: Initialize DoRA weights in float32 if float16 is being used #1653

Merged

BenjaminBossan
Member

When DoRA weights are initialized in float16 on CPU with an older PyTorch version (<2.2), an error is raised because the operation is not supported for float16 on CPU. This commit temporarily converts the LoRA weights to float32 beforehand if they're in float16.
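The idea can be illustrated with a minimal sketch. This is not the PR's actual diff: the helper name `lora_delta_weight` and the tensor shapes are hypothetical, and the sketch assumes that the product of the two LoRA matrices is the step that needs the upcast. It upcasts float16 inputs to float32, does the computation, and casts the result back.

```python
import torch


def lora_delta_weight(lora_A: torch.Tensor, lora_B: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: compute lora_B @ lora_A, upcasting float16 inputs.

    Assumption: the matmul is what may be unsupported for float16 on CPU
    with older PyTorch versions, so it is done in float32 instead.
    """
    is_fp16 = lora_A.dtype == torch.float16
    if is_fp16:
        # Temporarily work in float32; the caller's tensors are not modified.
        lora_A = lora_A.float()
        lora_B = lora_B.float()

    delta = lora_B @ lora_A

    if is_fp16:
        # Restore the original dtype so downstream code sees float16 again.
        delta = delta.half()
    return delta


# Illustrative shapes only (rank 4, 16 input features, 8 output features).
rank, in_features, out_features = 4, 16, 8
lora_A = torch.randn(rank, in_features).half()
lora_B = torch.zeros(out_features, rank, dtype=torch.float16)

delta = lora_delta_weight(lora_A, lora_B)
```

The result keeps the dtype of the inputs, so callers that pass float32 weights are unaffected and callers that pass float16 weights get float16 back.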

Of course, when the user tries to train or predict with this model on CPU, they will still encounter errors. However, in certain situations, only the initialization happens on CPU and the model is moved to GPU afterwards. This can occur in framework code that the user has no control over, as in #1597. Therefore, it's good to have this safety net.

Note that since our CI uses the latest PyTorch version, we cannot add a test for this: the latest PyTorch does not raise the error in the first place.

Contributor

@younesbelkada younesbelkada left a comment

Makes sense, thanks!

@BenjaminBossan BenjaminBossan merged commit e7b47ac into huggingface:main Apr 29, 2024
14 checks passed
@BenjaminBossan BenjaminBossan deleted the fix-dora-init-with-float16 branch April 29, 2024 09:35