
Conversation

Member

@SunMarc SunMarc commented Aug 12, 2024

What does this PR do?

This PR updates the minimum version required for accelerate to 0.26.0.

Fixes #31449

Creating a dev docker, waiting for the tests to pass.

@SunMarc SunMarc requested a review from amyeroberts August 12, 2024 14:04
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Contributor

@amyeroberts amyeroberts left a comment


Thanks!

LGTM - let's have @ydshieh confirm this is all that's necessary; I'm still a bit out of touch with the official process for updating the docker images.

Contributor

@muellerzr muellerzr left a comment


Thanks! Agreed on having @ydshieh check, though we should also add a check for this minimum version inside TrainingArguments and raise an error in the __post_init__.

Member Author

SunMarc commented Aug 12, 2024

we should also have a check inside the TrainingArguments for this min version and raise an error in the post_init

Oh, that's true, we don't have any check for accelerate inside TrainingArguments. Is accelerate a required dependency for using the Trainer?

@muellerzr
Contributor

@SunMarc yes it is (for the PyTorch version). You can see some similar checks for accelerate in the training arguments. This check should go in _setup_devices.

Member Author

SunMarc commented Aug 12, 2024

@SunMarc yes it is (for the PyTorch version). You can see some similar checks for accelerate in the training arguments. This check should go in _setup_devices.

Thanks! This is fixed in the latest commit. I also removed a few checks that are no longer needed.

Comment on lines +1916 to +1917
os.environ[f"{prefix}SHARDING_STRATEGY"] = str(
    FSDP_SHARDING_STRATEGY.index(fsdp_option.upper()) + 1
)

Nit: I can't wait until these hacks aren't needed anymore :)
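For context, the quoted "hack" converts the sharding-strategy name into a 1-based index stored in an environment variable, which accelerate's FSDP plumbing reads at setup time. A standalone sketch of that indirection, where the list of strategy names and the env-var prefix are assumptions mirroring accelerate's FSDP_SHARDING_STRATEGY constant (the real code imports the constant rather than redefining it):

```python
# Sketch of the env-var indirection quoted above. FSDP_SHARDING_STRATEGY is
# assumed to mirror accelerate's constant of upper-case strategy names.
import os

FSDP_SHARDING_STRATEGY = [
    "FULL_SHARD",
    "SHARD_GRAD_OP",
    "NO_SHARD",
    "HYBRID_SHARD",
    "HYBRID_SHARD_ZERO2",
]


def export_sharding_strategy(fsdp_option: str, prefix: str = "FSDP_") -> str:
    """Store the 1-based index of the strategy name in {prefix}SHARDING_STRATEGY."""
    os.environ[f"{prefix}SHARDING_STRATEGY"] = str(
        FSDP_SHARDING_STRATEGY.index(fsdp_option.upper()) + 1
    )
    return os.environ[f"{prefix}SHARDING_STRATEGY"]
```

Passing configuration through process-wide environment variables like this is fragile (hence the "hack" nit); newer accelerate versions accept the strategy directly on the FSDP plugin.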

Contributor

@muellerzr muellerzr left a comment


This looks good, I think the failures come from main? (they seem unrelated to me)

Member Author

SunMarc commented Aug 12, 2024

Failing tests are not related to this PR; see the linked PR above. Other than that, tests are passing! I'll wait for @ydshieh's approval before merging this PR.

@SunMarc SunMarc requested a review from ydshieh August 12, 2024 16:40
Contributor

molbap commented Aug 13, 2024

Failures are indeed unrelated; you can rebase on #32649 to skip the tests that are currently failing and it should work (or wait until it's merged into main).

Collaborator

@ydshieh ydshieh left a comment


Thanks for this update.

IIRC, you checked the docker images for the CircleCI jobs and the related tests.
Don't forget to also check the images used for the daily CI and the related jobs.

Member Author

SunMarc commented Aug 20, 2024

Don't forget to check the images used for daily CI and the related jobs too.

The daily CI uses the most recent version of accelerate, so that should be fine!

@SunMarc SunMarc merged commit fd06ad5 into main Aug 20, 2024
@SunMarc SunMarc deleted the update-min-accelerate branch August 20, 2024 09:42