
add require_triton and enable test_dynamo work on xpu #2878

Merged Jul 3, 2024 (1 commit)

Conversation

@faaany (Contributor) commented Jun 21, 2024

What does this PR do?

test_dynamo can run on XPU, but currently fails because it requires a separate installation of the triton library. This PR adds a require_triton test marker and makes test_dynamo device-agnostic.
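For context, such a marker is typically a thin unittest skip decorator; a minimal sketch assuming only the standard library (accelerate's actual require_triton helper may differ in detail):

```python
import importlib.util
import unittest


def is_triton_available() -> bool:
    # Probe for the triton package without importing it (importing
    # triton can be slow and may itself fail on unsupported setups).
    return importlib.util.find_spec("triton") is not None


def require_triton(test_case):
    # Decorator that skips the decorated test when triton is not installed.
    return unittest.skipUnless(is_triton_available(), "test requires triton")(test_case)
```

A test would then stack this marker with a device marker, as the diff below does.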

@faaany faaany marked this pull request as ready for review June 21, 2024 06:27
@SunMarc (Member) left a comment

Thanks for adding this! Can you have a second look, @BenjaminBossan?

```diff
@@ -190,15 +190,16 @@ def test_can_undo_fp16_conversion(self):
         model = extract_model_from_parallel(model, keep_fp32_wrapper=False)
         _ = pickle.dumps(model)

-    @require_cuda
+    @require_triton
```
Member (inline comment):

I think this should be fine but I would love to have your opinion @BenjaminBossan. triton should be installed when we install torch.

@SunMarc SunMarc requested a review from BenjaminBossan June 25, 2024 16:58
@BenjaminBossan (Member) left a comment

Hmm, I always thought that triton comes with torch. I just created a test env, ran pip install torch, and indeed triton was installed. Under what circumstances can it be missing? Is it dependent on the device?

```diff
@@ -190,15 +190,16 @@ def test_can_undo_fp16_conversion(self):
         model = extract_model_from_parallel(model, keep_fp32_wrapper=False)
         _ = pickle.dumps(model)

-    @require_cuda
+    @require_triton
+    @require_non_cpu
```
Member (inline comment):

So by adding this, we enable this test on all non-CPU devices. Do we know that it passes on all of them? We don't cover non-CUDA on CI so we would not notice if we break it. Would it be better to explicitly check for XPU?
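An explicit XPU check, as floated above, could be another thin skip decorator; a minimal sketch (is_xpu_available and require_xpu are hypothetical names here, not accelerate's API):

```python
import importlib.util
import unittest


def is_xpu_available() -> bool:
    # Hypothetical helper: True only if torch is installed and exposes
    # a usable torch.xpu backend (Intel GPU support).
    if importlib.util.find_spec("torch") is None:
        return False
    import torch
    return hasattr(torch, "xpu") and torch.xpu.is_available()


def require_xpu(test_case):
    # Skip the decorated test unless an XPU device is actually present.
    return unittest.skipUnless(is_xpu_available(), "test requires XPU")(test_case)
```

The trade-off discussed in this thread is exactly this: an explicit require_xpu only enables the devices known to work, while require_non_cpu opts in every accelerator.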

Contributor Author (inline reply):

I see your point, but this might be an opportunity to let them know that they should support it. Also, the previous device-agnostic work in transformers (#25654) doesn't take into account whether a test case is supported yet or not: for devices that are not supported, they just add a manual skip. But I am not sure which way is best for accelerate. Just let me know and I will update my code. :)

@faaany (Contributor Author) commented Jun 27, 2024

Hmm, I always thought that triton comes with torch. I just created a test env and ran pip install torch and indeed triton was installed. Under what circumstances can it be missing, is it dependent on the device?

Yes, it depends on the device. For the PyTorch CUDA distribution, triton is installed by default, but for the XPU and CPU distributions it is not (at least for now). So, weighing the pros and cons of adding this marker:

Pros:

  • larger device coverage
  • explicit information for the user that this test case requires triton

Cons:

  • more code in the codebase that is not used often (but I am not sure whether we will have more test cases that need PyTorch Dynamo)
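For illustration, the require_non_cpu marker that the diff pairs with require_triton can be sketched as a skip keyed on the detected accelerator (detect_device is a hypothetical helper; accelerate's real device detection is more involved):

```python
import unittest


def detect_device() -> str:
    # Hypothetical detection order: CUDA first, then XPU, else CPU.
    try:
        import torch
    except ImportError:
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    if hasattr(torch, "xpu") and torch.xpu.is_available():
        return "xpu"
    return "cpu"


def require_non_cpu(test_case):
    # Skip unless some hardware accelerator backend is available.
    return unittest.skipUnless(detect_device() != "cpu", "test requires an accelerator")(test_case)
```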

@faaany (Contributor Author) commented Jun 27, 2024

Just let me know how you decide; I will update my code accordingly. Thanks so much for the review! @BenjaminBossan @SunMarc

@BenjaminBossan (Member) commented

Thanks for providing further information on the triton dependency. In that case, I agree that adding an explicit check is fine.

As to whether we should potentially break the tests for other non-CUDA devices: It's hard to say if this would be a good thing or not, as we don't know how that will affect the corresponding maintainers. Probably it's best for Zach to judge when he's back.

@muellerzr (Collaborator) left a comment

Thanks, this solution seems fine. I'll make a note of it in the release notes today, and we can keep an eye out for any issues or follow-up PRs from folks who find this broke things :)

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@muellerzr muellerzr merged commit 3a02754 into huggingface:main Jul 3, 2024
23 checks passed
@faaany faaany deleted the triton branch November 4, 2024 06:09