
[PyTorch Upstream] Inductor UT Regression caused by Triton update: NotImplementedError: elapsed_time is not supported by XPUEvent. #1138

Open
etaf opened this issue May 16, 2024 · 9 comments · Fixed by #1190

Comments

etaf commented May 16, 2024

Hi team, we found an Inductor UT regression caused by a Triton update: NotImplementedError: elapsed_time is not supported by XPUEvent.
This was introduced by commit 21bd536.

The commit itself is fine, but unfortunately the XPU runtime doesn't support elapsed_time yet.

The details:

File "/home/xinanlin/xinanlin/pytorch/torch/_dynamo/utils.py", line 210, in time_wrapper
   r = func(*args, **kwargs)
 File "/home/xinanlin/xinanlin/pytorch/torch/_inductor/runtime/triton_heuristics.py", line 687, in benchmark_all_configs
   timings = {
 File "/home/xinanlin/xinanlin/pytorch/torch/_inductor/runtime/triton_heuristics.py", line 688, in <dictcomp>
   launcher: self.bench(launcher, *args, **kwargs)
 File "/home/xinanlin/xinanlin/pytorch/torch/_inductor/runtime/triton_heuristics.py", line 659, in bench
   return do_bench_gpu(kernel_call, rep=40, fast_flush=True)
 File "/home/xinanlin/xinanlin/pytorch/torch/_inductor/runtime/runtime_utils.py", line 112, in do_bench_gpu
   return triton_do_bench(*args, **kwargs)[0]
 File "/home/xinanlin/xinanlin/miniconda3/lib/python3.10/site-packages/triton/testing.py", line 132, in do_bench
   estimate_ms = start_event.elapsed_time(end_event) / 5
 File "/home/xinanlin/xinanlin/pytorch/torch/xpu/streams.py", line 152, in elapsed_time
   return super().elapsed_time(end_event)
NotImplementedError: elapsed_time is not supported by XPUEvent.
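
For reference, a minimal sketch that reproduces the failure outside of Inductor (assuming a PyTorch build with XPU support; the event usage mirrors what Triton's do_bench does internally):

    import torch

    # Triton's do_bench brackets the kernel with a pair of device events
    # and reads the elapsed time between them. On affected XPU builds,
    # Event.elapsed_time raises NotImplementedError.
    start_event = torch.xpu.Event(enable_timing=True)
    end_event = torch.xpu.Event(enable_timing=True)
    start_event.record()
    # ... launch a kernel here ...
    end_event.record()
    torch.xpu.synchronize()
    print(start_event.elapsed_time(end_event))  # raises NotImplementedError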

etaf commented May 16, 2024

@vlad-penkin @riverliuintel This regression blocks our stock PyTorch Inductor UT upstreaming. Please prioritize, thanks.

etaf commented May 16, 2024

According to the Intel PyTorch core team, elapsed_time will not be supported by XPUEvent until 2025 Q1, which is too late. Maybe we should revert the commit.

etiotto commented May 16, 2024

This line of code:

    estimate_ms = start_event.elapsed_time(end_event) / 5

is identical to the OpenAI version of do_bench. We should use torch.xpu.Event to compute the kernel time, just as the NVIDIA path uses torch.cuda.Event. The code works fine in our CI with IPEX (intel-extension-for-pytorch 2.1.10+gitfd4fce2), so I suggest upstreaming that support into PyTorch.
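
A rough sketch of the intended pattern (hypothetical helper, not the actual do_bench code; it assumes the backend's Event.elapsed_time is implemented):

    import torch

    def event_elapsed_ms(fn, device_mod=torch.xpu):
        # Time fn with device events, mirroring how do_bench uses
        # torch.cuda.Event on NVIDIA hardware. Sketch only.
        start_event = device_mod.Event(enable_timing=True)
        end_event = device_mod.Event(enable_timing=True)
        start_event.record()
        fn()
        end_event.record()
        device_mod.synchronize()
        return start_event.elapsed_time(end_event)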

etaf commented May 16, 2024

@guangyey may I know if you can upstream the feature by the end of this week?

@alexbaden

I have pushed a PR with an initial implementation, based on the IPEX implementation, to PyTorch: pytorch/pytorch#126456

etaf commented May 22, 2024

Hi @etiotto @alexbaden @vlad-penkin, we'll skip the affected cases until the XPU runtime supports elapsed_time in XPUEvent (e.g. with a guard like the sketch below).
Thanks for your support!
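
For example, a runtime probe plus a skip marker along these lines (hypothetical names; a sketch of the approach, not the actual test-suite change):

    import pytest
    import torch

    def xpu_event_timing_supported():
        # Probe whether this build's XPUEvent implements elapsed_time.
        if not torch.xpu.is_available():
            return False
        try:
            start_event = torch.xpu.Event(enable_timing=True)
            end_event = torch.xpu.Event(enable_timing=True)
            start_event.record()
            end_event.record()
            torch.xpu.synchronize()
            start_event.elapsed_time(end_event)
            return True
        except NotImplementedError:
            return False

    requires_xpu_event_timing = pytest.mark.skipif(
        not xpu_event_timing_supported(),
        reason="elapsed_time is not supported by XPUEvent on this runtime",
    )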

@etaf etaf closed this as completed May 22, 2024
@vlad-penkin vlad-penkin reopened this May 22, 2024
@vlad-penkin

@etaf @riverliuintel @EikanWang this is a high-priority change for Triton; reopening the issue.

riverliuintel commented May 22, 2024

The current implementation in the Triton XPU backend is a regression compared with the Triton version currently upstream in stock PyTorch. It not only makes Inductor UTs fail, it also breaks Inductor XPU functionality, since autotuning is an essential Inductor feature for the PyTorch v2.4 XPU release. We request a revert to the original Triton implementation so that we do not miss the PyTorch v2.4 Intel GPU feature-freeze date at the end of May.

Stonepia commented Jun 7, 2024

@vlad-penkin Hello Vlad, in IPEX 2.3 the XPUEvent elapsed_time has been removed. Could you remove the changes from https://github.com/intel/intel-xpu-backend-for-triton/pull/1190/files#diff-447aaa7319f4083423e0d1ce5ecbf440e7efe60b5bc527b38f82e982280b98d3R9-R16 ? We no longer need that if statement.
