[FA][Upstream PT] XPU out of memory raised by FA kernel with upstream pytorch #2042

Open

ESI-SYD opened this issue Aug 29, 2024 · 2 comments

ESI-SYD (Contributor) commented Aug 29, 2024

The flash attention benchmark fails after switching to upstream PyTorch: the torch reference path raises an XPU out-of-memory error. This appears to be a torch issue rather than a Triton one.

```
Traceback (most recent call last):
  File "/runner/_work/intel-xpu-backend-for-triton/intel-xpu-backend-for-triton/benchmarks/key_benchmarks/flash_attention_fwd_benchmark.py", line 245, in <module>
    benchmark.run(show_plots=False, print_data=True)
  File "/runner/_work/intel-xpu-backend-for-triton/intel-xpu-backend-for-triton/benchmarks/key_benchmarks/triton_kernels_benchmark/benchmark_testing.py", line 249, in run
    result_dfs.append(self._run(bench, save_path, show_plots, print_data, **kwargs))
  File "/runner/_work/intel-xpu-backend-for-triton/intel-xpu-backend-for-triton/benchmarks/key_benchmarks/triton_kernels_benchmark/benchmark_testing.py", line 179, in _run
    ret = self.fn(**x_args, **{bench.line_arg: y}, **bench.args, **kwrags)
  File "/runner/_work/intel-xpu-backend-for-triton/intel-xpu-backend-for-triton/benchmarks/key_benchmarks/flash_attention_fwd_benchmark.py", line 228, in benchmark
    benchmark_suit.assert_close(triton_fn(), torch_fn(), atol=atol, rtol=1e-3, err_msg="triton to torch")
  File "/runner/_work/intel-xpu-backend-for-triton/intel-xpu-backend-for-triton/benchmarks/key_benchmarks/flash_attention_fwd_benchmark.py", line 225, in <lambda>
    torch_fn = lambda: torch.nn.functional.scaled_dot_product_attention(
RuntimeError: XPU out of memory, please use `empty_cache` to release all unoccupied cached memory.
```
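
For reference, a minimal sketch of the failing torch reference path. The tensor shapes below are illustrative assumptions, not the benchmark's actual configuration (that lives in flash_attention_fwd_benchmark.py):

```python
import torch

# Illustrative shapes only (batch, heads, seq_len, head_dim); the real
# values come from the benchmark's parameter grid.
q, k, v = (torch.randn(4, 48, 4096, 64, dtype=torch.float16, device="xpu")
           for _ in range(3))

# This is the torch_fn side of the triton-vs-torch comparison; with
# upstream pytorch it raises "RuntimeError: XPU out of memory".
out = torch.nn.functional.scaled_dot_product_attention(q, k, v)
```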

CI:
https://github.com/intel/intel-xpu-backend-for-triton/actions/runs/10609254853/job/29404643614

Repro: use the PoC branch feature/deprecate_benchmark_ipex:

```shell
scripts/compile-triton.sh --venv
source .venv/bin/activate
scripts/test-triton.sh --attention
```

Related:
#1905

ESI-SYD (Contributor, Author) commented Sep 6, 2024

torch.xpu.empty_cache does not help; tracked in pytorch/pytorch#135085.
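
For context, a sketch of that attempted workaround, with the same illustrative (assumed) shapes as above:

```python
import torch

q, k, v = (torch.randn(4, 48, 4096, 64, dtype=torch.float16, device="xpu")
           for _ in range(3))

# Workaround suggested by the error message: release unoccupied cached
# blocks held by the XPU caching allocator before the reference call.
torch.xpu.empty_cache()

# Per this comment, the OOM still reproduces afterwards
# (tracked in pytorch/pytorch#135085).
out = torch.nn.functional.scaled_dot_product_attention(q, k, v)
```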

vlad-penkin assigned riverliuintel and unassigned ESI-SYD on Sep 9, 2024

anmyachev (Contributor) commented

Probably pytorch/pytorch#135818 is related to this issue.
