Skip to content

[SYCL][ROCm] Setup lit tests for ROCm plugin #4163

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Aug 20, 2021
Merged

Conversation

npmiller
Copy link
Contributor

This patch wires up the lit tests for the ROCm plugin adding check-sycl-rocm.

check-sycl-rocm includes check-sycl-rocm-on-device and check-sycl-rocm-ptx to run the compiler tests with the NVIDIA triple if using the NVIDIA platform for ROCm or check-sycl-rocm-gcn if using the AMD platform.

This PR is marked as draft is it requires a bit more work to get check-sycl-rocm to work properly.

On AMD it should pass after these two patches are merged/fixed:

On NVIDIA it requires a bit more test filtering and/or debugging because of issues in the HIP APIs, #4123 is one of them but there is a few other bugs that need to be addressed or filtered before this PR can be merged.

Copy link
Contributor

@bader bader left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks!


if(SYCL_BUILD_PI_ROCM)
add_custom_target(check-sycl-rocm)
if("${SYCL_BUILD_PI_ROCM_PLATFORM}" STREQUAL "NVIDIA")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very minor comment.

See the discussion in #4069. In case we need to support other spellings for SYCL_BUILD_PI_ROCM_PLATFORM, we might need to make sure it's all capital.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that's probably a good idea, however there's already a bunch of other places that do this comparison. So I'd suggest to keep it like that in this patch and then once it's merged I can put up a new PR updating all of the uses of this to be case-insensitive.

@npmiller
Copy link
Contributor Author

I've marked the remaining two failing tests with NVIDIA as XFAIL, so check-sycl-rocm should now pass on NVIDIA platform with this branch.

@npmiller npmiller marked this pull request as ready for review August 3, 2021 10:59
@npmiller npmiller requested a review from sergey-semenov August 3, 2021 10:59
@bader
Copy link
Contributor

bader commented Aug 3, 2021

@npmiller, could you resolve merge conflicts, please? It looks like a number of tests were just removed - 7735139.

@npmiller
Copy link
Contributor Author

npmiller commented Aug 3, 2021

@npmiller, could you resolve merge conflicts, please? It looks like a number of tests were just removed - 7735139.

Rebased and all conflicts resolved

bader
bader previously approved these changes Aug 3, 2021
Copy link
Contributor

@bader bader left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, although there is some contradiction between this and another your patch - #4239. I think they suggest using different options for setting target architecture.

@npmiller
Copy link
Contributor Author

npmiller commented Aug 3, 2021

LGTM, although there is some contradiction between this and another your patch - #4239. I think they suggest using different options for setting target architecture.

Yes there is, because #4239 changes the option that should be used. The PRs are independent at the moment so either one can go in first but I'll have to update the other to resolve conflict and/or update the new changes.

@bader bader added the hip Issues related to execution on HIP backend. label Aug 4, 2021
vladimirlaz
vladimirlaz previously approved these changes Aug 10, 2021
@bader
Copy link
Contributor

bader commented Aug 19, 2021

@DenisBakhvalov, @kbobrovs, @kychendev, ping.

DenisBakhvalov
DenisBakhvalov previously approved these changes Aug 19, 2021
Copy link
Contributor

@DenisBakhvalov DenisBakhvalov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes in sycl/test/esimd/odr.cpp look good.

@kychendev
Copy link
Contributor

Changes in sycl/test/esimd/odr.cpp look good.

LGTM

@bader
Copy link
Contributor

bader commented Aug 19, 2021

On NVIDIA it requires a bit more test filtering and/or debugging because of issues in the HIP APIs, #4123 is one of them but there is a few other bugs that need to be addressed or filtered before this PR can be merged.

We are not doing ROCm plug-in validation on NVIDIA cards, so I'm not sure if it make sense to block the merge of this pull request by #4123.
@npmiller, are you okay to merge this PR before #4123?

@npmiller
Copy link
Contributor Author

On NVIDIA it requires a bit more test filtering and/or debugging because of issues in the HIP APIs, #4123 is one of them but there is a few other bugs that need to be addressed or filtered before this PR can be merged.

We are not doing ROCm plug-in validation on NVIDIA cards, so I'm not sure if it make sense to block the merge of this pull request by #4123.
@npmiller, are you okay to merge this PR before #4123?

Yes, the tests addressed in #4123 are marked as XFAIL here, so this MR should pass check-sycl-rocm on NVidia too at the moment.

I do need to change the -mcpu to --offload-arch now that #4239 is merged.
I'll have a patch to do that tomorrow. If that's easier to review we could merge this and have that patch in a follow up PR, this will mostly work without it, lit will just be a bit annoying as it looks for -mcpu but specifying both flags should work.

@bader
Copy link
Contributor

bader commented Aug 20, 2021

I do need to change the -mcpu to --offload-arch now that #4239 is merged.
I'll have a patch to do that tomorrow. If that's easier to review we could merge this and have that patch in a follow up PR, this will mostly work without it, lit will just be a bit annoying as it looks for -mcpu but specifying both flags should work.

If merging this patch is not urgent, let's make this change here. There should be no difference in where such change is reviewed - within this PR or separately.

@npmiller npmiller requested a review from kychendev as a code owner August 20, 2021 12:40
@npmiller
Copy link
Contributor Author

Rebased and updated the uses of -mcpu to use the new --offload-arch

@bader
Copy link
Contributor

bader commented Aug 20, 2021

@v-klochkov, it looks like failure in reduction test on Windows is not related to the patch. Please, take a look.
@pvchupin, FYI.

@bader bader requested a review from vladimirlaz August 20, 2021 16:52
@bader bader merged commit a04da75 into intel:sycl Aug 20, 2021
@v-klochkov
Copy link
Contributor

v-klochkov commented Aug 20, 2021

@v-klochkov, it looks like failure in reduction test on Windows is not related to the patch. Please, take a look.

Fix for timeout on ACC:
intel/llvm-test-suite#414

bader pushed a commit that referenced this pull request Nov 26, 2021
…ut (#5032)

A follow up to #4163 (comment)

`buildbot` script only allows `NVIDIA` capitalise the variable in case user spelled it differently.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hip Issues related to execution on HIP backend.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants