Skip to content
This repository was archived by the owner on Mar 28, 2023. It is now read-only.

[SYCL] Add test for sorting over sub-group #1380

Merged

Conversation

romanovvlad
Copy link

While extending the original test did some refactoring:

  1. Reduced types it tests
  2. Test only 1 and 2 dimensions

While extending the original test did some refactoring:
1. Reduced types it tests
2. Test only 1 and 2 dimensions
Copy link

@andreyfe1 andreyfe1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Big thanks for making such changes! I very appreciate it!
Any plans to test 3 dimensions?

@andreyfe1
Copy link

LGTM. Make sure that CI is passed

@romanovvlad
Copy link
Author

The fail KernelAndProgram/kernel-bundle-merge-options-env.cpp is unrelated. Merging.

@romanovvlad romanovvlad merged commit 70a12b5 into intel:intel Nov 17, 2022
@npmiller
Copy link

@romanovvlad this is failing on CUDA and HIP, they don't support intel::reqd_sub_group_size so they end up running the test with the wrong sub group size which leads to incorrect results, setting ReqSubGroupSize to 32 for CUDA I got the test to pass locally, maybe we should query the device for the sub-group size instead of imposing it?

@romanovvlad
Copy link
Author

@romanovvlad this is failing on CUDA and HIP, they don't support intel::reqd_sub_group_size so they end up running the test with the wrong sub group size which leads to incorrect results, setting ReqSubGroupSize to 32 for CUDA I got the test to pass locally, maybe we should query the device for the sub-group size instead of imposing it?

Cannot find an API for querying sub-groups size for a given kernel with a given work-group size. Am I missing something?

@npmiller
Copy link

I don't think there's a way to query for that, I think either we need to not specify it, let the device pick for us and use get_sub_group(), in the kernel.

Or there should be a way to query what the device support, info::device::sub_group_sizes maybe, I'm not sure it's properly implemented.

@romanovvlad
Copy link
Author

I don't think there's a way to query for that, I think either we need to not specify it, let the device pick for us and use get_sub_group(), in the kernel.

This won't help as I need to know size of the sub-group before running the kernel to understand how much additional memory required.

Should we disable the test for CUDA and HIP until we find a solution?

@npmiller
Copy link

Should we disable the test for CUDA and HIP until we find a solution?

Yeah we can do that

myler pushed a commit to myler/llvm-test-suite that referenced this pull request Mar 22, 2023
While extending the original test did some refactoring:
1. Reduced types it tests
2. Test only 1 and 2 dimensions
aelovikov-intel pushed a commit to aelovikov-intel/llvm that referenced this pull request Mar 27, 2023
While extending the original test did some refactoring:
1. Reduced types it tests
2. Test only 1 and 2 dimensions
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants