Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SYCL] Run the LIT tests using the selected backend #1288

Closed
wants to merge 1 commit into from

Conversation

fwyzard
Copy link
Contributor

@fwyzard fwyzard commented Mar 11, 2020

sycl/test/CMakeLists.txt requests which backend to test by passing to the LIT configuration either

PARAMS "SYCL_BE=PI_OPENCL"

or

PARAMS "SYCL_BE=PI_CUDA"

This change passes the backend choice to the test through its environment.

@@ -85,7 +85,8 @@
print("Adding path to opencl-aot tool to PATH")
os.environ['PATH'] = os.path.pathsep.join((os.getenv('PATH'), config.sycl_tools_dir))

backend=lit_config.params.get('SYCL_BE', "PI_OPENCL")
backend = lit_config.params.get('SYCL_BE', "PI_OPENCL")
config.environment['SYCL_BE'] = backend
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great idea!
Please remove gpu_run_on_linux_substitute += " SYCL_BE=PI_CUDA " below.

I have a similar but less elegant PR in the works that I will adapt to your change.

bjoernknafla
bjoernknafla previously approved these changes Mar 12, 2020
Fznamznon
Fznamznon previously approved these changes Mar 13, 2020
@bader
Copy link
Contributor

bader commented Mar 13, 2020

It looks like SYCL_BE doesn't play well with SYCL_DEVICE_TYPE=HOST.

@fwyzard
Copy link
Contributor Author

fwyzard commented Mar 13, 2020

It looks like SYCL_BE doesn't play well with SYCL_DEVICE_TYPE=HOST.

It looks like setting both SYCL_DEVICE_TYPE=HOST and SYCL_BE=PI_OPENCL and querying all devices is fine:

$ SYCL_DEVICE_TYPE=HOST SYCL_BE=PI_OPENCL ./simple-sycl-app all
Available SYCL platforms:
  - SYCL host platform, driver version 1.2
    - SYCL host device, driver version 1.2

Running on SYCL device SYCL host device, driver version 1.2
The results are correct!

While setting both SYCL_DEVICE_TYPE=HOST and SYCL_BE=PI_OPENCL and trying to use the default selector does not work:

$ SYCL_DEVICE_TYPE=HOST SYCL_BE=PI_OPENCL ./simple-sycl-app default
Available SYCL platforms:
  - SYCL host platform, driver version 1.2
    - SYCL host device, driver version 1.2

terminate called after throwing an instance of 'cl::sycl::runtime_error'
  what():  No device of requested type available. -1 (CL_DEVICE_NOT_FOUND)
Aborted (core dumped)

It kinds of make sense, and it's the same behaviour I get as when there are no OpenCL devices:

$ OCL_ICD_VENDORS= SYCL_BE=PI_OPENCL ./simple-sycl-app default
Available SYCL platforms:
  - NVIDIA CUDA, driver version CUDA 10.2
    - Tesla K40c, driver version CUDA 10.2

  - SYCL host platform, driver version 1.2
    - SYCL host device, driver version 1.2

terminate called after throwing an instance of 'cl::sycl::runtime_error'
  what():  No device of requested type available. -1 (CL_DEVICE_NOT_FOUND)
Aborted (core dumped)

What do you think should be the behaviour in these cases ?
Disregard the value of SYCL_BE and fall back to the host device ?

sycl/test/CMakeLists.txt requests which backend to test by passing to
the LIT configuration either

    PARAMS "SYCL_BE=PI_OPENCL"

or

    PARAMS "SYCL_BE=PI_CUDA"

This change passes the backend choice to the test throught its
environment.

Signed-off-by: Andrea Bocci <andrea.bocci@cern.ch>
@fwyzard fwyzard dismissed stale reviews from Fznamznon and bjoernknafla via 62913d1 March 13, 2020 21:44
@fwyzard fwyzard force-pushed the fix_check-sycl-cuda branch 2 times, most recently from 62913d1 to f856948 Compare March 13, 2020 21:45
@fwyzard
Copy link
Contributor Author

fwyzard commented Mar 13, 2020

Please remove gpu_run_on_linux_substitute += " SYCL_BE=PI_CUDA " below.

Done, and rebased.

fwyzard added a commit to cms-patatrack/llvm that referenced this pull request Mar 21, 2020
@bader bader requested review from garimagu and rbegam April 2, 2020 15:29
@bader
Copy link
Contributor

bader commented Apr 2, 2020

@fwyzard, unfortunately, there are some unexpected failures in LIT tests with this patch.
I'm not sure what is the reason, but some tests start failing on the host.
Not sure why, but it might be some bug in the implementation.

@bader bader requested a review from romanovvlad April 2, 2020 15:36
@bader bader self-assigned this Apr 2, 2020
@bader
Copy link
Contributor

bader commented Apr 10, 2020

I think this functionality is covered by #1458. @vladimirlaz, could you confirm, please?

@bader
Copy link
Contributor

bader commented Apr 10, 2020

Probably #1300 is also needed.

@bjoernknafla
Copy link
Contributor

This PR moves environment variables into config.environment. I took a different approach in #1300 to make more env vars visible in the run line when a LIT test fails to make it easier to debug the test case by hand. In a sense #1300 replaces this PR.

I am still fond of the elegant approach here to build up an environment but wonder how we can make the environment visible when a LIT test case fails to understand how to run and debug it by hand?

@fwyzard
Copy link
Contributor Author

fwyzard commented Apr 10, 2020

how we can make the environment visible when a LIT test case fails to understand how to run and debug it by hand?

Mhm, yes, that's a very good point.

@fwyzard
Copy link
Contributor Author

fwyzard commented Apr 13, 2020

how we can make the environment visible when a LIT test case fails to understand how to run and debug it by hand?

how about writing all environment variables for test test to a file test.env, and running the test with env $(< test.env) ./test ?

@bjoernknafla
Copy link
Contributor

Very interesting idea! How many of these environments would we need? My guess is 4 envs correpsonding to:

  • cpu_run_substitute
  • cpu_run_on_linux_substitute (currently always identical with the above)
  • gpu_run_substitute
  • gpu_run_on_linux_substitute (currently always identical with the above)

My hope is that the file opening overhead would not show up in the test timings (which are dominated by compile times anyway).

@fwyzard
Copy link
Contributor Author

fwyzard commented Apr 15, 2020

OK, let me see ...

What does substitute stand for in cpu_run_substitute, etc. ?

The tests that are not _on_linux, what are they supposed to target ?

Should there be two set of tests, for CUDA and for OpenCL ?

My hope is that the file opening overhead would not show up in the test timings (which are dominated by compile times anyway).

From a quick test it seems to add 2-3 ms of overhead.

@bjoernknafla
Copy link
Contributor

bjoernknafla commented Apr 15, 2020

What does substitute stand for in cpu_run_substitute, etc. ?

These are the commands we use inside the LIT tests to run the build executables. They are set up in sycl/test/lit.cfg.py, e.g.:

gpu_run_substitute = " env SYCL_DEVICE_TYPE=GPU SYCL_BE={SYCL_BE} ".format(SYCL_BE=backend)

... and then set up like config.substitutions.append( ('%GPU_RUN_PLACEHOLDER', gpu_run_substitute) ) to be accessible in LIT tests, e.g. in basic_tests/boolean.cpp to run the test executable with the set up environment:

// RUN: %GPU_RUN_PLACEHOLDER %t.out

The tests that are not _on_linux, what are they supposed to target ?

I am not sure what the difference between the Linux and non-Linux set up is - at the moment they seem to always match and I cannot find a single use of the Linux ones. Might just be an option to allow differentiation if required?

Should there be two set of tests, for CUDA and for OpenCL ?

Looking at the different substitutions in sycl/test/lit.cfg.py we create the environments very dynamically. I wonder if we can catch that in an env file if we want to enable independent LIT runs for different cases at the same time to not affect each other (though I am not sure if that wouldn't be a problem with LIT anyway)?

@vladimirlaz
Copy link
Contributor

@fwyzard does this PR is still needed? I guess we can close it as the alike changes were implemented in #1458

@fwyzard
Copy link
Contributor Author

fwyzard commented Apr 30, 2020

The latest definition of the tests works well enough, and passing the variables explicitly on the command line instead of implicitly via the environment makes it easier to re-run the tests by hand.

@fwyzard fwyzard closed this Apr 30, 2020
@fwyzard fwyzard deleted the fix_check-sycl-cuda branch April 30, 2020 13:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cuda CUDA back-end
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants