Skip to content

[SYCL] Introduce XPTI-based tooling for SYCL applications #5389

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Mar 5, 2022

Conversation

alexbatashev
Copy link
Contributor

@alexbatashev alexbatashev commented Jan 25, 2022

sycl-trace

Tracing tool that prints SYCL PI calls, analog to SYCL_PI_TRACE env variable.

Limitations:

  • No support for plugin load diagnostics
  • No support for device selector diagnostics
  • No support for graph tracing

sycl-prof

Dumps profiling information from SYCL runtime and saves it to Chrome-compatible JSON file

Limitations:

  • No support for "summary" mode.

sycl-sanitize

A CLI tool to facilitate development of DPC++ applications. Currently sycl-sanitizer is capable of the following diagnostics:

  • USM pointer memory leaks. Report is generated at the end of program execution and contains location of leaked pointer malloc call.
  • Construction of SYCL buffers from device or shared USM pointer. The tool will abort execution of application if it attempts to create a buffer from device or shared USM pointer.
  • Construction of SYCL buffer from host USM pointer of smaller size, than buffer range. The application will be terminated.

Common limitations to all tools:

  • All tools are only available on Linux

sycl-trace    - tracing tool that prints SYCL PI calls, analog
                to SYCL_PI_TRACE env variable
sycl-prof     - dumps profiling information from SYCL runtime
                and saves it to Chrome-compatible JSON file
sycl-sanitize - provides some diagnostics based on SYCL PI calls.
                Currently this tool can diagnose USM memory leaks
                and abuse of USM pointers in SYCL buffers
@alexbatashev
Copy link
Contributor Author

Pending on #5391

@alexbatashev alexbatashev marked this pull request as ready for review February 7, 2022 15:59
@alexbatashev alexbatashev requested a review from a team as a code owner February 7, 2022 15:59
vladimirlaz
vladimirlaz previously approved these changes Feb 21, 2022
Copy link
Contributor

@keryell keryell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I cannot wait using all these new tools!

#include <thread>
#include <unistd.h>

unsigned long process_id() { return static_cast<unsigned long>(getpid()); }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why unsigned long?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part is very Linux-specific. I'll have to come up with a Windows variant, though.

xpti::trace_event_data_t *,
uint64_t /*Instance*/,
const void *UserData) {
unsigned long TID = std::hash<std::thread::id>{}(std::this_thread::get_id());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not what is standard https://en.cppreference.com/w/cpp/utility/hash instead of unsigned long?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Standard type is size_t, made a change

public:
explicit JSONWriter(const std::string &OutPath) : MOutFile(OutPath) {}

void init() final {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you worried by your neighbors with these final? :-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Choosing between override and final I figured final is more suitable here...

const char * /*version_str*/,
const char *StreamName) {
if (std::string_view(StreamName) == "sycl.pi.debug") {
GS = new GlobalState;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you really need some raw memory?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Some SYCL users use SYCL APIs in global scope, and to work correctly, GlobalState has to outlive everything else in the application. Pointer is trivially destructible, so it'll be alive as long as process is alive. And it'll allow us to capture all the calls.


Args.push_back(nullptr);

int Err = launch(TargetExecutable.c_str(), Args, NewEnv);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps launch should have an STL friendly interface instead of having all these raw pointers?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it look better now?

PS Perhaps STL should have a process launch function?


Args.push_back(nullptr);

int Err = launch(TargetExecutable.c_str(), Args, NewEnv);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps LLVM library comes with such a function?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does not, unfortunately... Would be great to have one, though

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Ralender you told me about something in this area.

* upstream/sycl: (2757 commits)
  [SYCL][Doc] Fixing incorrect merge of community Readme.md with our version (intel#5636)
  [SYCL] Change USM pooling parameters. (intel#5457)
  [CI] Fix cache location on Windows (intel#5603)
  [SYCL][NFC] Fix a warning about uninitialized struct members (intel#5610)
  [Buildbot] Update Windows GPU version to 101.1340 (intel#5620)
  Fix SPIRV -> OCL barrier call argument attributes
  Move SPV_INTEL_memory_access_aliasing tokens from spirv_internal
  [SYCL][ESIMD] Add support for named barrier APIs (intel#5583)
  [SYCL][L0] Remove ZeModule when program build failed (intel#5541)
  [SYCL] Silence "unknown attribute" warning for `device_indirectly_callable` (intel#5591)
  [SYCL][DOC] Introductory material for extensions (intel#5605)
  [SYCL][DOC] Change extension names to lower case (intel#5607)
  [SYCL] Improve get_kernel_bundle performance (intel#5496)
  [SYCL] Do not build device code for sub-devices (intel#5240)
  [sycl-post-link] Fix a crash during spec-constant properties generation (intel#5538)
  [SYCL][DOC] Move SPIR-V and OpenCL extensions (intel#5578)
  [SYCL][ESIMD][EMU] Update memory intrinsics for ESIMD_EMU plugin (intel#4748)
  [CI] Allow stale issue bot to analyze more issues (intel#5602)
  [SYCL][L0] Honor property::queue::enable_profiling (intel#5543)
  [OpenMP] Properly save strings when doing LTO
  ...
@bader
Copy link
Contributor

bader commented Mar 1, 2022

@keryell, any objections to merge this PR?

@bader bader merged commit 789d138 into intel:sycl Mar 5, 2022
@bader
Copy link
Contributor

bader commented Mar 5, 2022

@keryell, any objections to merge this PR?

Any post-commit comments are still welcome.

@bader
Copy link
Contributor

bader commented Mar 5, 2022

@keryell
Copy link
Contributor

keryell commented Mar 5, 2022

Any post-commit comments are still welcome.

Sure! :-)
But I am fine with this PR. Sorry for the feedback delay.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants