[CUDA][HIP] Add Event Caching #1538
Conversation
Don't destroy events that are no longer needed; instead, push them onto a stack owned by their associated queue (if there is one). Use events from this stack before creating new ones. This caching mechanism significantly reduces the number of CUDA API calls for event creation and destruction.
Follow similar logic to #13f097c to reduce the number of HIP API calls for event creation and destruction.
Make retrieving events from the stack and pushing them onto it thread-safe by guarding the stack with a mutex. Keep this mutex inside the queue, which owns the event cache. Apply further minor changes: return the result status of event releases directly, use the proper UR assertions, and style variable names consistently.
Copy the thread-safety mechanism for event caching from CUDA to HIP (#590897e9103c06ef1f1d86a6edf7c7eb525bd7d8). Delete cached events atomically in the queue destructor. Clean up minor styling and redundancy issues in the CUDA and HIP adapters.
Use a mutex to make the emptiness check and retrieval of cached events atomic.
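Taken together, the notes above describe one pattern. Below is a minimal, hedged sketch of it; the names CachedEvents and CacheMutex mirror the diff later in this conversation, while native_event_t and the surrounding struct are hypothetical stand-ins for the adapter's real types.

#include <mutex>
#include <stack>

// Hypothetical stand-in for a native CUDA/HIP event handle; the real
// adapter stores a CUevent / hipEvent_t inside ur_event_handle_t_.
using native_event_t = int;

struct event_cache_sketch_t {
  std::stack<native_event_t> CachedEvents;
  // The cache may be accessed from multiple threads, so every push and
  // every check-and-pop pair is guarded by the same mutex.
  std::mutex CacheMutex;

  // On release: instead of destroying the native event, park it on the
  // stack so a later event creation can skip the driver API call.
  void cache_event(native_event_t Event) {
    std::lock_guard<std::mutex> CacheGuard(CacheMutex);
    CachedEvents.push(Event);
  }

  // On creation: reuse a cached event if one exists. The emptiness check
  // and the pop happen under one lock, so no other thread can steal the
  // event in between.
  bool try_get_cached_event(native_event_t &Event) {
    std::lock_guard<std::mutex> CacheGuard(CacheMutex);
    if (CachedEvents.empty())
      return false;
    Event = CachedEvents.top();
    CachedEvents.pop();
    return true;
  }
};

Only when try_get_cached_event comes back empty does the adapter fall back to creating a fresh native event, which is where the reduction in CUDA/HIP driver API calls comes from.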
Return UR_RESULT_ERROR_INVALID_EVENT only at the end of a try/catch block, with UR_RESULT_SUCCESS as the default.
Modify the return value inside urEventRelease accordingly.
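A rough sketch of the return-value shape those two notes describe, assuming a urEventRelease-like entry point; the stand-in types and the release_or_cache helper are hypothetical simplifications of the real UR definitions.

#include <atomic>
#include <cstdint>

// Stand-ins so the sketch compiles on its own (hypothetical).
enum ur_result_t { UR_RESULT_SUCCESS, UR_RESULT_ERROR_INVALID_EVENT };
struct ur_event_handle_t_ { std::atomic<std::uint32_t> RefCount{1}; };
using ur_event_handle_t = ur_event_handle_t_ *;

// Hypothetical helper: parks the native event on its queue's cache or
// destroys it, and reports the status of that operation.
ur_result_t release_or_cache(ur_event_handle_t) { return UR_RESULT_SUCCESS; }

ur_result_t urEventReleaseSketch(ur_event_handle_t hEvent) {
  ur_result_t Result = UR_RESULT_SUCCESS; // success is the default
  try {
    // Drop one reference; only the last release touches the native event.
    if (hEvent->RefCount.fetch_sub(1) == 1)
      Result = release_or_cache(hEvent); // returned directly, not remapped
  } catch (...) {
    // The invalid-event error is produced only here, at the end of the
    // try/catch block.
    Result = UR_RESULT_ERROR_INVALID_EVENT;
  }
  return Result;
}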
Don't take a lock_guard inside a function that is called while the same mutex is already locked; this caused deadlocks in the CUDA and HIP adapters.
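For reference, the deadlock that note describes comes from re-locking a non-recursive std::mutex on the same thread. A condensed, hypothetical illustration:

#include <cassert>
#include <mutex>
#include <stack>

struct queue_sketch_t {
  std::stack<int> CachedEvents;
  std::mutex CacheMutex;

  // A helper that takes the lock itself.
  bool has_cached_events() {
    std::lock_guard<std::mutex> Guard(CacheMutex);
    return !CachedEvents.empty();
  }

  int get_cached_event() {
    std::lock_guard<std::mutex> Guard(CacheMutex);
    // Deadlock variant: calling has_cached_events() here would try to
    // lock CacheMutex a second time while this frame already holds it,
    // which hangs (and is undefined behaviour for std::mutex):
    //
    //   if (!has_cached_events()) ...   // deadlocks
    //
    // Fix: do the emptiness check inline, under the single lock.
    assert(!CachedEvents.empty());
    int Event = CachedEvents.top();
    CachedEvents.pop();
    return Event;
  }
};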
@@ -57,6 +61,8 @@ struct ur_queue_handle_t_ {
  std::mutex ComputeStreamMutex;
  std::mutex TransferStreamMutex;
  std::mutex BarrierMutex;
  // The event cache might be accessed in multiple threads.
  std::mutex CacheMutex;
I might be inclined to keep this close in the code to CachedEvents.
Small nit. Otherwise LGTM
// Returns and removes an event from the CachedEvents stack.
ur_event_handle_t get_cached_event() {
  std::lock_guard<std::mutex> CacheGuard(CacheMutex);
  assert(!CachedEvents.empty());
@konradkusiak97 spotted this. Need to change this to detail::ur::assertion
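For context, the requested change swaps the bare assert for the adapter's own assertion helper. A hedged sketch of the swap, with a stand-in definition so the fragment is self-contained; the real detail::ur::assertion lives in the adapter's common code, and its exact signature and behaviour (report, then terminate, including in release builds) are assumed here.

#include <cstdio>
#include <cstdlib>

// Stand-in for the adapter's helper (behaviour assumed).
namespace detail::ur {
inline void assertion(bool Condition, const char *Message = nullptr) {
  if (!Condition) {
    std::fprintf(stderr, "%s\n", Message ? Message : "assertion failed");
    std::abort();
  }
}
} // namespace detail::ur

void check_cache_not_empty(bool CacheIsEmpty) {
  // Before: assert(!CachedEvents.empty());  // compiled out with NDEBUG
  // After:
  detail::ur::assertion(!CacheIsEmpty, "event cache is unexpectedly empty");
}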
Closed since it did not yield a significant improvement in benchmarking.
Don't destroy events that are no longer needed; instead, push them onto a stack
owned by their associated queue (if there is one).
Use events from this stack before creating new ones.
Make the operations that push events onto the stack and retrieve them from it
atomic to ensure thread safety.
This caching mechanism significantly reduces the number of CUDA API calls
for event creation and destruction.