Skip to content

[SYCL] Reuse discarded L0 events in scope of command list #7256

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 35 commits into from
Nov 18, 2022

Conversation

againull
Copy link
Contributor

@againull againull commented Nov 2, 2022

Patch implements reset and reuse of Level Zero events in scope of cmd list. Same level zero events are reused inside different command lists in scope of the queue, we need new event only to switch between command lists.

The scheme in scope of command list looks like this:
Operation1 = zeCommantListAppendMemoryCopy (signal event1)

zeCommandListAppendBarrier(wait for event1)
zeCommandListAppendEventReset(event1)
Operation2 = zeCommandListAppendMemoryCopy (signal event2)

zeCommandListAppendBarrier(wait for event2)
zeCommandListAppendEventReset(event2)
Operation3 = zeCommandListAppendMemoryCopy (signal event1)

If we switch to a different command list then we signal new event and insert a barrier into new command list waiting on that event.

CmdList1:
Operation1 = zeCommantListAppendMemoryCopy (signal event1)

zeCommandListAppendBarrier(wait for event1)
zeCommandListAppendEventReset(event1)
zeCommandListAppendSignalEvent(NewEvent)

CmdList2:
zeCommandListAppendBarrier(wait for NewEvent)

Patch implements reset and reuse of Level Zero events in scope of cmd
list. Same level zero events are reused inside different command lists in
scope of the queue, we need new event only to switch between command
lists.

The scheme in scope of command list looks like this:
Operation1 = zeCommantListAppendMemoryCopy (signal event1)

zeCommandListAppendBarrier(wait for event1)
zeCommandListAppendEventReset(event1)
Operation2 = zeCommandListAppendMemoryCopy (signal event2)

zeCommandListAppendBarrier(wait for event2)
zeCommandListAppendEventReset(event2)
Operation3 = zeCommandListAppendMemoryCopy (signal event1)

If we switch to a different command list then we signal new event
and insert a barrier into new command list waiting or that event.

CmdList1:
Operation1 = zeCommantListAppendMemoryCopy (signal event1)

zeCommandListAppendBarrier(wait for event1)
zeCommandListAppendEventReset(event1)
zeCommandListAppendSignalEvent(NewEvent)

CmdList2:
zeCommandListAppendBarrier(wait for NewEvent)
@againull againull requested a review from a team as a code owner November 2, 2022 18:04
@againull againull requested review from sergey-semenov and smaslov-intel and removed request for sergey-semenov November 2, 2022 18:04
@againull againull force-pushed the reuse_events_in_cmd_list branch from 442f937 to 41d518c Compare November 9, 2022 17:12
@againull
Copy link
Contributor Author

@smaslov-intel Could you please review.

// events in the in-order queue with discard_events property.
static const bool ReuseDiscardedEvents = [] {
const char *ReuseDiscardedEventsFlag =
std::getenv("SYCL_PI_LEVEL_ZERO_REUSE_DISCARDED_EVENTS");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please document this env var

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Documented.

Comment on lines 670 to 686
// Create new pi_event but with the same ze_event_handle_t. We are going
// to use this pi_event for the next command with discarded event.
pi_event PiEvent;
try {
PiEvent = new _pi_event(LastCommandEvent->ZeEvent,
LastCommandEvent->ZeEventPool, Context,
PI_COMMAND_TYPE_USER, true);
} catch (const std::bad_alloc &) {
return PI_ERROR_OUT_OF_HOST_MEMORY;
} catch (...) {
return PI_ERROR_UNKNOWN;
}

if (LastCommandEvent->isHostVisible())
PiEvent->HostVisibleEvent = PiEvent;

PI_CALL(addEventToCache(PiEvent));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code snippet does not logically belong to this method called "appendWaitAndReset", but I see how it is convenient to do it here where you know the "LastCommandEvent->ZeEvent". Maybe rename function to like "resetDiscardedEvent"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, renamed.

// any command but its ZeEvent is used by many pi_event objects.
// Commands to wait and reset ZeEvent must be submitted to the queue before
// calling this method.
pi_result addEventToCache(pi_event Event);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
pi_result addEventToCache(pi_event Event);
pi_result addEventToQueueCache(pi_event Event);

and rename in pi_context to "addEventToContextCache" (for readability)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion, renamed.

this->LastCommandEvent = CommandList->second.EventList.back();
if (ReuseDiscardedEvents && isInOrderQueue() && isDiscardEvents()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Create a helper query for multiple repeated conditions:

Suggested change
if (ReuseDiscardedEvents && isInOrderQueue() && isDiscardEvents()) {
if (doReuseDiscardedEvents) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created a helper method.

Comment on lines 1788 to 1798
if (ReuseDiscardedEvents && isInOrderQueue() && isDiscardEvents()) {
// If we have in-order queue with discarded events then we want to
// treat this event as regular event. We insert a barrier in the next
// command list to wait for this event.
LastCommandEvent = HostVisibleEvent;
} else {
// For all other queues treat this as a special event and indicate no
// cleanup is needed.
PI_CALL(piEventReleaseInternal(HostVisibleEvent));
HostVisibleEvent->CleanedUp = true;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we change to treat it "regular" in all cases?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we can, I added TODO for this because it may require test changes as well.

// signal new event from the last immediate command list. We are going
// to insert a barrier in the new command list waiting for that event.
auto QueueGroup = CurQueue->getQueueGroup(UseCopyEngine);
auto NextImmCmdList = QueueGroup.ImmCmdLists[QueueGroup.NextIndex];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's fragile to use "NextIndex" directly here, maybe add a boolean to getQueueIndex that we just want to know the index (but not advance it)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, fixed as you suggested.

}
} else {
// Ensure LastCommandEvent's batch is submitted if it is differrent
// from the one this command is going to.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also comment that the "signalEventFromCmdListIfLastEventDiscarded" will be called at batch close if needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a comment.

Comment on lines 2116 to 2121
// If the last event is discarded then we already have a barrier waiting for
// that event, so don't need to include the last command event into the wait
// list.
if (ReuseDiscardedEvents && CurQueue->isDiscardEvents() &&
CurQueue->LastCommandEvent->IsDiscarded)
IncludeLastCommandEvent = false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this an optimization? Is it important? I'd prefer we don't do that (at least like this) because it relies on other places to properly add the barrier

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is important, because we reset discarded event, so we can't add it into the wait list -> it will cause waiting for reset event. I added a clarifying comment. Unfortunately, currently I don't see how to get rid of this part.

@againull againull requested a review from a team as a code owner November 18, 2022 00:40
Copy link
Contributor

@smaslov-intel smaslov-intel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@againull
Copy link
Contributor Author

SYCL :: GroupAlgorithm/SYCL2020/sort.cpp failure on HIP/CUDA is known.
Expected to be fixed by intel/llvm-test-suite#1396

@againull againull merged commit b1533c5 into intel:sycl Nov 18, 2022
@againull againull deleted the reuse_events_in_cmd_list branch December 2, 2022 23:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants