@mmichel11
Collaborator

queue::ext_oneapi_get_state() does not work for two reasons:

  • The implementation only considers the case where a command graph is set on the queue. To support native recording, we can call urQueueIsGraphCaptureEnabledExp. If the call does not return success, the API is unsupported and we know the queue is not recording. Otherwise, we use the value it reports.
  • urQueueIsGraphCaptureEnabledExp does not work correctly because it does not handle command list fork-join. We should fix this by querying the L0 API directly instead of duplicating the logic throughout UR to track queue state transitions.
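
A rough sketch of the intended decision logic, for illustration only. `QueueModel`, `Status`, `queryNativeCapture`, and `getState` are hypothetical stand-ins, not the real SYCL or UR types and signatures; the callback merely plays the role of the urQueueIsGraphCaptureEnabledExp call.

```cpp
#include <cassert>
#include <functional>

// Stand-in types for illustration; these are NOT the real SYCL/UR definitions.
enum class QueueState { executing, recording };
enum class Status { success, unsupported };

struct QueueModel {
  // True when a SYCL-level command graph is set on the queue.
  bool hasCommandGraph = false;
  // Hypothetical wrapper around the native capture query. It writes the
  // capture flag through the reference and returns a status code.
  std::function<Status(bool &)> queryNativeCapture;
};

QueueState getState(const QueueModel &q) {
  // Existing path: a command graph on the queue means it is recording.
  if (q.hasCommandGraph)
    return QueueState::recording;

  // Native recording path: a non-success status means the query is
  // unsupported, so the queue cannot be recording natively.
  bool capturing = false;
  if (!q.queryNativeCapture ||
      q.queryNativeCapture(capturing) != Status::success)
    return QueueState::executing;

  // The query succeeded, so trust the flag it reported.
  return capturing ? QueueState::recording : QueueState::executing;
}
```

Note that the flag is ignored whenever the status is not success, so an unsupported backend always reports an executing queue.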

@mmichel11
Collaborator Author

Opened a PR in intel/llvm for the UR fix + simple fork-join test: intel#21145

@mmichel11
Collaborator Author

Some more context on the last commit: NEO has changed fork-join recording behavior with L0 graph.

  • Previous behavior: Queue2 submits a kernel with a dependent event on Queue1, which causes a fork and transitions Queue2 into recording. Joining back onto Queue1 transitions Queue2 back to the executing state.
  • New behavior: Queue2 submits a kernel with a dependent event on Queue1, which causes a fork and transitions Queue2 into recording. Queue2 stops recording only when Queue1 does.

With the new behavior we could simply flip the expected value in the assert, but the result depends on the driver version and is messy to handle. Asserting that both queues are executing after Queue1 stops recording makes the most sense, since that holds under both behaviors.
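
To make the difference concrete, here is a toy model of the two driver behaviors. It is a simplified sketch, not the actual test or driver logic; `ForkJoinModel`, `DriverBehavior`, and the helper functions are hypothetical names introduced for illustration.

```cpp
#include <cassert>

enum class QueueState { executing, recording };
enum class DriverBehavior { previous, updated };

// Toy model of two queues going through a fork-join while Queue1 records.
struct ForkJoinModel {
  DriverBehavior behavior;
  QueueState q1 = QueueState::executing;
  QueueState q2 = QueueState::executing;

  // Queue1 starts recording.
  void beginRecording() { q1 = QueueState::recording; }

  // Queue2 submits a kernel depending on a Queue1 event: Queue2 is forked
  // into the recording under both behaviors.
  void fork() {
    if (q1 == QueueState::recording)
      q2 = QueueState::recording;
  }

  // Join back onto Queue1: only the previous behavior returns Queue2 to
  // executing at this point.
  void join() {
    if (behavior == DriverBehavior::previous)
      q2 = QueueState::executing;
  }

  // Queue1 stops recording: under either behavior, neither queue is
  // recording afterwards.
  void endRecording() {
    q1 = QueueState::executing;
    q2 = QueueState::executing;
  }
};

// State of Queue2 right after the join; this is the driver-dependent value.
QueueState queue2StateAfterJoin(DriverBehavior b) {
  ForkJoinModel m{b};
  m.beginRecording();
  m.fork();
  m.join();
  return m.q2;
}

// The driver-independent check: both queues execute once Queue1 stops.
bool bothExecutingAfterStop(DriverBehavior b) {
  ForkJoinModel m{b};
  m.beginRecording();
  m.fork();
  m.join();
  m.endRecording();
  return m.q1 == QueueState::executing && m.q2 == QueueState::executing;
}
```

In this model `queue2StateAfterJoin` differs between the two behaviors, while `bothExecutingAfterStop` is true for both, which is why the latter is the more robust thing to assert.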
