[UR][L0] Propagate OOM errors from USMAllocationMakeResident #11696

Closed
0x12CC wants to merge 16 commits

@0x12CC 0x12CC commented Oct 27, 2023

Do not merge. This PR is used to test changes in Unified Runtime.

Signed-off-by: Michael Aziz <michael.aziz@intel.com>
@0x12CC 0x12CC requested a review from a team as a code owner October 27, 2023 17:28
@0x12CC 0x12CC marked this pull request as draft October 27, 2023 17:29

0x12CC commented Oct 27, 2023

********************
Failed Tests (25):
  SYCL :: Matrix/Legacy/XMX8/element_wise_all_ops_bf16.cpp
  SYCL :: Matrix/Legacy/XMX8/element_wise_all_ops_half.cpp
  SYCL :: Matrix/Legacy/XMX8/element_wise_irreg_sum_rows.cpp
  SYCL :: Matrix/Legacy/XMX8/element_wise_ops.cpp
  SYCL :: Matrix/Legacy/XMX8/joint_matrix_bf16.cpp
  SYCL :: Matrix/Legacy/XMX8/joint_matrix_bfloat16.cpp
  SYCL :: Matrix/Legacy/XMX8/joint_matrix_half.cpp
  SYCL :: Matrix/Legacy/XMX8/joint_matrix_ss_int8.cpp
  SYCL :: Matrix/Legacy/XMX8/joint_matrix_su_int8.cpp
  SYCL :: Matrix/Legacy/XMX8/joint_matrix_us_int8.cpp
  SYCL :: Matrix/Legacy/XMX8/joint_matrix_uu_int8.cpp
  SYCL :: Matrix/XMX8/element_wise_abc.cpp
  SYCL :: Matrix/XMX8/element_wise_all_ops_half.cpp
  SYCL :: Matrix/XMX8/element_wise_ops.cpp
  SYCL :: Matrix/XMX8/get_coord_float_matC.cpp
  SYCL :: Matrix/XMX8/get_coord_int8_matA.cpp
  SYCL :: Matrix/XMX8/joint_matrix_all_sizes.cpp
  SYCL :: Matrix/XMX8/joint_matrix_apply_bf16.cpp
  SYCL :: Matrix/XMX8/joint_matrix_bfloat16.cpp
  SYCL :: Matrix/XMX8/joint_matrix_bfloat16_array.cpp
  SYCL :: Matrix/XMX8/joint_matrix_half.cpp
  SYCL :: Matrix/XMX8/joint_matrix_ss_int8.cpp
  SYCL :: Matrix/XMX8/joint_matrix_su_int8.cpp
  SYCL :: Matrix/XMX8/joint_matrix_us_int8.cpp
  SYCL :: Matrix/XMX8/joint_matrix_uu_int8.cpp

@YuriPlyakhin, do you know why these matrix tests might be failing? I'm testing 0x12CC/unified-runtime@f2be823 but I'm not able to reproduce these failures locally.

@0x12CC 0x12CC changed the title [UR][L0] Propagate errors from USMAllocationMakeResident [UR][L0] Propagate OOM errors from USMAllocationMakeResident Oct 31, 2023

kbenzie commented Nov 8, 2023

Superseded by #11811

@kbenzie kbenzie closed this Nov 8, 2023
@0x12CC 0x12CC deleted the l0_usm_error_checking_2 branch November 8, 2023 14:47
@YuriPlyakhin

> @YuriPlyakhin, do you know why these matrix tests might be failing? I'm testing 0x12CC/unified-runtime@f2be823 but I'm not able to reproduce these failures locally.

@0x12CC, for some reason I only saw this question now. Please, let me know if you still have the problem.

0x12CC commented Nov 8, 2023

> Please, let me know if you still have the problem.

This is not a problem anymore. The cause was an error result returned from zeContextMakeMemoryResident that we don't handle. oneapi-src/level-zero-spec#240 was created to track this issue.
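
For readers unfamiliar with the pattern, here is a minimal sketch of what propagating an out-of-memory result (rather than dropping it) could look like. It is not the code from this PR or from the Unified Runtime adapter: `ZeResult`, `UrResult`, `mapResult`, and `makeAllocationResident` are hypothetical stand-ins, and the callback parameter takes the place of a driver call such as zeContextMakeMemoryResident so that the example compiles without the Level Zero headers.

```cpp
#include <cstddef>

// Hypothetical, simplified stand-ins for the driver and runtime result codes.
// The real enums are defined by Level Zero and Unified Runtime respectively.
enum class ZeResult { Success, OutOfHostMemory, OutOfDeviceMemory, Unknown };
enum class UrResult { Success, OutOfHostMemory, OutOfDeviceMemory, Unknown };

// Translate a driver result into a runtime result instead of discarding it.
constexpr UrResult mapResult(ZeResult R) {
  switch (R) {
  case ZeResult::Success:
    return UrResult::Success;
  case ZeResult::OutOfHostMemory:
    return UrResult::OutOfHostMemory;
  case ZeResult::OutOfDeviceMemory:
    return UrResult::OutOfDeviceMemory;
  case ZeResult::Unknown:
    return UrResult::Unknown;
  }
  return UrResult::Unknown;
}

// Hypothetical wrapper. MakeResident is any callable with the shape
// ZeResult(void *, std::size_t); it stands in for the real driver call.
template <typename MakeResidentFn>
UrResult makeAllocationResident(MakeResidentFn MakeResident, void *Ptr,
                                std::size_t Size) {
  const ZeResult R = MakeResident(Ptr, Size);
  // Return the mapped result so the caller sees an OOM condition
  // instead of a silent success.
  return mapResult(R);
}
```

A caller would check the returned `UrResult` and report the out-of-memory condition upward rather than continuing with a non-resident allocation.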
