Skip to content

[OpenCL] Add support for cslice partitioning #7963

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

asudarsa
Copy link
Contributor

@asudarsa asudarsa commented Jan 9, 2023

Recently, an extension was added here: #7513
This extension allows partitioning a device by "cslice" (aka CCS-es).
With this change, on PVC sub-sub-devices now require info::partition_property::ext_intel_partition_by_cslice instead of info::partition_property::partition_by_affinity_domain that wasn't quite accurately describing the actual scheme.
Level Zero backend support was added here: #7626
In this change, we are adding support in OpenCL plugin.

This change makes use of the cl_intel_command_queue_families recently added.

Thanks
Signed-off-by: Arvind Sudarsanam arvind.sudarsanam@intel.com

Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
@asudarsa asudarsa requested a review from a team as a code owner January 9, 2023 16:57
@asudarsa asudarsa marked this pull request as draft January 9, 2023 16:57
@asudarsa
Copy link
Contributor Author

asudarsa commented Jan 9, 2023

Marking it as a draft to allow for initial review.

@asudarsa asudarsa requested a review from bashbaug January 9, 2023 16:58
@asudarsa asudarsa temporarily deployed to aws January 9, 2023 17:05 — with GitHub Actions Inactive
Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
@aelovikov-intel
Copy link
Contributor

Do you have a test for this already?

Comment on lines 283 to 298
cl_device_id parentId = nullptr;
clGetDeviceInfo(cast<cl_device_id>(device), CL_DEVICE_PARENT_DEVICE,
sizeof(cl_device_id), &parentId, NULL);
if (parentId == nullptr)
return 0;
cl_device_id parentParentId = nullptr;
clGetDeviceInfo(parentId, CL_DEVICE_PARENT_DEVICE, sizeof(cl_device_id),
&parentParentId, NULL);
if (parentParentId == nullptr)
return 1;
cl_device_id parentParentParentId = nullptr;
clGetDeviceInfo(parentParentId, CL_DEVICE_PARENT_DEVICE, sizeof(cl_device_id),
&parentParentParentId, NULL);
if (parentParentParentId == nullptr)
return 2;
return -1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I'd rather see a generic implementation (i.e., while loop) with an assert that the result is <=2.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the levels to use an enum. So, I am leaving code as is (no while loop). Hope that's ok. Thanks

Comment on lines 391 to 392
if (device->subLevel == -1)
device->subLevel = getSubLevel(device);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes little sense to me. Why don't we report an error immediately for the invalid level? Also, I've realized that I'd like to see a named const/constexpr variable for "-1".

Copy link
Contributor Author

@asudarsa asudarsa Jan 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, -1 (or INVALID in the upcoming patch) means that we have not yet queried the backend for the device level info. We will make a call to 'getLevel' only once.

clGetDeviceInfo(
cast<cl_device_id>(device), CL_DEVICE_PARTITION_MAX_SUB_DEVICES,
sizeof(partitionMaxSubDevices), &partitionMaxSubDevices, nullptr);
} else if (device->isSubDevice()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What else can it be if it's not a root device (false at line 393)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can be a sub-device or sub-sub device created using calls to CreateSubDevice by user

cast<cl_device_id>(device), CL_DEVICE_PARTITION_MAX_SUB_DEVICES,
sizeof(partitionMaxSubDevices), &partitionMaxSubDevices, nullptr);
} else if (device->isSubDevice()) {
// find out number of CCSes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if it's already a sub-sub-device ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then it will just return 0 in out_num_devices

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd prefer to be more explicit about this. Something like

switch (level) {
  case 2: return no_further_partitioning
  case 0: return delegate_to_opencl
  case 1: { our logic }
  default:
    assert/PI_INVALID_ERROR/etc.
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modified in upcoming patch. Thanks

device->subLevel = getSubLevel(device);
if (device->isRootDevice()) {
clGetDeviceInfo(
cast<cl_device_id>(device), CL_DEVICE_PARTITION_MAX_SUB_DEVICES,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cast<cl_device_id>(device) - I'm confused here. It looks like the handle is from the OpenCL RT and is opaque for us, yet we do device->subLevel above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This handle (device) is a 'pi_device' handle. This way of passing the pi_device into OpenCL APIs seems like a standard way. I am not too sure about the 'internals' of OpenCL runtime. @bashbaug, can you please comment? Thanks

clGetDeviceInfo(
cast<cl_device_id>(device), CL_DEVICE_QUEUE_FAMILY_PROPERTIES_INTEL,
3*sizeof(cl_queue_family_properties_intel), qfprops, &qsize);
qsize = qsize/sizeof(cl_queue_family_properties_intel);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not clang-formated, I think.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ran clang-format before submission. Which line do you think may be misformatted? Thanks

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's specific clang-format configuration for the OpenCL plugin. The rest of the codebase would have spaces around / and no spaces after ( and before ) in the for loop below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made a manual change for these. Not sure how I can catch them otherwise. Thanks

cast<cl_device_id>(device), CL_DEVICE_QUEUE_FAMILY_PROPERTIES_INTEL,
3*sizeof(cl_queue_family_properties_intel), qfprops, &qsize);
qsize = qsize/sizeof(cl_queue_family_properties_intel);
for ( size_t q = 0; q < qsize; q++ ) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ran clang-format before submission. Which line do you think may be misformatted? Thanks

Comment on lines 422 to 424
}
}
} else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any chance we can employ early return above to reduce this nesting?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will take a look. Thanks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think changing to switch/case reduced the nesting level. Thanks

return ReturnValue(pi_device_partition_property{0});
}
return ReturnHelper(PI_DEVICE_PARTITION_BY_AFFINITY_DOMAIN);
} else if (device->isSubDevice()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for else as the above if early-returns.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will change. Thanks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines 532 to 533
}
for (uint32_t i = 0; i < *out_num_devices; ++i) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why can't it be a single loop?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it can be. Just an incremental add I think. I will fuse them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fused. Thanks

Copy link
Contributor

@aelovikov-intel aelovikov-intel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I misclicked in the previous review accidentally selecting "approved". Hopefully this will reset it.

@asudarsa asudarsa temporarily deployed to aws January 9, 2023 18:36 — with GitHub Actions Inactive
@asudarsa
Copy link
Contributor Author

asudarsa commented Jan 9, 2023

Sorry, I misclicked in the previous review accidentally selecting "approved". Hopefully this will reset it.

Got it. Thanks so much for the feedback Andrei. Will make changes soon.

Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
@asudarsa
Copy link
Contributor Author

I will add a test very soon. Thanks

Comment on lines 402 to 404
pi_uint32 numSubDevices = 0;
if (device->level == _pi_device::INVALID) // level not yet updated
device->level = getLevel(device);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this lazy? Can we update when device is created?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We actually do update this when device is created. So, getLevel will mostly not be called here. I am just using this as a fail-safe. But, thinking of it, I am not able to formulate a scenario where device->level can be invalid once it has been created. I can take out this code if needed. Thanks

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather have an assert here then. The laziness complicates the code unnecessarily.

Comment on lines 468 to 471
if (device->level == _pi_device::INVALID) // level not updated yet
device->level = getLevel(device);
// For root-device there is no partitioning to report.
if (device->isRootDevice())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, we should either drop lines 468/469 (getLevel) or convert ifs into a switch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah..good catch. I will drop lines 468/469.

return ReturnHelper(PI_DEVICE_PARTITION_BY_AFFINITY_DOMAIN);
}
case _pi_device::SUBDEVICE: {
// find out number of CCSes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// find out number of CCSes
// Find out number of CCSes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will be done in next patch. Thanks

Comment on lines 433 to 435
std::cout
<< "This device does not support cl_intel_command_queue_families"
<< std::endl;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like those debug prints need to be removed/guarded by debug macros.

qsize = qsize / sizeof(cl_queue_family_properties_intel);
for (size_t q = 0; q < qsize; q++) {
if (qfprops[q].capabilities == CL_QUEUE_DEFAULT_CAPABILITIES_INTEL &&
qfprops[q].count > numSubDevices) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this condition mean? Can we have numSubDevices not equal to zero?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is covering a scenario where multiple families have 'DEFAULT' capabilities and we are trying to get the handle to the family with maximum number of dub-devices.

}
// Absorb the CL_DEVICE_NOT_FOUND and just return 0 in out_num_devices
if (result == CL_DEVICE_NOT_FOUND) {
std::cout << "Device not found\n";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More debug prints that need a cleanup.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry. removed it. Thanks

@@ -381,6 +583,9 @@ pi_result piDevicesGet(pi_platform platform, pi_device_type device_type,
cast<cl_uint>(num_entries), cast<cl_device_id *>(devices),
cast<cl_uint *>(num_devices));

for (pi_uint32 i = 0; i < num_entries; ++i) {
devices[i]->level = _pi_device::ROOTDEVICE;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this in place I'm really surprised we need lazy initialization for this ->level in the code earlier in this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. However, earlier in the code, we will actually not call the getLevel function as we are already setting the level here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have actually removed getLevel function. We can populate the level info at the time of device creation.

Comment on lines 35 to 49
// A single-threaded app has an opportunity to enable this mode to avoid
// overhead from mutex locking. Default value is 0 which means that single
// thread mode is disabled.
static const bool SingleThreadMode = [] {
const char *Ret = std::getenv("SYCL_PI_OPENCL_SINGLE_THREAD_MODE");
const bool RetVal = Ret ? std::stoi(Ret) : 0;
return RetVal;
}();

// Class which acts like shared_mutex if SingleThreadMode variable is not set.
// If SingleThreadMode variable is set then mutex operations are turned into
// nop.
class pi_shared_mutex_ocl {
std::shared_mutex Mutex;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this brand new or copy-pasted from level zero plugin?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the copy-paste code and added the header file from unified runtime.

Comment on lines 232 to 234
ROOTDEVICE = 0,
SUBDEVICE = 1,
SUBSUBDEVICE = 2,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can do some magic/arithmetic inside getLevel in the code at the top of the PR's diff and then just cast it back to the enum. Not insisting though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose you are referring to nextLevel? I am not sure how getLevel can be improved.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both, actually. We can calculate the depth of the device hierarchy in a loop and then cast it to this enum.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would advise to keep the code as is for better readability. Hope that's ok.

_pi_device(pi_platform Plt) : Platform{Plt} {
level = INVALID;
family = index = 0;
// NOTE: one must additionally call initialize() to complete
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see _pi_device::initialize here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah..the comment needs to be deleted.

@aelovikov-intel
Copy link
Contributor

I'm not sure how I feel about directly copy-pasting level zero plugin's infrastructure here. Adding @smaslov-intel for his thoughts.

@asudarsa asudarsa temporarily deployed to aws January 10, 2023 01:33 — with GitHub Actions Inactive
@asudarsa asudarsa temporarily deployed to aws January 10, 2023 03:49 — with GitHub Actions Inactive
@@ -0,0 +1,79 @@
//===--------- pi_utils.hpp - Plugin Utility Functions -------------------===//
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pi_utils.hpp will be removed. This code is available from unified runtime. Thanks.

pi_uint32 index; // SYCL queue index inside a given family of queues

bool isRootDevice(void) { return level == ROOTDEVICE; }
bool isSubDevice(void) { return level == SUBDEVICE; }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's been more than a week since my review, so I've forgotten some amount of context, but looking into the conversation at #7963 (comment) I'm realizing that I don't particularly like this interface.

sub-sub-device->isSubDevice is ambiguous in its meaning. It is a "sub-device" of a "sub-device", after all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah. I agree. I was not able to think of any clearer/easier way to represent this. May be a comment can help?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed isSubSubDevice to isCCS.

@asudarsa
Copy link
Contributor Author

Do you have a test for this already?

Just added for review.

@asudarsa
Copy link
Contributor Author

/verify with intel/llvm-test-suite#1530

Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
@asudarsa asudarsa temporarily deployed to aws January 31, 2023 18:15 — with GitHub Actions Inactive
@asudarsa asudarsa temporarily deployed to aws February 1, 2023 02:22 — with GitHub Actions Inactive
@asudarsa asudarsa temporarily deployed to aws February 1, 2023 02:54 — with GitHub Actions Inactive
Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
@@ -1747,8 +2116,20 @@ pi_result piextKernelGetNativeHandle(pi_kernel kernel,
// This API is called by Sycl RT to notify the end of the plugin lifetime.
// TODO: add a global variable lifetime management code here (see
// pi_level_zero.cpp for reference) Currently this is just a NOOP.
// We clear all the 'map' variables here.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the code to clean up all the maps that we are populating in the plugin. I also delete the pi_device handles in the cslice_devices list if they have not been deleted yet.
Thanks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah..the code cleanup needs to be done in a different way. I will address this in the next PR. Thanks

@asudarsa asudarsa temporarily deployed to aws February 1, 2023 14:59 — with GitHub Actions Inactive
Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
@asudarsa asudarsa temporarily deployed to aws February 1, 2023 16:07 — with GitHub Actions Inactive
@asudarsa asudarsa temporarily deployed to aws February 1, 2023 17:27 — with GitHub Actions Inactive
Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
@asudarsa asudarsa temporarily deployed to aws February 1, 2023 18:41 — with GitHub Actions Inactive
@asudarsa asudarsa temporarily deployed to aws February 1, 2023 19:19 — with GitHub Actions Inactive
Copy link
Contributor

@bashbaug bashbaug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are additional enhancements we should make but I'm fine adding them in a subsequent PR. Thanks!

paramValueSize, paramValue, paramValueSizeRet);
return static_cast<pi_result>(result);
} else {
// For root-device there is no partitioning to report.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see much similarities between the plugins in this code actually, but if Ben is fine with postponing the cleanup, so am I.

@@ -815,11 +1087,16 @@ pi_result piContextCreate(const pi_context_properties *properties,
size_t cb, void *user_data1),
void *user_data, pi_context *retcontext) {
pi_result ret = PI_ERROR_INVALID_OPERATION;
std::vector<cl_device_id> cl_devices = getClDevices(num_devices, devices);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it ok to have duplicates here? Although I think the helper itself has to be changed, if it's not ok.

Comment on lines +1095 to +1097
std::vector<pi_device> device_list_vec(num_devices);
for (size_t i = 0; i < num_devices; ++i)
device_list_vec[i] = devices[i];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I think

template< class InputIt >
vector( InputIt first, InputIt last,
        const Allocator& alloc = Allocator() );

ctor should work just fine here.

Comment on lines +1114 to +1116
std::vector<pi_device> device_list_vec(num_devices);
for (size_t i = 0; i < num_devices; ++i)
device_list_vec[i] = devices[i];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Likewise.

Comment on lines +1134 to +1135
if (context2devlist.find(context) != context2devlist.end()) {
auto devlist = context2devlist[context];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit

Suggested change
if (context2devlist.find(context) != context2devlist.end()) {
auto devlist = context2devlist[context];
if (auto it = context2devlist.find(context); it != context2devlist.end()) {
auto devlist = *it;

@asudarsa
Copy link
Contributor Author

asudarsa commented Feb 7, 2023

Hi @againull

Can you please help to merge this? latest comments will be addressed in an upcoming PR.

Thanks

@againull againull merged commit dcbd6e5 into intel:sycl Feb 7, 2023
@dm-vodopyanov
Copy link
Contributor

dm-vodopyanov commented Feb 8, 2023

@asudarsa this patch fails post-commit on Windows and macOS. Can you please fix this ASAP?

https://github.com/intel/llvm/actions/runs/4116506385/jobs/7106640407
https://github.com/intel/llvm/actions/runs/4116506385/jobs/7106643190

@bader FYI, revert might be needed in case these failures came from unified runtime - to return post-commit to (partially) green state as soon as possible.

dm-vodopyanov added a commit that referenced this pull request Feb 8, 2023
@asudarsa
Copy link
Contributor Author

asudarsa commented Feb 8, 2023

@asudarsa this patch fails post-commit on Windows and macOS. Can you please fix this ASAP?

https://github.com/intel/llvm/actions/runs/4116506385/jobs/7106640407 https://github.com/intel/llvm/actions/runs/4116506385/jobs/7106643190

@bader FYI, revert might be needed in case these failures came from unified runtime - to return post-commit to (partially) green state as soon as possible.

Hi Dmitry,

Some of the fails seem unrelated to my change. The only issue that seems to be related is 'zer_api.h' not being found. I am not sure why that issue does not occur in Linux. Can you please verify, if possible, if the post-commit builds and tests pass without my change, before actually reverting it?

Thanks

@bader
Copy link
Contributor

bader commented Feb 8, 2023

@asudarsa, these post-commit steps were clean before your commit and stably fail after it. I also looked at the error message and it points to the code modified by this patch.
I'm 100% sure that fails are caused by this patch.

@bader
Copy link
Contributor

bader commented Feb 9, 2023

@asudarsa, do you have any ETA for resolving post-commit test issues?
@dm-vodopyanov created #8251 to revert this change and @cperkinsintel approved it. We can revert it and you can push fixed version of the patch later, but I prefer to push the fix if it's ready.

@cperkinsintel
Copy link
Contributor

This PR is also resulting in a failure in llvm-test-suite/SYCL/Plugin/interop-opencl.cpp in the ACC/FPGA case. Our GHA testing doesn't include an ACC case so this was missed.

Running llvm-test-suite would have detected both that failure and the windows building one.

@bader
Copy link
Contributor

bader commented Feb 9, 2023

Okay, I think we should revert the change and do more testing locally.
@dm-vodopyanov, @cperkinsintel, thanks for helping with this.

bader pushed a commit that referenced this pull request Feb 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants