Skip to content

[SYCL][L0] Disable native 2D USM memcpy by default #8609

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions sycl/doc/EnvironmentVariables.md
Original file line number Diff line number Diff line change
Expand Up @@ -259,6 +259,7 @@ variables in production code.</span>
| `SYCL_PI_LEVEL_ZERO_EXPOSE_CSLICE_IN_AFFINITY_PARTITIONING` (Deprecated) | Integer | When set to non-zero value exposes compute slices as sub-sub-devices in `sycl::info::partition_property::partition_by_affinity_domain` partitioning scheme. Default is zero meaning that they are only exposed when partitioning by `sycl::info::partition_property::ext_intel_partition_by_cslice`. This option is introduced for compatibility reasons and is immediately deprecated. New code must not rely on this behavior. Also note that even if sub-sub-device was created using `partition_by_affinity_domain` it would still be reported as created via partitioning by compute slices. |
| `SYCL_PI_LEVEL_ZERO_IMMEDIATE_COMMANDLISTS_EVENT_CLEANUP_THRESHOLD` | Integer | If non-negative then the threshold is set to this value. If negative, the threshold is set to INT_MAX. Whenever the number of events associated with an immediate command list exceeds this threshold, a check is made for signaled events and these events are recycled. Setting this threshold low causes events to be checked more often, which could result in unneeded events being recycled sooner. However, more frequent event status checks may cost time. The default is 20. |
| `SYCL_PI_LEVEL_ZERO_USM_RESIDENT` | Integer | Controls if/where to make USM allocations resident at the time of allocation. If set to 0 (default) then no special residency is forced. If set to 1 then allocation (device or shared) is made resident at the device of allocation. If set to 2 then allocation (device or shared) is made resident on all devices in the context of allocation that have P2P access to the device of allocation. For host allocation, any non-0 setting forces the allocation resident on all devices in the context. |
| `SYCL_PI_LEVEL_ZERO_USE_NATIVE_USM_MEMCPY2D` | Integer | When set to a positive value enables the use of Level Zero USM 2D memory copy operations. Default is 0. |

## Debugging variables for CUDA Plugin

Expand Down
15 changes: 13 additions & 2 deletions sycl/plugins/level_zero/pi_level_zero.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -97,6 +97,16 @@ static const bool IndirectAccessTrackingEnabled = [] {
nullptr;
}();

// Due to a bug with 2D memory copy to and from non-USM pointers, this option is
// disabled by default.
static const bool UseMemcpy2DOperations = [] {
const char *UseMemcpy2DOperationsFlag =
std::getenv("SYCL_PI_LEVEL_ZERO_USE_NATIVE_USM_MEMCPY2D");
if (!UseMemcpy2DOperationsFlag)
return false;
return std::stoi(UseMemcpy2DOperationsFlag) > 0;
}();

static usm_settings::USMAllocatorConfig USMAllocatorConfigInstance;

// Map from L0 to PI result.
Expand Down Expand Up @@ -2291,8 +2301,9 @@ pi_result piContextGetInfo(pi_context Context, pi_context_info ParamName,
case PI_CONTEXT_INFO_REFERENCE_COUNT:
return ReturnValue(pi_uint32{Context->RefCount.load()});
case PI_EXT_ONEAPI_CONTEXT_INFO_USM_MEMCPY2D_SUPPORT:
// 2D USM memcpy is supported.
return ReturnValue(pi_bool{true});
// 2D USM memcpy is supported unless disabled through
// SYCL_PI_LEVEL_ZERO_USE_NATIVE_USM_MEMCPY2D.
return ReturnValue(pi_bool{UseMemcpy2DOperations});
case PI_EXT_ONEAPI_CONTEXT_INFO_USM_FILL2D_SUPPORT:
case PI_EXT_ONEAPI_CONTEXT_INFO_USM_MEMSET2D_SUPPORT:
// 2D USM fill and memset is not supported.
Expand Down