[SYCL][CUDA] Support host-device memcpy2D #8181

Merged
merged 9 commits into from
Feb 21, 2023

Contributor

@abagusetty abagusetty commented Feb 2, 2023

Adds support for host-device memcpy2D copies.
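
For readers unfamiliar with the operation: a 2D memcpy copies a rectangle of `height` rows, each `width` bytes wide, where consecutive rows start `pitch` bytes apart in each buffer. The following is a host-only reference sketch of those semantics, not the code in this PR; the name `referenceMemcpy2D` and its parameter order are assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical host-only reference for a pitched 2D copy.
// Copies `height` rows of `width` bytes each. Row r starts at byte offset
// r * src_pitch in the source and r * dst_pitch in the destination.
// Bytes between `width` and the pitch (row padding) are left untouched.
inline void referenceMemcpy2D(void *dst, std::size_t dst_pitch,
                              const void *src, std::size_t src_pitch,
                              std::size_t width, std::size_t height) {
  auto *d = static_cast<unsigned char *>(dst);
  const auto *s = static_cast<const unsigned char *>(src);
  for (std::size_t row = 0; row < height; ++row)
    std::memcpy(d + row * dst_pitch, s + row * src_pitch, width);
}
```

In the host-device case this PR targets, one of the two buffers would live in device memory, so the row loop would be replaced by a backend copy call rather than `std::memcpy`.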

@abagusetty abagusetty requested review from a team as code owners February 2, 2023 12:28
? reinterpret_cast<CUdeviceptr>(src_ptr)
: 0;
cpyDesc.srcHost = (src_type == CU_MEMORYTYPE_HOST) ? src_ptr : nullptr;
}
Contributor

Since this behavior is becoming more complex, would it make sense to split it into a helper function so we can avoid repeating the code? I can see two ways of doing it:

  1. Implement it as a function returning a triple, i.e. something like:
     template<typename PtrT>
     static std::tuple<CUmemorytype, CUdeviceptr, PtrT> getUSMHostOrDevicePtr(PtrT usm_ptr) { ... }
     ...
     std::tie(cpyDesc.srcMemoryType, cpyDesc.srcDevice, cpyDesc.srcHost) = getUSMHostOrDevicePtr(src_ptr);
     ...
     std::tie(cpyDesc.dstMemoryType, cpyDesc.dstDevice, cpyDesc.dstHost) = getUSMHostOrDevicePtr(dst_ptr);
  2. Do the same, but make it void and pass the members to set as pointers/references:
     template<typename PtrT>
     static void getUSMHostOrDevicePtr(PtrT usm_ptr, CUmemorytype *out_mem_type, CUdeviceptr *out_dev_ptr, PtrT *out_host_ptr) { ... }
     ...
     getUSMHostOrDevicePtr(src_ptr, &cpyDesc.srcMemoryType, &cpyDesc.srcDevice, &cpyDesc.srcHost);
     ...
     getUSMHostOrDevicePtr(dst_ptr, &cpyDesc.dstMemoryType, &cpyDesc.dstDevice, &cpyDesc.dstHost);

Note I use pointers above because I like to think it makes the intention clearer here, but I do not have a strong preference for one or the other.
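
To make the tuple-returning variant concrete, here is a minimal sketch that compiles without the CUDA toolkit. Everything except the helper name `getUSMHostOrDevicePtr` is an assumption for illustration: `MemType`, `DevPtr` and `CopyDesc` are stand-ins for `CUmemorytype`, `CUdeviceptr` and the copy descriptor, and `queryMemType` with its fake registry replaces the real driver query. It is not the implementation that was merged.

```cpp
#include <cstdint>
#include <set>
#include <tuple>

// Stand-ins for CUDA driver types (assumptions, not the real definitions).
enum class MemType { Host, Device }; // plays the role of CUmemorytype
using DevPtr = std::uintptr_t;       // plays the role of CUdeviceptr

// Hypothetical registry of "device" allocations. The real plugin would ask
// the CUDA driver about the pointer instead of consulting a set.
inline std::set<const void *> &fakeDeviceAllocations() {
  static std::set<const void *> allocs;
  return allocs;
}

// Hypothetical replacement for the driver query.
inline MemType queryMemType(const void *ptr) {
  return fakeDeviceAllocations().count(ptr) ? MemType::Device : MemType::Host;
}

// Returns (memory type, device pointer, host pointer). Exactly one of the
// two pointer slots is populated; the other is 0 or nullptr.
template <typename PtrT>
std::tuple<MemType, DevPtr, PtrT> getUSMHostOrDevicePtr(PtrT usm_ptr) {
  const MemType type = queryMemType(usm_ptr);
  if (type == MemType::Device)
    return std::make_tuple(type, reinterpret_cast<DevPtr>(usm_ptr),
                           static_cast<PtrT>(nullptr));
  return std::make_tuple(type, DevPtr{0}, usm_ptr);
}

// Stand-in for the source/destination fields of the 2D copy descriptor.
struct CopyDesc {
  MemType srcMemoryType;
  DevPtr srcDevice;
  const void *srcHost;
  MemType dstMemoryType;
  DevPtr dstDevice;
  void *dstHost;
};

// Fills both halves of the descriptor with the same helper, so the
// host-or-device logic is written only once.
inline CopyDesc describeCopy(const void *src_ptr, void *dst_ptr) {
  CopyDesc desc{};
  std::tie(desc.srcMemoryType, desc.srcDevice, desc.srcHost) =
      getUSMHostOrDevicePtr(src_ptr);
  std::tie(desc.dstMemoryType, desc.dstDevice, desc.dstHost) =
      getUSMHostOrDevicePtr(dst_ptr);
  return desc;
}
```

The template parameter lets the same helper serve the `const void *` source and the `void *` destination without a cast at the call site.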

Contributor Author

Sure, I agree.

"ARRAY, UNIFIED types are not supported!");

// pointer not known to the CUDA subsystem (possibly a system allocated ptr)
if (ret == CUDA_ERROR_INVALID_VALUE) {
Contributor

Is it safe to check only for two possible error codes from cuPointerGetAttribute?

Contributor Author

Not really. I could classify the result into three cases: CUDA_ERROR_INVALID_VALUE, since that is the only way to detect host pointers; CUDA_SUCCESS; and anything else, which asserts through the existing macro. Good tip.
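
A rough sketch of that three-way classification, using stand-in names so it builds without `cuda.h`. `Result`, `PointerKind` and `classifyQueryResult` are hypothetical, and the exception stands in for the plugin's existing error-check macro, whose exact behavior is not shown in this thread.

```cpp
#include <stdexcept>

// Stand-in for CUresult; only the cases discussed above are modeled.
enum Result { SUCCESS = 0, ERROR_INVALID_VALUE = 1, ERROR_OTHER = 999 };

// How the copy code should treat the queried pointer.
enum class PointerKind { Host, KnownToDriver };

// Hypothetical helper that maps the pointer-query result onto three cases.
inline PointerKind classifyQueryResult(Result ret) {
  // Pointer not known to the CUDA subsystem (possibly a system-allocated
  // pointer): treat it as plain host memory.
  if (ret == ERROR_INVALID_VALUE)
    return PointerKind::Host;
  // The driver recognized the pointer; the queried memory type can be used.
  if (ret == SUCCESS)
    return PointerKind::KnownToDriver;
  // Anything else is a genuine failure. The exception here stands in for
  // the existing assert/check macro mentioned in the reply.
  throw std::runtime_error("unexpected error from pointer attribute query");
}
```

Handling the invalid-value case first keeps the host-pointer path from ever reaching the generic error check.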

@abagusetty abagusetty marked this pull request as draft February 15, 2023 22:16
@abagusetty abagusetty marked this pull request as ready for review February 18, 2023 04:33
@abagusetty
Contributor Author

/verify with intel/llvm-test-suite#1597

@abagusetty abagusetty requested review from jchlanda and removed request for sergey-semenov February 20, 2023 12:39
Contributor

@jchlanda jchlanda left a comment

LGTM

Contributor

@steffenlarsen steffenlarsen left a comment

Looks good!

@steffenlarsen steffenlarsen changed the title [SYCL][CUDA] Supports host-device memcpy2D [SYCL][CUDA] Support host-device memcpy2D Feb 21, 2023
@steffenlarsen steffenlarsen merged commit d0b25d4 into intel:sycl Feb 21, 2023
@abagusetty abagusetty deleted the fix_cuda_memcpy2d branch February 21, 2023 14:29