Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Static link CUDA Runtime #7954

Merged
merged 25 commits into from
Nov 16, 2023
Merged

Static link CUDA Runtime #7954

merged 25 commits into from
Nov 16, 2023

Conversation

kmaehashi
Copy link
Member

@kmaehashi kmaehashi commented Oct 21, 2023

This is a part of #7620.

@kmaehashi kmaehashi added cat:enhancement Improvements to existing features no-compat Disrupts backward compatibility blocking Issue/pull-request is mandatory for the upcoming release prio:high labels Oct 21, 2023
@kmaehashi
Copy link
Member Author

/test mini

@kmaehashi
Copy link
Member Author

/test mini

@kmaehashi
Copy link
Member Author

@leofang
This code relies on locally-installed CUDA Toolkit version (specifically cooperative_groups.h). Wondering how we can rewrite this...

cupy/cupyx/jit/cg.py

Lines 102 to 103 in 1ef4036

if _runtime.runtimeGetVersion() < 11060:
raise RuntimeError("block_rank() is supported on CUDA 11.6+")

@kmaehashi
Copy link
Member Author

This code relies on locally-installed CUDA Toolkit version (specifically cooperative_groups.h).

Fixed in e89e35a. Could you review this commit @asi1024?

@leofang
Copy link
Member

leofang commented Oct 23, 2023

Yes it should be a compile time error. LGTM.

Comment on lines -169 to +168
if driver._is_cuda_python():
version = runtime.runtimeGetVersion()
else:
version = _cuda_hip_version
if (
not _use_ptx and version >= 11010
not _use_ptx
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI. I think this change is OK. For a more robust treatment for _get_arch(), I suggest to take a look at this helper function that I added recently. We don't have to do it in this PR or even in v13, since right now we almost always generate SASS which is problem free. But it is possible that the nvcc/nvrtc version is newer than the driver version, and the generated PTX would not be loadable. (I hit this when updating the PTX tests in test_raw.py, which prompted me to write that helper 🙂)

@kmaehashi kmaehashi requested a review from takagi November 8, 2023 09:18
@kmaehashi
Copy link
Member Author

/test mini

@takagi
Copy link
Member

takagi commented Nov 10, 2023

Would you have a look at cuda-python test failure?

CUDA Python's getLocalRuntimeVersion does not support CUDA 11.x or Windows
@kmaehashi
Copy link
Member Author

/test mini

@takagi
Copy link
Member

takagi commented Nov 13, 2023

Do we also need to include include '_runtime_extern.pxi' when we're using CUDA Python? And, cupy.linux.cuda122's failure is not related.

@kmaehashi
Copy link
Member Author

Do we also need to include include '_runtime_extern.pxi' when we're using CUDA Python? And, cupy.linux.cuda122's failure is not related.

Thanks, that's right! Fixed.

/test mini

Copy link
Member

@leofang leofang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can take a final look later today. Reminder to add a CI test: #7954 (comment)

@takagi
Copy link
Member

takagi commented Nov 14, 2023

/test mini

Copy link
Member

@leofang leofang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the work, LGTM except for one question: I see the version check for JIT was reverted (7a4a4bb), is there any particular reason? I would think checking the header version is safer, instead of using libcuda.so as a proxy. But this is not a big showstopper, we can follow-up later.

See also my above reminder.

@takagi takagi added the st:awaiting-author Awaiting response from author label Nov 15, 2023
@kmaehashi
Copy link
Member Author

I can take a final look later today. Reminder to add a CI test: #7954 (comment)

Thanks for the reminder! Added.

@kmaehashi
Copy link
Member Author

I see the version check for JIT was reverted (7a4a4bb), is there any particular reason? I would think checking the header version is safer, instead of using libcuda.so as a proxy. But this is not a big showstopper, we can follow-up later.

The check has been rewritten using _getLocalRuntimeVersion() (e3f5873). This is because (1) I thought Python-level error checking (as originally implemented) is more user-friendly, and (2) I noticed CUDA_VERSION requires cuda.h which cannot be compiled with NVRTC.

@kmaehashi
Copy link
Member Author

I can take a final look later today. Reminder to add a CI test: #7954 (comment)

Thanks for the reminder! Added.

I noticed that we can't test this PR like other modules, because cupy_backends.cuda.api.runtime uses static link and is not lazy-imported. After everything is done I'll add more sophisticated test, e.g., try installing/importing the generated CuPy binary within a vanilla Python docker image.

@leofang
Copy link
Member

leofang commented Nov 15, 2023

(2) I noticed CUDA_VERSION requires cuda.h which cannot be compiled with NVRTC.

This is new to me, thanks for sharing.

Copy link
Member

@leofang leofang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, merge?

@leofang
Copy link
Member

leofang commented Nov 15, 2023

I noticed that we can't test this PR like other modules, because cupy_backends.cuda.api.runtime uses static link and is not lazy-imported. After everything is done I'll add more sophisticated test, e.g., try installing/importing the generated CuPy binary within a vanilla Python docker image.

Another way to test it (but only on Linux) is to do LD_DEBUG=libs python -c "import cupy", parse the stdout, and check if libcudart.so is loaded. It's a bit hacky, though.

Copy link
Member

@takagi takagi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@takagi takagi merged commit cf9b4a5 into cupy:main Nov 16, 2023
55 of 56 checks passed
@takagi takagi removed the st:awaiting-author Awaiting response from author label Nov 16, 2023
@kmaehashi kmaehashi deleted the static-link-cudart branch November 16, 2023 09:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
blocking Issue/pull-request is mandatory for the upcoming release cat:enhancement Improvements to existing features no-compat Disrupts backward compatibility prio:high
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants