Static link CUDA Runtime #7954

kmaehashi · 2023-10-21T12:31:58Z

This is a part of #7620.

Statically link CUDA Runtime in each CuPy module (Runtime, Jitify, cuRAND, Thrust)
Lazily import Jitify
Remove the cupy.cuda.profiler.initialize() function, which wraps cudaProfilerInitialize (removed in CUDA 12) and has been marked deprecated since CuPy v12 (CUDA 12: Deprecate (or remove) cudaProfilerInitialize #7332). A code search found no use of this function.

kmaehashi · 2023-10-21T12:55:29Z

/test mini

kmaehashi · 2023-10-22T02:45:35Z

/test mini

kmaehashi · 2023-10-22T02:57:52Z

@leofang
This code relies on locally-installed CUDA Toolkit version (specifically cooperative_groups.h). Wondering how we can rewrite this...

cupy/cupyx/jit/cg.py

Lines 102 to 103 in 1ef4036

    
    if _runtime.runtimeGetVersion() < 11060:
        raise RuntimeError("block_rank() is supported on CUDA 11.6+")
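
The constant 11060 in the check above follows the CUDA convention of encoding a version as 1000 × major + 10 × minor, so it denotes CUDA 11.6. A minimal sketch of the decoding follows; `decode_cuda_version` is a hypothetical helper name, not part of CuPy.

```python
def decode_cuda_version(version):
    """Split a CUDA version integer (1000 * major + 10 * minor) into (major, minor)."""
    major, rest = divmod(version, 1000)
    return major, rest // 10


print(decode_cuda_version(11060))  # (11, 6)
```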

kmaehashi · 2023-10-22T04:31:59Z

This code relies on locally-installed CUDA Toolkit version (specifically cooperative_groups.h).

Fixed in e89e35a. Could you review this commit @asi1024?

leofang · 2023-10-23T01:01:18Z

Yes it should be a compile time error. LGTM.

leofang · 2023-10-23T03:10:15Z

cupy/cuda/compiler.py

-    if driver._is_cuda_python():
-        version = runtime.runtimeGetVersion()
-    else:
-        version = _cuda_hip_version
    if (
-        not _use_ptx and version >= 11010
+        not _use_ptx


FYI. I think this change is OK. For a more robust treatment of _get_arch(), I suggest taking a look at this helper function that I added recently. We don't have to do it in this PR or even in v13, since right now we almost always generate SASS, which is problem-free. But it is possible that the nvcc/nvrtc version is newer than the driver version, in which case the generated PTX would not be loadable. (I hit this when updating the PTX tests in test_raw.py, which prompted me to write that helper 🙂)

kmaehashi · 2023-11-08T15:50:14Z

/test mini

takagi · 2023-11-10T04:28:26Z

Would you have a look at cuda-python test failure?

kmaehashi · 2023-11-10T16:53:29Z

/test mini

takagi · 2023-11-13T06:05:03Z

Do we also need to include `include '_runtime_extern.pxi'` when we're using CUDA Python? Also, the cupy.linux.cuda122 failure is not related.

kmaehashi · 2023-11-13T11:40:55Z

Do we also need to include `include '_runtime_extern.pxi'` when we're using CUDA Python? Also, the cupy.linux.cuda122 failure is not related.

Thanks, that's right! Fixed.

/test mini

leofang

I can take a final look later today. Reminder to add a CI test: #7954 (comment)

takagi · 2023-11-14T09:19:02Z

/test mini

leofang

Thanks for the work, LGTM except for one question: I see the version check for JIT was reverted (7a4a4bb), is there any particular reason? I would think checking the header version is safer, instead of using libcuda.so as a proxy. But this is not a big showstopper, we can follow-up later.

See also my above reminder.

kmaehashi · 2023-11-15T15:34:42Z

I can take a final look later today. Reminder to add a CI test: #7954 (comment)

Thanks for the reminder! Added.

kmaehashi · 2023-11-15T15:38:00Z

I see the version check for JIT was reverted (7a4a4bb), is there any particular reason? I would think checking the header version is safer, instead of using libcuda.so as a proxy. But this is not a big showstopper, we can follow-up later.

The check has been rewritten using _getLocalRuntimeVersion() (e3f5873). This is because (1) I thought Python-level error checking (as originally implemented) is more user-friendly, and (2) I noticed CUDA_VERSION requires cuda.h which cannot be compiled with NVRTC.
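
A Python-level check of the kind described here might look like the following. This is only a sketch: `require_cuda_version` and its `get_version` parameter are hypothetical names, and the callable stands in for whatever reports the locally installed CUDA Runtime version.

```python
def require_cuda_version(feature, minimum, get_version):
    """Raise a readable error if the reported CUDA version is below `minimum`.

    Versions use CUDA's integer encoding (1000 * major + 10 * minor).
    `get_version` is an assumed zero-argument callable returning such an integer.
    """
    found = get_version()
    if found < minimum:
        major, minor = minimum // 1000, (minimum % 1000) // 10
        raise RuntimeError(
            f"{feature} is supported on CUDA {major}.{minor}+ (found {found})")
    return found
```

Raising from Python gives the user a message naming the feature and the required version, whereas a compile-time check would surface as an NVRTC compilation error.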

kmaehashi · 2023-11-15T15:49:36Z

I can take a final look later today. Reminder to add a CI test: #7954 (comment)

Thanks for the reminder! Added.

I noticed that we can't test this PR like other modules, because cupy_backends.cuda.api.runtime uses static link and is not lazy-imported. After everything is done I'll add more sophisticated test, e.g., try installing/importing the generated CuPy binary within a vanilla Python docker image.

leofang · 2023-11-15T16:14:16Z

(2) I noticed CUDA_VERSION requires cuda.h which cannot be compiled with NVRTC.

This is new to me, thanks for sharing.

leofang

LGTM, merge?

leofang · 2023-11-15T16:17:04Z

I noticed that we can't test this PR like other modules, because cupy_backends.cuda.api.runtime uses static link and is not lazy-imported. After everything is done I'll add more sophisticated test, e.g., try installing/importing the generated CuPy binary within a vanilla Python docker image.

Another way to test it (but only on Linux) is to run LD_DEBUG=libs python -c "import cupy", parse the loader's debug output (glibc writes it to stderr by default), and check whether libcudart.so is loaded. It's a bit hacky, though.
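
The LD_DEBUG idea could be sketched roughly as below. The function names are hypothetical, the approach only applies to glibc-based Linux, and the substring match is deliberately coarse: a `find library=` line shows that the loader searched for the library, not necessarily that loading succeeded.

```python
import os
import subprocess
import sys


def mentions_library(debug_output, library):
    """Return True if any line of LD_DEBUG output mentions `library`."""
    return any(library in line for line in debug_output.splitlines())


def cupy_import_touches_cudart(python=sys.executable):
    """Run `import cupy` under LD_DEBUG=libs and look for libcudart.so."""
    env = dict(os.environ, LD_DEBUG="libs")
    proc = subprocess.run(
        [python, "-c", "import cupy"],
        env=env, capture_output=True, text=True)
    # glibc normally writes LD_DEBUG output to stderr; check both streams.
    return mentions_library(proc.stdout + proc.stderr, "libcudart.so")
```

With a statically linked CUDA Runtime, one would expect `cupy_import_touches_cudart()` to return False in an environment where CuPy is installed.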

takagi

LGTM!

kmaehashi added 9 commits October 21, 2023 12:19

implement runtime._getCUDAMajorVersion()

c3b276c

remove or replace runtime.runtimeGetVersion()

0922be2

fix unused import

9669e60

static link CUDA Runtime

4055338

merge profiler module to runtime

75465c8

remove use of profiler module

3d29eb4

update profile module for backward compatibility

4cd14a6

update docs

7b4cece

static link CUDA Runtime in Jitify, cuRAND, Thrust

1629883

This was referenced Oct 21, 2023

Merge profiler module into runtime #7930

Closed

Soft link CUDA Runtime & lazy import Jitify #7929

Closed

kmaehashi added the cat:enhancement, no-compat, blocking, and prio:high labels Oct 21, 2023

kmaehashi mentioned this pull request Oct 21, 2023

Allow import cupy without CUDA installation #7620

Closed

kmaehashi added 2 commits October 21, 2023 12:35

lazy import Jitify

841af9e

flake8

7328623

fix libraries list

385fce3

kmaehashi force-pushed the static-link-cudart branch from 33e5a4e to 385fce3 Compare October 21, 2023 13:33

fix HIP translation

96e37c4

kmaehashi added 3 commits October 22, 2023 03:14

fix jitify import

ca5adb8

check CUDA header version in cupyx.jit

e89e35a

fix mock

3220581

leofang reviewed Oct 23, 2023

kmaehashi added 6 commits November 8, 2023 07:51

implement cupy.cuda.get_local_runtime_version()

841eff7

update tests

c693e7a

Revert "check CUDA header version in cupyx.jit"

7a4a4bb

This reverts commit e89e35a.

use _getLocalRuntimeVersion instead

e3f5873

add docs

c9f5090

fix docs

3b7b901

kmaehashi requested a review from takagi November 8, 2023 09:18

avoid using getLocalRuntimeVersion from CUDA Python

1f8f7a0

CUDA Python's getLocalRuntimeVersion does not support CUDA 11.x or Windows

use softlink for runtime regardless of CUDA Python mode

aecccf6

leofang reviewed Nov 13, 2023

leofang approved these changes Nov 14, 2023

takagi added the st:awaiting-author label Nov 15, 2023

kmaehashi force-pushed the static-link-cudart branch from b55e965 to aecccf6 Compare November 15, 2023 15:45

leofang approved these changes Nov 15, 2023

takagi approved these changes Nov 16, 2023

takagi merged commit cf9b4a5 into cupy:main Nov 16, 2023
55 of 56 checks passed

takagi removed the st:awaiting-author label Nov 16, 2023

kmaehashi deleted the static-link-cudart branch November 16, 2023 09:57

leofang mentioned this pull request Feb 5, 2024

Discussion: Change to link statically to cudart? #7727

Closed
