Regression when using AMDGPU and CUDA in same session #47331

Loading AMDGPU.jl and CUDA.jl (in that order) on Julia master breaks usage of AMDGPU. It appears that when AMDGPU calls into GPUCompiler's `cached_compilation` function, the GPUCompiler interface functions it invokes (which have overloads defined in AMDGPU) fall back to their non-specialized base-case methods defined in GPUCompiler. Specifically, the call:

`GPUCompiler.runtime_module(job)`

where `job` is a:

`GPUCompiler.CompilerJob{GPUCompiler.GCNCompilerTarget, AMDGPU.Compiler.ROCCompilerParams}`

does not dispatch to AMDGPU's method at https://github.com/JuliaGPU/AMDGPU.jl/blob/bf66efcc5541319ae44d3ff45669abebb133d421/src/compiler.jl#L59, but instead calls the GPUCompiler base-case method at https://github.com/JuliaGPU/GPUCompiler.jl/blob/18163a561a169934844d493e4fcd3238439c3965/src/interface.jl#L188.

I can use reflection within GPUCompiler to see that AMDGPU's method does in fact exist and matches the signature for `runtime_module` called with `job`.
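To illustrate the kind of reflection check I mean, here is a minimal sketch using hypothetical stand-in modules (`MockCompiler` and `MockBackend` are made up for this example and are not the real GPUCompiler or AMDGPU definitions; `compile` is a placeholder for a caller like `cached_compilation`). It shows the expected behavior: the backend overload exists, matches the job's type, and is the method that dispatch selects.

```julia
# Hypothetical stand-in for the compiler package (NOT the real GPUCompiler code).
module MockCompiler
    abstract type AbstractTarget end
    abstract type AbstractParams end

    struct CompilerJob{T<:AbstractTarget, P<:AbstractParams}
        target::T
        params::P
    end

    # Base-case interface method that backends are expected to overload.
    runtime_module(job::CompilerJob) = error("Not implemented")

    # Placeholder for a caller that invokes the interface function.
    compile(job::CompilerJob) = runtime_module(job)
end

# Hypothetical stand-in for a backend package (NOT the real AMDGPU code).
module MockBackend
    import ..MockCompiler

    struct GCNTarget <: MockCompiler.AbstractTarget end
    struct ROCParams <: MockCompiler.AbstractParams end

    # Backend overload, specialized on the params type parameter.
    MockCompiler.runtime_module(::MockCompiler.CompilerJob{<:Any, ROCParams}) = @__MODULE__
end

job = MockCompiler.CompilerJob(MockBackend.GCNTarget(), MockBackend.ROCParams())

# Reflection: which method does dispatch select for this job type,
# and in which module was that method defined?
m = which(MockCompiler.runtime_module, Tuple{typeof(job)})
println("selected method defined in: ", m.module)

# The call made from inside the compiler module should reach the same overload.
println("compile(job) returned: ", MockCompiler.compile(job))
```

In the affected session, the analogous query would be `which(GPUCompiler.runtime_module, Tuple{typeof(job)})`. As described above, reflection reports the AMDGPU method there, yet the actual call still ends up in the GPUCompiler base case.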

If CUDA is not loaded, or before CUDA is loaded, the GPUCompiler overloads in AMDGPU are called correctly. Loading CUDA before AMDGPU also avoids the problem.

`git bisect` points to:

```
fbd5a72b0a13223c0f67b9d2baa4428e7db199a5 is the first bad commit
commit fbd5a72b0a13223c0f67b9d2baa4428e7db199a5
Author: Jameson Nash <vtjnash@gmail.com>
Date:   Wed Oct 5 12:00:56 2022 -0400

    precompile: serialize the full edges graph (#46920)

    Previously, we would flatten the edges graph during serialization, to
    simplify the deserialization codes, but that now was adding complexity
    and confusion and uncertainty to the code paths. Clean that all up, so
    that we do not do that. Validation is performed while they are represented
    as forward edges, so avoids needing to interact with backedges at all.
```

The commit immediately prior does not exhibit the problematic behavior.

Currently the failing AMDGPU code paths can't be exercised without access to a supported AMD GPU; I am going to implement a way to call the desired methods without hardware access, as well as an MWE, to make it easier to reproduce.

@vtjnash 

Ref https://github.com/JuliaLang/julia/pull/46920
