
[AMDGPU] Use the AMDGPUToolChain when targeting C/C++ directly #99687


Merged: 1 commit, Jan 27, 2025

Conversation

jhuber6
Contributor

@jhuber6 jhuber6 commented Jul 19, 2024

Summary:
The getToolChain pass uses the triple to determine which toolchain to
create. Currently the amdgcn-amd-amdhsa triple maps to the
ROCMToolChain, which relies on components provided by ROCm.
This is needed for OpenCL, but directly targeting C++ does not want this,
since it's primarily being used for creating GPU runtime code. As far as
I know I'm the only user of this, so this shouldn't change anything.

Unfortunately, there's no good way of detecting this, so I simply
check ahead of time whether the input is either foo.cl or -x cl foo.c
to choose between the two. This allows us to use the AMDGPU target
normally, as otherwise it errors unless -nogpulib is passed.
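The detection heuristic described above can be sketched as a small standalone function. This is an illustrative mock-up outside the driver, not the actual patch: the "INPUT" and "-x" option names stand in for the clang driver's OPT_INPUT and OPT_x arguments, and `looksLikeOpenCL` is a hypothetical name.

```cpp
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Suffix check, since std::string_view::ends_with requires C++20.
static bool endsWith(std::string_view S, std::string_view Suffix) {
  return S.size() >= Suffix.size() &&
         S.substr(S.size() - Suffix.size()) == Suffix;
}

// Treat the compilation as OpenCL if any input file ends in ".cl", or if
// any `-x <lang>` value ends in "cl". Each pair is (option, value), e.g.
// {"INPUT", "foo.cl"} or {"-x", "cl"}.
bool looksLikeOpenCL(
    const std::vector<std::pair<std::string, std::string>> &Args) {
  for (const auto &A : Args) {
    if (A.first == "INPUT" && endsWith(A.second, ".cl"))
      return true;
    if (A.first == "-x" && endsWith(A.second, "cl"))
      return true;
  }
  return false;
}
```

With this heuristic, `foo.cl` or `-x cl foo.c` selects the ROCm toolchain, and anything else falls through to the plain AMDGPU toolchain.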

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' labels Jul 19, 2024
@llvmbot
Member

llvmbot commented Jul 19, 2024

@llvm/pr-subscribers-backend-amdgpu
@llvm/pr-subscribers-clang-driver

@llvm/pr-subscribers-clang

Author: Joseph Huber (jhuber6)

Changes



Full diff: https://github.com/llvm/llvm-project/pull/99687.diff

1 file affected:

  • (modified) clang/lib/Driver/Driver.cpp (+13-2)
diff --git a/clang/lib/Driver/Driver.cpp b/clang/lib/Driver/Driver.cpp
index 8e44d5afa40e0..7bd78846766b8 100644
--- a/clang/lib/Driver/Driver.cpp
+++ b/clang/lib/Driver/Driver.cpp
@@ -6404,9 +6404,20 @@ const ToolChain &Driver::getToolChain(const ArgList &Args,
     case llvm::Triple::CUDA:
       TC = std::make_unique<toolchains::NVPTXToolChain>(*this, Target, Args);
       break;
-    case llvm::Triple::AMDHSA:
-      TC = std::make_unique<toolchains::ROCMToolChain>(*this, Target, Args);
+    case llvm::Triple::AMDHSA: {
+      bool IsOpenCL = llvm::any_of(
+          Args.filtered(options::OPT_INPUT, options::OPT_x), [](auto &Arg) {
+            if (Arg->getOption().matches(options::OPT_INPUT))
+              return StringRef(Arg->getValue()).ends_with(".cl");
+            return StringRef(Arg->getValue()).ends_with("cl");
+          });
+      TC =
+          IsOpenCL
+              ? std::make_unique<toolchains::ROCMToolChain>(*this, Target, Args)
+              : std::make_unique<toolchains::AMDGPUToolChain>(*this, Target,
+                                                              Args);
       break;
+    }
     case llvm::Triple::AMDPAL:
     case llvm::Triple::Mesa3D:
       TC = std::make_unique<toolchains::AMDGPUToolChain>(*this, Target, Args);

Contributor

@arsenm arsenm left a comment

Needs test. I also would assume you shouldn't have to reinvent language detection code

@jhuber6
Contributor Author

jhuber6 commented Jul 19, 2024

Needs test. I also would assume you shouldn't have to reinvent language detection code

There's an existing function for getting the inputs, but it takes the ToolChain as input so it makes a circular dependency. The implementation doesn't seem to need that much from the ToolChain so we could possibly factor it out?

@jhuber6
Contributor Author

jhuber6 commented Jul 19, 2024

Also, apparently there's some driver tests that expect --target=amdgcn-amd-amdhsa-opencl as the environment type. Is that the expected way to specify OpenCL? I'm not overly familiar.

If that's the case then I can drop the toolchain detection hacks.

@jhuber6
Contributor Author

jhuber6 commented Jul 19, 2024

There's a test using a .cl file that doesn't pass -opencl, so we probably can't rely on it.

@jhuber6
Contributor Author

jhuber6 commented Jul 19, 2024

Okay, so the only user of the toolchain is Darwin. So I'd need to somehow rework that logic to make it work.

@arsenm
Contributor

arsenm commented Jul 22, 2024

Also, apparently there's some driver tests that expect --target=amdgcn-amd-amdhsa-opencl as the environment type. Is that the expected way to specify OpenCL? I'm not overly familiar.

That was a mistake from long ago we shouldn't continue. A language is not an environment

@jhuber6
Contributor Author

jhuber6 commented Jul 22, 2024

Also, apparently there's some driver tests that expect --target=amdgcn-amd-amdhsa-opencl as the environment type. Is that the expected way to specify OpenCL? I'm not overly familiar.

That was a mistake from long ago we shouldn't continue. A language is not an environment

Alright, it works regardless but I should probably keep the check in there just in case.

@jhuber6
Contributor Author

jhuber6 commented Aug 5, 2024

Ping

@yxsamliu
Collaborator

yxsamliu commented Aug 7, 2024

I feel that choosing the toolchain based on input files does not cover all the use cases.

You may want to handle object files, bitcode, or assembly files differently by using different toolchains, e.g. you may want to choose the ROCm toolchain, the AMDGPU toolchain, or the HIPAMD toolchain to handle an object file, even though it has the same extension. You could introduce -x values for different object types, e.g. cl_obj, cxx_obj, or hip_obj, but that is quite cumbersome.

I am wondering whether we should introduce a clang option which allows us to directly choose a toolchain.

@jhuber6
Contributor Author

jhuber6 commented Aug 7, 2024

I feel that choosing the toolchain based on input files does not cover all the use cases.

You may want to handle object files, bitcode, or assembly files differently by using different toolchains, e.g. you may want to choose the ROCm toolchain, the AMDGPU toolchain, or the HIPAMD toolchain to handle an object file, even though it has the same extension. You could introduce -x values for different object types, e.g. cl_obj, cxx_obj, or hip_obj, but that is quite cumbersome.

I am wondering whether we should introduce a clang option which allows us to directly choose a toolchain.

I think the main difference here is in the compile phase, which wants to load the ROCm libraries. They're probably identical after that point, since the only differences exist inside addClangTargetOptions, so after clang it should just be feeding it through the pipeline / ld.lld.

It's an inconvenience for me that targeting C++ is held hostage by OpenCL, so if I ever do --target=amdgcn-amd-amdhsa I get errors about not finding the ROCm libraries even though we shouldn't use them by default here.

@jhuber6
Contributor Author

jhuber6 commented Aug 27, 2024

ping

@jhuber6
Contributor Author

jhuber6 commented Nov 13, 2024

So the main reason I want this is so that I don't need to keep passing -nogpulib because targeting plain C/C++ should not require the ROCm device libs.

      TC =
          llvm::any_of(Inputs,
                       [](auto &Input) { return types::isOpenCL(Input.first); })
              ? std::make_unique<toolchains::ROCMToolChain>(*this, Target, Args)
Collaborator

If the input files mix OpenCL and C/C++ programs, the ROCMToolChain will be chosen. Is the ROCMToolChain able to compile C/C++ programs like the AMDGPUToolChain, or will it cause errors? Should we emit a diagnostic that mixing OpenCL and C/C++ programs is not supported?

Contributor Author

In the mixed case, all of them will use the ROCMToolChain, which I'm okay with. The ROCMToolChain is what's currently invoked; the main difference is that it pulls in a lot of extra stuff, while we want straight C/C++ to be freestanding.

Collaborator

The same C/C++ source file will be compiled differently depending on whether it is compiled together with OpenCL programs. This may confuse users. Can we make the ROCMToolChain behave the same as the AMDGPUToolChain for C/C++ programs?

Contributor Author

The ToolChain doesn't have source file information; I could put it in the constructor, but it would be pretty much the same thing. I can't disable the ROCm device libs, because that would break OpenCL. I could make it an error instead, but I figured the differences are pretty minor so it's not worthwhile.

Contributor Author

It's worth noting that using the ROCm toolchain is more permissive here. I don't think it's super relevant that we emit an error in this case.

@jhuber6
Contributor Author

jhuber6 commented Nov 20, 2024

Ping

Contributor

@arsenm arsenm left a comment

Does this change the behavior with amdgcn-- or amdgcn-mesa-mesa3d?

@jhuber6
Contributor Author

jhuber6 commented Dec 3, 2024

Does this change the behavior with amdgcn-- or amdgcn-mesa-mesa3d?

Those both use the AMDGPUToolChain. I suppose you could make the argument that --target=amdgcn-- is the intended target for "standalone amd" stuff.

@arsenm
Contributor

arsenm commented Dec 3, 2024

Does this change the behavior with amdgcn-- or amdgcn-mesa-mesa3d?

Those both use the AMDGPUToolChain. I suppose you could make the argument that --target=amdgcn-- is the intended target for "standalone amd" stuff.

Mostly I'm just wondering if it breaks whatever clover is doing.

@jhuber6
Contributor Author

jhuber6 commented Dec 3, 2024

Does this change the behavior with amdgcn-- or amdgcn-mesa-mesa3d?

Those both use the AMDGPUToolChain. I suppose you could make the argument that --target=amdgcn-- is the intended target for "standalone amd" stuff.

Mostly I'm just wondering if it breaks whatever clover is doing.

Don't think so, it should be handled here:

case llvm::Triple::AMDPAL:

@arsenm
Contributor

arsenm commented Dec 3, 2024

I skimmed clover and I'm not sure it's even using the clang driver. It seems to be doing its own thing and directly using CompilerInstance

@jhuber6
Contributor Author

jhuber6 commented Dec 3, 2024

It's also tough to say it's legal to drop the amdhsa in this case, because we rely on builtins that refer to things like implicit arguments, which come from HSA. Otherwise I would probably be fine just calling it amdgcn.

@jhuber6
Contributor Author

jhuber6 commented Jan 24, 2025

Updated this to be simpler. This logic only runs when invoked as clang --target=amdgcn-amd-amdhsa, which means it will only show up for OpenCL and direct C++ targeting; HIP uses a different method to get the toolchain. For regular C++ I do not want to depend on the ROCm tools; it should be freestanding compiler fare without the extra platform support.

This technically does have the effect that if someone does clang -x cl foo.c -x c bar.c --target=amdgcn-amd-amdhsa it will include the device libs for both, but I do not think that's an issue, since by using OpenCL you're already opting into it.

Any chance of this landing before the merge? This primarily allows letting CMake use the AMDGPU as a freestanding target without EVERY single compile option check failing with "Could not find ROCm toolchain".
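The resolution described in this comment, where any OpenCL input opts the whole compilation into the ROCm toolchain (device libs and all) while plain C/C++ gets the freestanding AMDGPU toolchain, can be sketched as follows. `ToolChainKind`, `pickAMDHSAToolChain`, and the language strings are illustrative stand-ins, not clang's actual types.

```cpp
#include <string>
#include <vector>

// Which toolchain flavor to construct for the amdgcn-amd-amdhsa triple.
enum class ToolChainKind { AMDGPU, ROCm };

// If any input language is OpenCL (or OpenCL C++), the whole compilation
// uses the ROCm toolchain; otherwise the freestanding AMDGPU toolchain.
ToolChainKind pickAMDHSAToolChain(const std::vector<std::string> &InputLangs) {
  for (const std::string &Lang : InputLangs)
    if (Lang == "cl" || Lang == "clcpp")
      return ToolChainKind::ROCm;
  return ToolChainKind::AMDGPU;
}
```

This mirrors the mixed-input behavior above: `-x cl foo.c -x c bar.c` resolves to the ROCm toolchain for both inputs, since using OpenCL opts the compilation into the device libs.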

@jhuber6 jhuber6 requested a review from jplehr January 27, 2025 14:21
Collaborator

@JonChesterfield JonChesterfield left a comment

This is the right thing. amdgpu is completely usable without rocm libraries and already has out of tree users doing that. Needing to manually opt out of rocm libs when not using any of rocm is definitely annoying, especially in contexts like godbolt where one is just trying to look at the emitted asm.

The code is ugly but that's necessary to preserve opencl's behaviour, agreed that it's not worth trying to change that.

@jhuber6 jhuber6 merged commit 6ff86f2 into llvm:main Jan 27, 2025
8 checks passed
@jhuber6 jhuber6 deleted the Direct branch January 27, 2025 16:36
@AlexVlx
Contributor

AlexVlx commented Jan 28, 2025

Unfortunately, this breaks SPIR-V JITing, and probably cases where the input is .bc that came from an upstream that has calls to ROCDL symbols (e.g. HIP compiled to .bc). I appreciate the enthusiasm, and the spirit is possibly in the right place, but it would be ideal if we spent a bit more time considering the implications of these types of changes. Is it at all possible to revert and resubmit once this is sorted @jhuber6?

@jhuber6
Contributor Author

jhuber6 commented Jan 28, 2025

Unfortunately, this breaks SPIR-V JITing, and probably cases where the input is .bc that came from an upstream that has calls to ROCDL symbols (e.g. HIP compiled to .bc). I appreciate the enthusiasm, and the spirit is possibly in the right place, but it would be ideal if we spent a bit more time considering the implications of these types of changes. Is it at all possible to revert and resubmit once this is sorted @jhuber6?

This PR is six months old. I'm surprised that this breaks uses of the ROCm device libs, because those are normally baked in per-TU via the -mlink-builtin-bitcode flag. I wasn't aware that we added the ROCm device libraries to .bc files; it doesn't seem to do that with the patch reverted. Do you have a specific example of the clang invocation?

@AlexVlx
Contributor

AlexVlx commented Jan 28, 2025

It is six months old, but it is relatively difficult to keep track of all the concurrent non-communicating swimlanes - I do apologise for not having done so though! It is possible that we are microfocusing on a potential use case, but there are other valid ones. For example, ignoring the SPIR-V angle, one could emit IR from a HIP app compiled with -nogpulib. HIP headers forward various functions to ROCDL symbols, which means that the IR will include calls to ROCDL symbols. It is valid to then take said IR as input, and "finalise" it for AMDGPU, linking the device libs in the process. In this case you cannot disambiguate based on the source language / file extension. An example of an invocation that ends up here would be the following:

clang -O3 -x ir -v foo.bc -mcode-object-version=5 --save-temps -fcolor-diagnostics -fexceptions -fvisibility=hidden -fconvergent-functions -ffp-contract=fast-honor-pragmas -fno-experimental-relative-c++-abi-vtables -O3 -fno-autolink -finline-functions -o foo.co -target amdgcn-amd-amdhsa -mcpu=gfx1030

# which expands into

clang-20 -cc1 -mcode-object-version=5 -mllvm --amdhsa-code-object-version=5 -triple amdgcn-amd-amdhsa -Werror=atomic-alignment -S -dumpdir foo.co- -save-temps=cwd -disable-free -clear-ast-before-backend -main-file-name foo.bc -mrelocation-model pic -pic-level 2 -fhalf-no-semantic-interposition -mframe-pointer=none -fno-experimental-relative-c++-abi-vtables -ffp-contract=fast-honor-pragmas -fno-rounding-math -mconstructor-aliases -target-cpu gfx1030 -debugger-tuning=gdb -fdebug-compilation-dir=/something/ -v -fconvergent-functions -resource-dir /llvm-project/build/lib/clang/20 -O3 -fno-experimental-relative-c++-abi-vtables -fno-autolink -ferror-limit 19 -fvisibility=hidden -fgnuc-version=4.2.1 -finline-functions -fskip-odr-check-in-gmf -fexceptions -fcolor-diagnostics -vectorize-loops -vectorize-slp -faddrsig -D__GCC_HAVE_DWARF2_CFI_ASM=1 -o foo.s -x ir foo.bc

# then

clang-20 -cc1as -mllvm --amdhsa-code-object-version=5 -triple amdgcn-amd-amdhsa -filetype obj -main-file-name foo.bc -target-cpu gfx1030 -fdebug-compilation-dir=/something/ -dwarf-version=5 -mrelocation-model pic -o foo.s

# finally
ld.lld --no-undefined -shared foo.o -plugin-opt=mcpu=gfx1030 -o foo.co
ld.lld: error: undefined symbol: __ockl_fprintf_stderr_begin
>>> referenced by 3143359427648554286.spv.o:(__assert_fail)
>>> referenced by 3143359427648554286.spv.o:(__assert_fail)

In the future, it would also be somewhat helpful, if possible, to advertise what amounts to flipping a long-established default, even when said default is dubious.

@jhuber6
Contributor Author

jhuber6 commented Jan 28, 2025

It is six months old, but it is relatively difficult to keep track of all the concurrent non-communicating swimlanes - I do apologise for not having done so though! It is possible that we are microfocusing on a potential use case, but there are other valid ones. For example, ignoring the SPIR-V angle, one could emit IR from a HIP app compiled with -nogpulib. HIP headers forward various functions to ROCDL symbols, which means that the IR will include calls to ROCDL symbols. It is valid to then take said IR as input, and "finalise" it for AMDGPU, linking the device libs in the process. In this case you cannot disambiguate based on the source language / file extension. An example of an invocation that ends up here would be the following:

clang -O3 -x ir -v foo.bc -mcode-object-version=5 --save-temps -fcolor-diagnostics -fexceptions -fvisibility=hidden -fconvergent-functions -ffp-contract=fast-honor-pragmas -fno-experimental-relative-c++-abi-vtables -O3 -fno-autolink -finline-functions -o foo.co -target amdgcn-amd-amdhsa -mcpu=gfx1030

Okay, so this case definitely tries to link the ROCm bitcode via -mlink-builtin-bitcode and this patch changes that. I would personally say that this shouldn't be the default behavior because the ROCm device libs are very language specific things and we shouldn't be encoding that into plain LLVM-IR. If someone does -nogpulib I don't think it's our responsibility to fix things up for them. People can always do -Xclang -mlink-builtin-bitcode -Xclang /opt/rocm/amdgcn/bitcode/ockl.bc even if it is extremely painful.

If there's an existing use-case that this breaks, I can revert it. Long term however it would be nice if the ROCm device libraries could be pulled into a more friendly library format. Then you could just pass it to ld.lld and everything's fine. Generally my understanding is that the ROCm DL's should be fixed up by the language front-end, so if anything makes it to the IR level the user did something non-standard, and I don't think that's our fault considering they can still do it manually if needed.

In the future, it would also be somewhat helpful, if possible, to advertise what amounts to flipping a long-established default, even when said default is dubious.

I need to determine the optimal graph of AMD engineers to include in the reviewers list of patches like these.

@AlexVlx
Contributor

AlexVlx commented Jan 28, 2025

It is six months old, but it is relatively difficult to keep track of all the concurrent non-communicating swimlanes - I do apologise for not having done so though! It is possible that we are microfocusing on a potential use case, but there are other valid ones. For example, ignoring the SPIR-V angle, one could emit IR from a HIP app compiled with -nogpulib. HIP headers forward various functions to ROCDL symbols, which means that the IR will include calls to ROCDL symbols. It is valid to then take said IR as input, and "finalise" it for AMDGPU, linking the device libs in the process. In this case you cannot disambiguate based on the source language / file extension. An example of an invocation that ends up here would be the following:

clang -O3 -x ir -v foo.bc -mcode-object-version=5 --save-temps -fcolor-diagnostics -fexceptions -fvisibility=hidden -fconvergent-functions -ffp-contract=fast-honor-pragmas -fno-experimental-relative-c++-abi-vtables -O3 -fno-autolink -finline-functions -o foo.co -target amdgcn-amd-amdhsa -mcpu=gfx1030

Okay, so this case definitely tries to link the ROCm bitcode via -mlink-builtin-bitcode and this patch changes that. I would personally say that this shouldn't be the default behavior because the ROCm device libs are very language specific things and we shouldn't be encoding that into plain LLVM-IR. If someone does -nogpulib I don't think it's our responsibility to fix things up for them. People can always do -Xclang -mlink-builtin-bitcode -Xclang /opt/rocm/amdgcn/bitcode/ockl.bc even if it is extremely painful.

If there's an existing use-case that this breaks, I can revert it. Long term however it would be nice if the ROCm device libraries could be pulled into a more friendly library format. Then you could just pass it to ld.lld and everything's fine. Generally my understanding is that the ROCm DL's should be fixed up by the language front-end, so if anything makes it to the IR level the user did something non-standard, and I don't think that's our fault considering they can still do it manually if needed.

In the future, it would also be somewhat helpful, if possible, to advertise what amounts to flipping a long-established default, even when said default is dubious.

I need to determine the optimal graph of AMD engineers to include in the reviewers list of patches like these.

I outlined an existing, as far as we (AMD) are concerned, use case, so if it'd be possible to revert, it'd be amazing! I suspect there's more than this, though, and the "we shouldn't fix X" is extremely uncharitable, even if morally sound. Perhaps this entire topic warrants a bit more discussion and design before we reland?

@jhuber6
Contributor Author

jhuber6 commented Jan 28, 2025

I outlined an existing, as far as we (AMD) are concerned, use case, so if it'd be possible to revert, it'd be amazing! I suspect there's more than this, though, and the "we shouldn't fix X" is extremely uncharitable, even if morally sound. Perhaps this entire topic warrants a bit more discussion and design before we reland?

I was really hoping that this would make it into the release, so I'm reluctant to revert it since it'll result in clang-20 being considerably less usable for straight C/C++ on AMDGPU. I was mostly wondering if we knew anyone that did this in the wild or if this broke testing. The problem is that if we don't break this behavior then it's pretty much impossible to decouple ROCm's special magic from just using the AMD toolchain normally without passing -nogpulib to everything, which makes it considerably less usable from CMake. So, reverting this patch basically means choosing between breaking this behavior later, or not at all.

@AlexVlx
Contributor

AlexVlx commented Jan 28, 2025

I outlined an existing, as far as we (AMD) are concerned, use case, so if it'd be possible to revert, it'd be amazing! I suspect there's more than this, though, and the "we shouldn't fix X" is extremely uncharitable, even if morally sound. Perhaps this entire topic warrants a bit more discussion and design before we reland?

I was really hoping that this would make it into the release, so I'm reluctant to revert it since it'll result in clang-20 being considerably less usable for straight C/C++ on AMDGPU. I was mostly wondering if we knew anyone that did this in the wild or if this broke testing. The problem is that if we don't break this behavior then it's pretty much impossible to decouple ROCm's special magic from just using the AMD toolchain normally without passing -nogpulib to everything, which makes it considerably less usable from CMake. So, reverting this patch basically means choosing between breaking this behavior later, or not at all.

I am explicitly saying that we know of this being done in something that we care about. So please revert, and we can see when and how to break things.

jhuber6 added a commit that referenced this pull request Jan 28, 2025
#99687)"

Breaks some downstream AMD stuff apparently.
This reverts commit 6ff86f2.
@AlexVlx
Contributor

AlexVlx commented Jan 28, 2025

Many thanks, this is highly appreciated; let's try to coordinate internally.

jhuber6 added a commit to jhuber6/llvm-project that referenced this pull request Feb 4, 2025
…ly (llvm#99687)"

Summary:
This reverts commit 740e6ae.

After discussions it was determined that the behavior for IR inputs
needs to be maintained, at least for now. In the future we should put a
deprecation notice on this behavior. This patch keeps the old behavior
for OpenCL and IR inputs, while others will be standalone. This is good
enough for standard compile flows.
jhuber6 added a commit to jhuber6/llvm-project that referenced this pull request Feb 4, 2025
jhuber6 added a commit to jhuber6/llvm-project that referenced this pull request Feb 5, 2025
jhuber6 added a commit to jhuber6/llvm-project that referenced this pull request Feb 6, 2025
Labels
backend:AMDGPU clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang Clang issues not falling into any other category
6 participants