Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for libc++ std modules #425

Merged
merged 7 commits into from
May 2, 2024
Merged

Support for libc++ std modules #425

merged 7 commits into from
May 2, 2024

Conversation

mstorsjo
Copy link
Owner

@mstorsjo mstorsjo commented Apr 23, 2024

This is a draft for various fixes needed for using the libcxx C++23 import std modules (plus a bunch of other commits for testing it in the github actions pipelines, and for trying to speed up consecutive reruns).

Most of the changes here are to our packaging/setup overall, but there are a couple of patches needed for mingw-w64 - I'm trying to test and evaluate them before trying to upstream them.

See https://github.com/mstorsjo/llvm-mingw/actions/runs/8802495666 for artifacts with the latest round of patches (where the final artifacts might be done in a couple of hours), or https://github.com/mstorsjo/llvm-mingw/actions/runs/8799865568 for a completed set of artifacts from an earlier build (with an earlier iteration of the patches).

CC @huangqinjin @nolange.

@mstorsjo mstorsjo force-pushed the libcxx-modules branch 2 times, most recently from 93d3fe7 to e2fd7f0 Compare April 26, 2024 19:39
@mstorsjo
Copy link
Owner Author

The necessary patches have been merged upstream, in llvm-project and mingw-w64 (and backported to the 18.x release branch of LLVM), so this should work now. If you're interested, have a look - I'll potentially merge these commits (except for the ones marked WIP, which are included only for testing) so they'd be included with the new llvm-mingw release for the LLVM 18.1.5 point release next week.

@cristianadam
Copy link

With modules in LLVM-MinGW we should be able to get more libraries ported https://arewemodulesyet.org/ 😀

@mstorsjo
Copy link
Owner Author

With modules in LLVM-MinGW we should be able to get more libraries ported https://arewemodulesyet.org/ 😀

:-)

Note that there are known issues with C++ module interfaces (not specifically to the C++23 std module) in e.g. #421.

DEFAULT_TARGET=x86_64-w64-mingw32
if [ "$TARGET" = "$BASENAME" ]; then
TARGET=$DEFAULT_TARGET
fi

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think clang-scan-deps itself is target irrelevant (like other llvm-* tools), the arguments after -- must be the actual command line to generate object files. In case we are using unprefixed compiler and TARGET here is used for deps scanning, we may get wrong results if the actual default target of the compiler is not TARGET.
So I think if we are using unprefixed compiler, we should not add -target for deps scanning.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My intent here, is that if clang-scan-deps-wrapper.sh is invoked, it means that you want whatever behaviours that the wrapper implies - i.e. defaulting to a mingw target.

On unix, if you'd invoke e.g. x86_64-w64-mingw32-clang-scan-deps -- clang ..., we would invoke clang-scan-deps-real -- clang -target x86_64-w64-mingw32 ..., which I think is a reasonable result. If the user did provide a custom -target option on the command line, it will come after the -target we injected, so the user provided option should take effect.

If you invoke x86_64-w64-mingw32-clang-scan-deps -- armv7-w64-mingw32-clang ..., then we end up invoking clang-scan-deps-real -- armv7-w64-mingw32-clang -target armv7-w64-mingw32, which probably is the most accurate interpretation of that situation.

One specific corner case regarding defaults here, is that I've tried to design the wrapper setup, so that you could use any clang executables with any defaults, e.g. the official releases that default to targeting MSVC. If you, on Windows, invoke clang-scan-deps -- clang ..., then clang-scan-deps is a wrapper too, which will default to injecting -target <arch>-w64-mingw32. This makes sure it works even if clang-scan-deps-real.exe is a build that defaults to MSVC targets.

On Windows, clang-scan-deps.exe is an executable wrapper, and the real clang binary is clang-scan-deps-real.exe. On Unix, we leave clang-scan-deps as it was, as the original vanilla binary, so the case of default implicit target shouldn't really matter.

Copy link

@huangqinjin huangqinjin Apr 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On unix, if you'd invoke e.g. x86_64-w64-mingw32-clang-scan-deps -- clang ..., we would invoke clang-scan-deps-real -- clang -target x86_64-w64-mingw32 ...

We can see how cmake invoking clang-scan-deps here:
https://github.com/Kitware/CMake/blob/v3.29.0/Modules/Compiler/Clang-CXX.cmake#L49-L56
The arguments after -- is almost same as how cmake produce the object file:
https://github.com/Kitware/CMake/blob/v3.29.0/Modules/CMakeCXXInformation.cmake#L284

So basically, we first invoke x86_64-w64-mingw32-clang-scan-deps -- clang ... to get deps to know the compiling order of all source files (no actual compiling, no object files at this phase), then we invoke exactly the same clang ... command (plus module map information) to produce object files in order.
Now the problem is we scan deps targeting x86_64-w64-mingw32, but compile the sources targeting default (on unix, it is x86_64-unknown-linux-gnu).

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now the problem is we scan deps targeting x86_64-w64-mingw32, but compile the sources targeting default (on unix, it is x86_64-unknown-linux-gnu).

Yes, that's true.

But normally this wouldn't happen on its own - as CMake chooses to invoke <triple>-clang-scan-deps because the compiler is called <triple>-clang or similar, right?

Now if we put it another way - if you invoke <triple>-<tool>, that generally also expects getting <triple> flavoured defaults within that tool. So it's not entirely farfetched, to expect that if you run <triple>-clang-scan-deps, that you'd want <triple> as some sort of default.

Now in this particular case, on Unix, you're right that this wouldn't match what the actual compilation command would do. But then again, why would the caller (the user, or cmake) go out of their way to call clang-scan-deps in that form?

Anyway, when the command following the -- is a plain clang, you're right that it really does do compilation for a Unix target, so injecting our own target isn't right. But then we also shouldn't be injecting -stdlib=libc++ either, since that's also specific for the mingw targets in the llvm-mingw toolchain.

Secondly, what if you're invoking x86_64-w64-mingw32-clang-scan-deps -- nonobvious-wrapper-script-clang .... We don't have such cases in llvm-mingw right now, but it's not implausible to have wrappers that aren't a plain triple prefix. In this case, the command nonobvious-wrapper-script-clang does invoke clang with some non-default settings, and we don't really know what they are. In that case, my point is that we're invoking x86_64-w64-mingw32-clang-scan-deps because we want to apply mingw style defaults, even if that's not obvious from the command after --.

In any case, what kind of behaviour to imply in this case is a bit of theoretical corner case - I think both behaviours can be considered reasonable. I can try to implement what you're suggesting, that we entirely ignore the prefix of the clang-scan-deps wrapper and only inspect the nested command.

But the other case I mentioned, where we do need to force our defaults, is in Windows builds. Because as a design feature, clang-18.exe and clang-scan-deps-real.exe can be binaries that have any defaults (they could default to a MSVC target). So when we invoke clang-scan-deps -- clang ..., where clang-scan-deps is our wrapper, we do want to force adding our default -target.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, see a29cee5 for an attempt at doing what you suggest here.

I guess this could be fine with me, too. (And it does simplify the code, so that's always nice of course.)

But as for intended behaviour, I have another hypothetical example to consider, as a thought experiment:

If we're run as x86_64-w64-mingw32-clang-scan-deps -- clang ..., we'll invoke clang-scan-deps -- clang ... with no extra -target or any extra arguments. If we're run as x86_64-w64-mingw32-clang-scan-deps -- armv7-w64-mingw32-clang ..., we'll invoke clang-scan-deps -- armv7-w64-mingw32-clang -target armv7-w64-mingw32 -stdlib=libc++ .... Those are the simple cases.

But what if we're invoked as x86_64-w64-mingw32-clang-scan-deps -- aarch64-linux-gnu-clang ..., we'll invoke clang-scan-deps -- aarch64-linux-gnu-clang -target aarch64-linux-gnu -stdlib=libc++ .... But the -stdlib=libc++ option doesn't necessarily apply here. So if we continue this thought experiment, we should maybe inspect the target triple and only apply our default -stdlib=libc++ if the target looks like a mingw target triple.

It sounds a bit far fetched - but my point is that once we're invoked via the mingw specific wrapper, it's not entirely unreasonable to assume mingw specific target behaviours anyway.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my point is that once we're invoked via the mingw specific wrapper, it's not entirely unreasonable to assume mingw specific target behaviours anyway.

I fully support this. My main argument here is that clang-scan-deps is actually target independent, so the prefix should not have side effect, it is just a conventional naming scheme to avoid conflicit with host clang-scan-deps.

I also support that the clang-scan-deps wrapper should have some defaults so it is consistent with clang, and is consistent within the whole toolchain. Like you pointed out specifically the windows case.

Because as a design feature, clang-18.exe and clang-scan-deps-real.exe can be binaries that have any defaults (they could default to a MSVC target). So when we invoke clang-scan-deps -- clang ..., where clang-scan-deps is our wrapper, we do want to force adding our default -target.

Out of topic: can wrappers provided by llvm-mingw be used with LLVM official build, to support mingw target build?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fully support this. My main argument here is that clang-scan-deps is actually target independent, so the prefix should not have side effect, it is just a conventional naming scheme to avoid conflicit with host clang-scan-deps.

Ok, fair enough.

Anyway, my follow-up is mostly that once we remove the case where we always default to setting a mingw target by default, the -stdlib=libc++ flag stood out like a weird extra addition there. I tested another addition to it, to make that conditional like this: 3cc7cda
I'm not entirely sure which way I prefer here - I presume you prefer the state after a29cee5 at least, with or without that change?

Out of topic: can wrappers provided by llvm-mingw be used with LLVM official build, to support mingw target build?

In theory, yes, although I haven't actually tried to set it up.

On Unix, you can nowadays build a llvm-mingw toolchain around a preexisting install of Clang, see https://github.com/mstorsjo/llvm-mingw/blob/master/Dockerfile.system-clang for an example of this, running build-all.sh with --host-clang.

For toolchains running on Windows, there's no such premade setup script though, and it's a little tricky as we mostly cross compile those toolchains from Unix. But I guess it should be possible to alter that, like the build-all.sh case with --host-clang above.

But e.g. for a preexisting llvm-mingw toolchain on Windows, it should be possible to replace any of the actual Clang/LLVM binaries with ones from any build. E.g. for the main clang executable, in our windows toolchains, these are named e.g. clang-18.exe, and this name is hardcoded in the executable wrappers. Swapping that out for any other Clang 18.x build should be a complete drop-in. If you want to use a Clang 19.x build, it's fine to just drop it in and rename it to clang-18.exe, that should work, even if it's misnamed. But if you do that, you'll need to rename or make a copy of <prefix>/lib/clang/18 and name it <prefix>/lib/clang/19 for the new Clang to find it.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not entirely sure which way I prefer here - I presume you prefer the state after a29cee5 at least, with or without that change?

Yes, both look good to me. Either one will work correctly since we expect user (cmake) always use prefixed tools consistently.

if [ "$CMD_TARGET" != "$CMD_BASENAME" ]; then
case $CMD_EXE_SUFFIX in
clang|clang++|gcc|g++|c++|as|cc|c99|c11)
TARGET="$CMD_TARGET"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since clang can infer the target triple implicitly, does it make sense to have a upstream fix for clang-scan-deps to infer the target triple from CMD_EXE ? ( not from clang-scan-deps itself ! )

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think it could make sense to try make upstream clang-scan-deps infer triples in such a similar way, possibly looking both at CMD_EXE and argv[0], like this script does. But that would be a later feature, maybe for LLVM 19 or so, depending on whether others agree with it.

Even if we do that, I fear we can't get rid of our wrapper script, since we need it to set -stdlib=libc++.

@huangqinjin
Copy link

The necessary patches have been merged upstream, in llvm-project and mingw-w64 (and backported to the 18.x release branch of LLVM), so this should work now.

@mstorsjo Great job! Could you please point out the corresponding upstream commits/PRs here for reference. Thanks!

@mstorsjo
Copy link
Owner Author

The necessary patches have been merged upstream, in llvm-project and mingw-w64 (and backported to the 18.x release branch of LLVM), so this should work now.

@mstorsjo Great job! Could you please point out the corresponding upstream commits/PRs here for reference. Thanks!

Thanks for having a look!

llvm/llvm-project@b9b7381 was the only change for libcxx.

For mingw-w64, I did mingw-w64/mingw-w64@30c0b62, mingw-w64/mingw-w64@847164b and mingw-w64/mingw-w64@1652e92.

@mstorsjo mstorsjo force-pushed the libcxx-modules branch 3 times, most recently from 27c4451 to e055a6f Compare April 30, 2024 12:20
nolange and others added 7 commits May 2, 2024 10:13
By default these files would be installed into
${CMAKE_INSTALL_PREFIX}/share/libc++/v1, where
${CMAKE_INSTALL_PREFIX} is equal to ${PREFIX}/${arch}-w64-mingw32.

However, to avoid duplicating these files for each architecture,
install them into ${PREFIX}/share/libc++/v1 instead.

While the files aren't very large (currently weighing in at around
612 KB), they're architecture independent files (and they're
more than a few; they're currently 135 files), so we should
share them if possible.
This picks the architecture by looking for a triple prefix in the
nested command, and passes it as an explicit flag, to let
clang-scan-deps find it.

We also pass -stdlib=libc++, to let it find the libc++ headers
as expected.
…dows

When setting up wrappers for Windows, rename clang-scan-deps.exe to
clang-scan-deps-real.exe, make clang-scan-deps.exe a wrapper, and
configure the wrapper to target executing clang-scan-deps-real.exe.

This makes sure that we always pass in crucial options like
-stdlib=libc++.

By default, our Clang executables are built with that option
hardcoded, configured with CLANG_DEFAULT_CXX_STDLIB=libc++
when built for Windows. However, we generally don't rely on this;
we assume the tools could be plain default executables, so we
try to provide wrappers on all common access paths.
@nolange
Copy link
Contributor

nolange commented May 2, 2024

I am a bit out of the loop, dont have the time to follow everything you 2 did (much appreciated btw.).

But I am wondering if clang-scan-deps is even used outside CMake at the moment?
Because with CMake I would expect that you set CMAKE_<LANG>_COMPILER_TARGET and CMake should figure everything out.

No need for wrappers with CMake, if you use the llvm-mingw toolchain with a fitting toolchain-file you do not need any symlinks or wrapper scripts (I am using it exclusively that way).

I am planing to to a PR one day, but this is what I actually use, needs to put into the files share/llvm-aarch64-w64-mingw32_toolchainfile.cmake share/llvm-armv7-w64-mingw32_toolchainfile.cmake share/llvm-i686-w64-mingw32_toolchainfile.cmake share/llvm-x86_64-w64-mingw32_toolchainfile.cmake:

if(CMAKE_VERSION VERSION_GREATER_EQUAL "3.20")
  set(_prefix "${CMAKE_CURRENT_LIST_DIR}/../")
  cmake_path(ABSOLUTE_PATH _prefix NORMALIZE)
  set(_filename "${CMAKE_CURRENT_LIST_FILE}")
  cmake_path(GET _filename FILENAME _filename)
else()
  get_filename_component(_prefix "${CMAKE_CURRENT_LIST_DIR}/../" ABSOLUTE)
  get_filename_component(_filename "${CMAKE_CURRENT_LIST_FILE}" NAME)
endif()

string(REGEX REPLACE "^llvm[_-]\(.*\)[_-]toolchain.*" "\\1" _arch "${_filename}")
string(REGEX REPLACE "^.*-mingw32\(.*\)" "\\1" _mingwtype "${_arch}")
string(REGEX REPLACE "-.*" "" _arch "${_arch}")
unset(_filename)

set(_exesuff)
if(CMAKE_HOST_WIN32)
set(_exesuff .exe)
endif()

set(CMAKE_SYSTEM_NAME Windows)
if(_arch)
  set(CMAKE_SYSTEM_PROCESSOR ${_arch})
else()
  set(CMAKE_SYSTEM_PROCESSOR ${CMAKE_HOST_SYSTEM_PROCESSOR})
endif()

set(CMAKE_ASM_COMPILER "${_prefix}bin/clang${_exesuff}")
set(CMAKE_C_COMPILER "${_prefix}bin/clang${_exesuff}")
set(CMAKE_CXX_COMPILER "${_prefix}bin/clang++${_exesuff}")
set(CMAKE_RC_COMPILER "${_prefix}bin/llvm-rc${_exesuff}")

set(CMAKE_CXX_FLAGS_INIT "-stdlib=libc++")

foreach(_lang ASM C CXX)
  set(CMAKE_${_lang}_COMPILER_TARGET ${CMAKE_SYSTEM_PROCESSOR}-w64-mingw32${_mingwtype})
  if(_mingwtype STREQUAL "uwp")
    set(CMAKE_${_lang}_FLAGS_INIT "${CMAKE_${_lang}_FLAGS_INIT} -D_WIN32_WINNT=0x0A00 -DWINVER=0x0A00 -DWINAPI_FAMILY=WINAPI_FAMILY_APP -DUNICODE -D_UCRT")
  endif()
endforeach()
unset(_lang)

set(_linker lld)
# set(_linker_extra " -Wl,--undefined-version")
if(_mingwtype STREQUAL "uwp")
  set(_linker_extra "${_linker_extra} -Wl,-lwindowsapp -Wl,-lucrtapp")
endif()
set(CMAKE_EXE_LINKER_FLAGS_INIT "--start-no-unused-arguments -stdlib=libc++ -fuse-ld=${_linker} -rtlib=compiler-rt -unwindlib=libunwind${_linker_extra} --end-no-unused-arguments")
set(CMAKE_MODULE_LINKER_FLAGS_INIT "${CMAKE_EXE_LINKER_FLAGS_INIT}")
set(CMAKE_SHARED_LINKER_FLAGS_INIT "${CMAKE_EXE_LINKER_FLAGS_INIT}")
set(CMAKE_LINKER ${_prefix}bin/ld.${_linker})
unset(_linker_extra)
unset(_linker)

set(CMAKE_SYSROOT "${_prefix}${CMAKE_SYSTEM_PROCESSOR}-w64-mingw32")

unset(_mingwtype)
unset(_prefix)

set(ENV{PKG_CONFIG_SYSROOT_DIR} "${CMAKE_SYSROOT}")
if($ENV{PKG_CONFIG_LIBDIR})
	set(ENV{PKG_CONFIG_LIBDIR} "${CMAKE_SYSROOT}/usr/lib/pkgconfig:${CMAKE_SYSROOT}/usr/share/pkgconfig:$ENV{PKG_CONFIG_LIBDIR}")
else()
	set(ENV{PKG_CONFIG_LIBDIR} "${CMAKE_SYSROOT}/usr/lib/pkgconfig:${CMAKE_SYSROOT}/usr/share/pkgconfig")
endif()

set(CMAKE_FIND_ROOT_PATH_MODE_PROGRAM NEVER)
set(CMAKE_FIND_ROOT_PATH_MODE_LIBRARY ONLY)
set(CMAKE_FIND_ROOT_PATH_MODE_INCLUDE ONLY)

@mstorsjo
Copy link
Owner Author

mstorsjo commented May 2, 2024

But I am wondering if clang-scan-deps is even used outside CMake at the moment? Because with CMake I would expect that you set CMAKE__COMPILER_TARGET and CMake should figure everything out.

Not sure how much other build systems even support C++ modules yet; once they do, they most probably will need to use clang-scan-deps in one form or another.

Yes, it’s certainly possible to avoid the need for wrappers, if invoking clang-scan-deps directly with the right argument - but then one also need to make sure to supply the -stdlib=libc++ option.

No need for wrappers with CMake, if you use the llvm-mingw toolchain with a fitting toolchain-file you do not need any symlinks or wrapper scripts (I am using it exclusively that way).

Sure, that’s one way of doing it. Whether it is the preferred way or not, probably mostly is personal preference. The main design behind llvm-mingw so far is the opposite; providing wrappers that contain all these settings, to avoid needing complicated cmake setup (yes, I know it’s mostly contained within the toolchain file - but still), where it is enough to supply the names of executables, or with some build systems, just supply the triple prefix.

As for making a PR to add such files, I’m a little undecided. It’s not my preferred way of using cmake. Adding it might not be harmful in itself even if I don’t use it, but it adds another file to maintain and test that it stays working. So I’m not quite sold on the concept.

@mstorsjo mstorsjo marked this pull request as ready for review May 2, 2024 17:03
@mstorsjo
Copy link
Owner Author

mstorsjo commented May 2, 2024

I’ll merge this PR now, so these additions are included in the new release that will be out soon.

@mstorsjo mstorsjo merged commit 6c748d2 into master May 2, 2024
38 checks passed
@mstorsjo mstorsjo deleted the libcxx-modules branch May 2, 2024 17:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants