-
Notifications
You must be signed in to change notification settings - Fork 3.4k
Update libcxx and libcxxabi to LLVM 19.1.4 #22994
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update libcxx and libcxxabi to LLVM 19.1.4 #22994
Conversation
8936564
to
f619917
Compare
This is our emscripten-specific configuration file and was mistakenly deleted when I ran update_libcxx.py.
This file was added as a part of LLVM 18 update (emscripten-core#21638) in emscripten-core@8d51927 and mistakenly deleted when I ran update_libcxx.py. This file was copied from https://github.com/llvm/llvm-project/blob/aadaa00de76ed0c4987b97450dd638f63a385bed/libcxx/vendor/llvm/default_assertion_handler.in, so this also updates the file with the newest `default_assertion_handler.in`.
These Emscripten-specific files were mistakenly deleted when I ran update_libcxx.py.
https://github.com/llvm/llvm-project/blob/aadaa00de76ed0c4987b97450dd638f63a385bed/libcxx/src/expected.cpp was added in llvm/llvm-project#87390 and this file assumes C++23 to be compiled. Apparently libc++ sources are always built with C++23 so they don't guard things against it in `src/`: llvm/llvm-project#87390 (comment) This also bumps libc++abi to C++23 because... why not
We have excluded files in https://github.com/emscripten-core/emscripten/tree/main/system/lib/libcxx/src/support/win32. This is a new file added in this directory in llvm/llvm-project#83575.
We disabled C++20 time zone support in LLVM 18 update (emscripten-core#21638): emscripten-core@df9af64 The list of source files related to time zone support has changed in llvm/llvm-project#74928, so this commit reflects it.
Only `char`, `wchar`, `char8`, `char16`, and `char32` are valid specialization for `std::basic_string`: https://en.cppreference.com/w/cpp/string/basic_string But libc++ had a base template for `basic_string` that allows any type to be passed for a long time. It looks there have been several attempts to remove this but they restored it afterwards due to some complaints, in chronological order: llvm/llvm-project@aeecef0 llvm/llvm-project@08a0faf llvm/llvm-project@e30a148 llvm/llvm-project#66153 llvm/llvm-project#72694 The last one, llvm/llvm-project#72694, eventually removed it. So `std::basic_string<unsigned_char>` is not allowed anymore.
812cb80
to
643050f
Compare
Not sure why but some of them decreasd by ~3%. Increases don't seem to be meaningful; they are usually ~0.3%.
Only `char`, `wchar`, `char8`, `char16`, and `char32` are valid specialization for `std::basic_string`: https://en.cppreference.com/w/cpp/string/basic_string But libc++ had a base template for `basic_string` that allowed any type to be passed for a long time. It looks there have been several attempts to remove this after which they restored it due to complaints, in chronological order: llvm/llvm-project@aeecef0 llvm/llvm-project@08a0faf llvm/llvm-project@e30a148 llvm/llvm-project#66153 llvm/llvm-project#72694 The last one, llvm/llvm-project#72694, eventually removed it. So `std::basic_string<unsigned_char>` is not allowed anymore. This removes all uses of `std::basic_string<unsigned_char>` from embind. This needs to be done to update libc++ to LLVM 19 (emscripten-core#22994). I'm uploading this as a separate PR because this removes a functionality from embind.
Only `char`, `wchar`, `char8`, `char16`, and `char32` are valid specialization for `std::basic_string`: https://en.cppreference.com/w/cpp/string/basic_string But libc++ had a base template for `basic_string` that allowed any type to be passed for a long time. It looks there have been several attempts to remove this after which they restored it due to complaints, in chronological order: llvm/llvm-project@aeecef0 llvm/llvm-project@08a0faf llvm/llvm-project@e30a148 llvm/llvm-project#66153 llvm/llvm-project#72694 The last one, llvm/llvm-project#72694, eventually removed it. So `std::basic_string<unsigned_char>` is not allowed anymore. This removes all uses of `std::basic_string<unsigned_char>` from embind. This needs to be done to update libc++ to LLVM 19 (emscripten-core#22994). I'm uploading this as a separate PR because this removes a functionality from embind.
This reverts commit 805a8f0.
Only `char`, `wchar`, `char8`, `char16`, and `char32` are valid specializations for `std::basic_string`: https://en.cppreference.com/w/cpp/string/basic_string But libc++ had a base template for `basic_string` that allowed any type to be passed for a long time. It looks there have been several attempts to remove this after which they restored it due to complaints, in chronological order: llvm/llvm-project@aeecef0 llvm/llvm-project@08a0faf llvm/llvm-project@e30a148 llvm/llvm-project#66153 llvm/llvm-project#72694 The last one, llvm/llvm-project#72694, eventually removed it. So `std::basic_string<unsigned char>` is not allowed anymore. This removes all uses of `std::basic_string<unsigned char>` from embind. This needs to be done to update libc++ to LLVM 19 (emscripten-core#22994). I'm uploading this as a separate PR because this removes a functionality from embind.
When `deterministic_paths` is set, we are currently using `-ffile-prefix-map` to produce the same path in data and debug info. In the case of absolute paths, their emscripten path is replaced with a fake path `/emsdk/emscripten`, and in the case of relative paths, all path relative to the emscripten directory is removed, so `../../system/lib/somefile.c` becomes `system/lib/somefiles.c`. https://github.com/emscripten-core/emscripten/blob/f66b5d706e174d9e5cc6122c06ea29dcd2735cd0/tools/system_libs.py#L472-L477 https://github.com/emscripten-core/emscripten/blob/f66b5d706e174d9e5cc6122c06ea29dcd2735cd0/tools/system_libs.py#L495-L501 So this does not make relative paths and absolute paths the same, which can lead to different builds depending on whether the command line uses absolute paths vs. relative ones. Currently we use relative paths when `EMCC_BATCH_BUILD` is set. And Ninja builds cannot use relative paths. This is also what was suggested in emscripten-core#23195 (comment) while discussins `__FILE__` problem in emscripten-core#23915. This does not change any code size tests because no code size tests happens to contain `__FILE__` at the moment. (One will be added in emscripten-core#22994)
#23222) When `deterministic_paths` is set, we are currently using `-ffile-prefix-map` to produce the same path in data and debug info. In the case of absolute paths, their emscripten path is replaced with a fake path `/emsdk/emscripten`, and in the case of relative paths, all path relative to the emscripten directory is removed, so `../../system/lib/somefile.c` becomes `system/lib/somefiles.c`. https://github.com/emscripten-core/emscripten/blob/f66b5d706e174d9e5cc6122c06ea29dcd2735cd0/tools/system_libs.py#L472-L477 https://github.com/emscripten-core/emscripten/blob/f66b5d706e174d9e5cc6122c06ea29dcd2735cd0/tools/system_libs.py#L495-L501 So this does not make relative paths and absolute paths the same, which can lead to different builds depending on whether the command line uses absolute paths vs. relative ones. Currently we use relative paths when `EMCC_BATCH_BUILD` is set. And Ninja builds cannot use relative paths. This is also what was suggested in #23195 (comment) while discussins `__FILE__` problem in #23195. This does not change any code size tests because no code size tests happens to contain `__FILE__` at the moment. (One will be added in #22994)
This is an automatic change generated by tools/maint/rebaseline_tests.py. The following (11) test expectation files were updated by running the tests with `--rebaseline`: ``` code_size/embind_hello_wasm.json: 18011 => 17910 [-101 bytes / -0.56%] code_size/embind_val_wasm.json: 17198 => 16938 [-260 bytes / -1.51%] other/codesize/test_codesize_cxx_ctors1.size: 128894 => 129570 [+676 bytes / +0.52%] other/codesize/test_codesize_cxx_ctors2.size: 128343 => 129019 [+676 bytes / +0.53%] other/codesize/test_codesize_cxx_except.size: 170950 => 171248 [+298 bytes / +0.17%] other/codesize/test_codesize_cxx_except_wasm.size: 142154 => 142510 [+356 bytes / +0.25%] other/codesize/test_codesize_cxx_except_wasm_exnref.size: 144741 => 145056 [+315 bytes / +0.22%] other/codesize/test_codesize_cxx_lto.size: 121844 => 122262 [+418 bytes / +0.34%] other/codesize/test_codesize_cxx_mangle.size: 232468 => 233088 [+620 bytes / +0.27%] other/codesize/test_codesize_cxx_noexcept.size: 131701 => 132146 [+445 bytes / +0.34%] other/codesize/test_codesize_cxx_wasmfs.size: 168848 => 169407 [+559 bytes / +0.33%] Average change: +0.08% (-1.51% - +0.53%) ```
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC from reading the comments non of the change in the system
directory contains any new modifications (they all come from upstream and/or preexisting changes)?
.circleci/config.yml
Outdated
@@ -808,7 +808,8 @@ jobs: | |||
- run-tests-linux: | |||
# some native-dependent tests fail because of the lack of native | |||
# headers on emsdk-bundled clang | |||
test_targets: "other skip:other.test_native_link_error_message" | |||
test_targets: " | |||
other.test_minimal_runtime_code_size_hello_embind" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess you can revert this file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I thought I reverted all CI testing changes but this remained... Reverted.
This reverts commit 76a4466.
I had omitted a few changes before, but re-added all additional changes to the PR description now. |
Our prevous `__assertion_handler` was copied from `libcxx/vendor/llvm/default_assertion_handler.in`. This updates out `__assertion_handler` to the new `libcxx/vendor/llvm/default_assertion_handler.in`. For what this file is, see the PR description of emscripten-core#22994, with which the file was added.
Our prevous `__assertion_handler` was copied from `libcxx/vendor/llvm/default_assertion_handler.in`. This updates out `__assertion_handler` to the new `libcxx/vendor/llvm/default_assertion_handler.in`. For what this file is, see the PR description of emscripten-core#22994, with which the file was added.
This updates libcxx and libcxxabi to LLVM 20.1.4: https://github.com/llvm/llvm-project/releases/tag/llvmorg-20.1.4 Before going into each additional change of the PR, these are some noteworthy changes from this LLVM release: - Freezing C++03 headers - RFC: https://discourse.llvm.org/t/rfc-freezing-c-03-headers-in-libc/77319 - PRs: - llvm/llvm-project#108999 - llvm/llvm-project#109000 - llvm/llvm-project#109001 - llvm/llvm-project#109002 - This copies libc++ headers of the last LLVM 19 release into a separate directory (https://github.com/llvm/llvm-project/tree/main/libcxx/include/__cxx03) and redirects all C++03 workflow there. The motivation is not to fix C++03 related changes unless they are critical bugs, and simplifies the main headers. So the main headers are like ``` #if __cplusplus < 201103L && defined(_LIBCPP_USE_FROZEN_CXX03_HEADERS) # include <__cxx03/algorithm> #else ... main header content ... ``` As you can see it looks like we can avoid opting into this by not defining `_LIBCPP_USE_FROZEN_CXX03_HEADERS` at the moment. I think their eventual goal is to remove C++03 support from the main headers but it hasn't seem to have happened yet and I don't know what their timeline for that is. Adding that `__cxx03` directory increases the header size by 10MB. We can get away not using them by not defining `_LIBCPP_USE_FROZEN_CXX03_HEADERS` in this release, so this update does not include that directory. - Sharing of headers between libc++ and libc (Project Hand in Hand) - RFC: https://discourse.llvm.org/t/rfc-project-hand-in-hand-llvm-libc-libc-code-sharing/77701 - PR: llvm/llvm-project#91651 (More are likely to come) This tries to share some libc headers from libcxx. libcxx's source files can depend on headers from `libc/shared`, `libc/src/__support`, `libc/include/llvm-libc-macros`, and `libc/include/llvm-libc-types`. And it turns out libcxx can also depend on `libc/hdr`, even though the directory is not in the diagram in the RFC. These headers can be only included from `libcxx/src` and not `libcxx/include`, to prevent libcxx's API from changing. So these headers don't need to be copied into `cache/`. - Locale API reimplementation It doesn't seem to have an RFC, but the PRs are: - llvm/llvm-project#113737 - llvm/llvm-project#115176 - llvm/llvm-project#115752 - llvm/llvm-project#122489 It looks this aims provide a new way to define locale API for each platform. Currently Apple, FreeBSD, MSVCRT, and Fuschia are using the new API and the others are still using the old one: https://github.com/llvm/llvm-project/blob/ec28b8f9cc7f2ac187d8a617a6d08d5e56f9120e/libcxx/include/__locale_dir/locale_base_api.h#L116-L138 For our purpose, adding `if defined(__EMSCRIPTEN__)` entry to the list of "old" list seems to work for the moment, but we may need to move to the new way eventually later. --- Additional changes: - Copy `vendor/llvm/default_assertion_handler.in` to `__assertion_handler`: aa53648 Our previous `__assertion_handler` was copied from `libcxx/vendor/llvm/default_assertion_handler.in`. This updates our `__assertion_handler` to the new `libcxx/vendor/llvm/default_assertion_handler.in`. For what this file is, see the PR description of #22994, with which the file was added. - Remove `libcxx/include/__cxx03` directory: d646f6b As I described in "Freezing C++ headers" above, C++03 headers were copied to `include/cxx03` to be "frozen". But by not defining `_LIBCPP_USE_FROZEN_CXX03_HEADERS` we can still use the main headers. This new `__cxx03` header directory is almost 10M and we are not using it in this update, this deletes it. - Define more variables in `__config_site`: 9912236 libcxx decides to almost all configuration macros to be defined. So before it tested whether a macro was defined/undefined, and now it assumes it is defined and tests whether its value is 1/0. Before llvm/llvm-project#112094 `_config_site.in` used to use `#cmakedefine`, which only defined variables when they were enabled, but now it uses `#cmakedefine01`, which always defines variables and assigns 1 or 0 depending on whether the feature is enabled. So this change adds `#define` to each variable that `_config_site.in` has an entry of. The value assigned is 1 if it was defined in our previous Emscripten environment and 0 if not. - Fix Emscripten's locale: 2ae01b0 Before, the code was like https://github.com/emscripten-core/emscripten/blob/dc1abd514b1bade135a01a4453a9ff6def0793b6/system/lib/libcxx/include/__locale_dir/locale_base_api.h#L12-L30 But now they are divided into two parts, one using the new API and the other using old. See "Locale API reimplementation" above. https://github.com/llvm/llvm-project/blob/ec28b8f9cc7f2ac187d8a617a6d08d5e56f9120e/libcxx/include/__locale_dir/locale_base_api.h#L116-L138 This adds Emscripten in the beginning of the "old" API list. (This has to be the beginning; see #23414) - Import libc headers used by libcxx: 12a4ee4, 12a4ee4 and 43c8ce4 This imports a part of libc headers into `system/lib/llvm-libc`. The imported directories are: - `libc/shared` - `libc/src/__support` - `libc/include/llvm-libc-macros` - `libc/include/llvm-libc-types` - `libc/hdr` See "sharing of headers between libc+ and libc" above for details. This also applies llvm/llvm-project#133999, which is a bugfix that has not be backported, which fixes the bug of including from a wrong directory. - `std::uncaught_exception` -> `std::uncaught_exceptions`: 1bf4e78 `std::uncaught_exception` has been deprecated in C++17, so it generates a warning (which we treat as an error). Replaced it with `std::uncaught_exceptions`, which returns the number of uncaught exceptions (but can still be used as a boolean in the test). - Remove `ryu_constants.h` and `ryu_long_double_constants.h` from libc: 5767ac4 These are not used from libcxx and `ryu_long_double_constants.h` is huge (12M).
This updates libcxx and libcxxabi to LLVM 19.1.4:
https://github.com/llvm/llvm-project/releases/tag/llvmorg-19.1.4
The initial update was done using
update_libcxx.py
andupdate_libcxxabi.py
, and subsequent fixes were made in indidual commits. The commit history here is kind of messy because of CI testing so not all individual commits are noteworthy.Additional changes:
Build libcxx and libcxxabi with C++23: 8b0bfdf
https://github.com/llvm/llvm-project/blob/aadaa00de76ed0c4987b97450dd638f63a385bed/libcxx/src/expected.cpp
was added in [libc++] Use availability to rely on key functions for bad_expected_access and bad_function_call llvm/llvm-project#87390 and this file assumes C++23 to be compiled. Apparently libc++ sources are always built with C++23 so they don't guard things against it in
src/
: [libc++] Use availability to rely on key functions for bad_expected_access and bad_function_call llvm/llvm-project#87390 (comment)This commit also builds libc++abi with C++23 because it doesn't seem there's any downside to it.
Exclude newly added
compiler_rt_shims.cpp
: 5bbcbf0We have excluded files in https://github.com/emscripten-core/emscripten/tree/main/system/lib/libcxx/src/support/win32.
This is a new file added in this directory in [libc++] Use _Complex for multiplication and division of complex floating point types llvm/llvm-project#83575.
Disable time zone support: a5f2cbe
We disabled C++20 time zone support in LLVM 18 update (Update libcxx and libcxxabi to LLVM 18.1.2 #21638): df9af64
The list of source files related to time zone support has changed in [libc++][chrono] Loads tzdata.zi in tzdb. llvm/llvm-project#74928, so this commit reflects it.
Re-add + update
__assertion_handler
fromdefault_assertion_handler.in
: 41f8037This file was added as a part of LLVM 18 update (Update libcxx and libcxxabi to LLVM 18.1.2 #21638) in 8d51927 and mistakenly deleted when I ran
update_libcxx.py
. This file was copied from https://github.com/llvm/llvm-project/blob/aadaa00de76ed0c4987b97450dd638f63a385bed/libcxx/vendor/llvm/default_assertion_handler.in, so this also updates the file with the newestdefault_assertion_handler.in
._LIBCPP_PSTL_CPU_BACKEND_SERIAL
->_LIBCPP_PSTL_BACKEND_SERIAL
: 4b969c3The name changed in this update, so reflecting it on our
__config_site
.Directly include
pthread.h
fromemscripten/val.h
: a5a76c3Due to file rearrangements happened, this was necessary.
Other things to note:
std::basic_string<unsigned_char>
is not supported anymoreThe support for
std::basic_string<unsigned_char>
, which was being used by embind, is removed in this version.Remove basic_string<unsigned char> from embind #23070 removes the uses from embind.libcxxabi uses
__FILE__
in more places[libc++abi] Replace usage of raw assert by _LIBCXXABI_ASSERT llvm/llvm-project#80689 started to use
_LIBCXXABI_ASSERT
, which uses__FILE__
, inprivate_typeinfo.cpp
.__FILE__
macro produces different paths depending on the local directory names and how the file is given to clang in the command line, and this file is included in the result of one of our code size tests,other.test_minimal_runtime_code_size_hello_embind
, which caused the result of the test to vary depending on the CI bots and how the library is built (e.g., whether by embuilder, ninja, or neither).Even though this was brought to surface during this LLVM 19 update,
__FILE__
macro could be a problem for producing reproducible builds anyway. We discussed this problem in Handling __FILE__ in code size tests in CircleCI #23195, and the fixes landed in Use consistent prefix map when building deterministic system libraries #23222, Add deterministic prefix in Library.get_flags() (NFC) #23225, and Always enable deterministic_paths #23256.