-
Notifications
You must be signed in to change notification settings - Fork 79
Implement D3663R2 (Future-proof submdspan_mapping) #408
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Draft
mhoemmen
wants to merge
46
commits into
kokkos:stable
Choose a base branch
from
mhoemmen:P3663
base: stable
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
It currently requires the C++23 feature "deducing this" in order to compile. I've tested it with dev Clang (21.0.0). It should work fine with Clang 20.
Also implement exposition-only __mdspan_integral_constant_like concept, which is useful for canonicalization and Mandates.
sizeof...(Slices) == 0 case builds and passes tests.
Set the following CMake options. -DCMAKE_CXX_FLAGS="-std=c++2c" -DMDSPAN_CXX_STANDARD=26 You'll get a warning, but it seems to work.
* canonical-ice * subtract-ice * de-ice * check-static-bounds
submdspan_canonicalize_slices now works for one strided_slice. Fix canonical_ice (wrong order in subtraction; wording is fine).
Attempt to change over submdspan to use canonicalization without doing much else. It fails to build; a lot of layout mappings seem to get the wrong mapping result type. It builds and passes tests with P3663 support off.
submdspan builds now when submdspan_mapping is called with canonical slices, but the submdspan test fails. Investigate whether this is a bug in the implementation.
Fix extent computation for std::complex. It builds and passes tests. NEXT STEPS: * Make it ill-formed to call submdspan_mapping with non-canonical slice types * Make P3663 version only use constant_wrapper and canonical slice types in submdspan_mapping_impl functions
(submdspan_mapping_impl)
All submdspan_mapping_impl functions now check that every k-th slice type is a canonical k-th slice type. This ensures that submdspan_mapping_impl is only well-formed for canonical k-th slice types, and is ill-formed otherwise.
in P3663 branch only.
Don't need to specialize for all integral-constant-like, just for std::constant_wrapper.
It currently exercises the following cases for a variety of extents types. * full_extent_t * integral_constant<IndexType, Value> * IndexType
Improve testing for the case where a structured binding of the slice into two elements is valid. Test includes both aggregates and non-aggregates that opt into the tuple protocol.
The test should now have complete coverage of all the cases, for all combinations of static and/or dynamic extents.
Currently only the CPU version works.
The current (non-P3663) mdspan implementation incorrectly cannot handle index or slice types that are convertible to index_type, but are not integral-non-bool. This commit fixes that, and adds unit tests for both layout mapping indexing and submdspan slicing.
Make sure that the benchmark gets the MDSPAN_ENABLE_P3663 definition (it wasn't before). Add new benchmarks that make it harder for the compiler to optimize away code. (They actually change all the values in the mdspan and check that the changes are mathematically correct.) In benchmark, for the slice type that's convertible to full_extent_t, make the conversion operator noexcept.
CUDA version is as yet untested. It probably won't compile, at least because the P3663 implementation currently relies on C++23 or 26 language features that may not yet be implemented in NVCC.
... in constant_wrapper. The issue is with the default argument of the second template argument of constant_wrapper. typename unspecified = typename decltype(exposition_only::cw_fixed_value(X))::type Replacing that expression with use of the following alias doesn't help. namespace exposition_only { template<auto X> using unspecified_t = typename decltype(cw_fixed_value(X))::type; } Moving the definitions of the various specializations of cw_fixed_value above this point doesn't help either. This generally works fine even with Clang 21.
Work around lack of parameter pack indexing and fix constant_wrapper's assignment operator. After some adjustments to the constant_wrapper implementation, I found out that its overloaded arithmetic operators -- specifically, binary operator+ -- do not work. .../submdspan_extents.hpp:96:22: error: static assertion failed 96 | static_assert(std::is_same_v< | ~~~~~^~~~~~~~~~ 97 | decltype(counter + std::cw<size_t(1)>), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 98 | std::constant_wrapper<counter_value + size_t(1)> | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 99 | >); Binary operator+ doesn't kick in; the first branch of the `is_same_v` is `unsigned long`, i.e., `size_t`.
Define MDSPAN_CONSTANT_WRAPPER_GCC_WORKAROUND to 1 to get the work-around for GCC 11.4.0 C++20. The key is not to rely on constant_wrapper's binary operator-. I've tried a few designs for overloaded operators on constant_wrapper, but none of them work with GCC 11.4.0.
Remove constant_wrapper's overloaded arithmetic operators, since the work-around doesn't need them.
* Rename option for working around constant_wrapper compiler issues from MDSPAN_CONSTANT_WRAPPER_GCC_WORKAROUND to MDSPAN_CONSTANT_WRAPPER_WORKAROUND. * Make MDSPAN_CONSTANT_WRAPPER_WORKAROUND a CMake option. It's OFF by default. It only has an effect if the CMake option MDSPAN_ENABLE_P3663 is ON.
GCC 15.1.0 can now build with the following options. MDSPAN_ENABLE_P3663=ON MDSPAN_CONSTANT_WRAPPER_WORKAROUND=OFF CMAKE_CXX_FLAGS="-std=c++2c" MDSPAN_CXX_STANDARD=26
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Implement draft R2 of P3663 (Future-proof
submdspan_mapping
).Priorities, in decreasing order:
Feature is off by default. To enable it, set the CMake option
MDSPAN_ENABLE_P3663
toON
.There are two versions of the implementation. The default version requires C++20. The nondefault version requires the C++23 feature "deducing this," and optionally can make use of the C++26 feature "pack indexing" (P2662R3) if available. In order to support these features, I made some changes to mdspan's CMake logic to permit
-std=c++2c
with Clang 21.0.0 and GCC 15.1.0.I've tested the default version with Clang 14 and GCC 11.4.0, and the nondefault version with Clang 21.0.0 (development Clang) and GCC 15.1.0. To enable the default version, please use the following CMake options.
To enable the nondefault version, please use the following CMake options.