Skip to content

Conversation

@kazutakahirata
Copy link
Contributor

combine() combines hash values with recursion on variadic parameters.

This patch replaces the recursion with a C++17 fold expression:

(combine_data(length, buffer_ptr, buffer_end, get_hashable_data(args)),
...);

which expands to:

combine_data(length, buffer_ptr, buffer_end, get_hashable_data(a));
combine_data(length, buffer_ptr, buffer_end, get_hashable_data(b));
combine_data(length, buffer_ptr, buffer_end, get_hashable_data(c));
:

A key benefit of this change is the unification of the recursive step
and the base case. The argument processing and finalization logic now
exist as straight-line code within a single function.

combine_data now takes buffer_ptr by reference. This is necessary
because the previous assignment pattern:

buffer_ptr = combine_data(...)

is syntactically incompatible with a fold expression. The new pattern:

(combine_data(...), ...)

discards return values, so combine_data must update buffer_ptr
directly.

For readability, this patch does the bare minimum to use a fold
expression, leaving further cleanups to subsequent patches. For
example, buffer_ptr and buffer_end could become member variables, and
several comments that mention recursion still need updating.

combine() combines hash values with recursion on variadic parameters.

This patch replaces the recursion with a C++17 fold expression:

  (combine_data(length, buffer_ptr, buffer_end, get_hashable_data(args)),
   ...);

which expands to:

  combine_data(length, buffer_ptr, buffer_end, get_hashable_data(a));
  combine_data(length, buffer_ptr, buffer_end, get_hashable_data(b));
  combine_data(length, buffer_ptr, buffer_end, get_hashable_data(c));
  :

A key benefit of this change is the unification of the recursive step
and the base case. The argument processing and finalization logic now
exist as straight-line code within a single function.

combine_data now takes buffer_ptr by reference. This is necessary
because the previous assignment pattern:

  buffer_ptr = combine_data(...)

is syntactically incompatible with a fold expression. The new pattern:

  (combine_data(...), ...)

discards return values, so combine_data must update buffer_ptr
directly.

For readability, this patch does the bare minimum to use a fold
expression, leaving further cleanups to subsequent patches.  For
example, buffer_ptr and buffer_end could become member variables, and
several comments that mention recursion still need updating.
@llvmbot
Copy link
Member

llvmbot commented Sep 20, 2025

@llvm/pr-subscribers-llvm-adt

Author: Kazu Hirata (kazutakahirata)

Changes

combine() combines hash values with recursion on variadic parameters.

This patch replaces the recursion with a C++17 fold expression:

(combine_data(length, buffer_ptr, buffer_end, get_hashable_data(args)),
...);

which expands to:

combine_data(length, buffer_ptr, buffer_end, get_hashable_data(a));
combine_data(length, buffer_ptr, buffer_end, get_hashable_data(b));
combine_data(length, buffer_ptr, buffer_end, get_hashable_data(c));
:

A key benefit of this change is the unification of the recursive step
and the base case. The argument processing and finalization logic now
exist as straight-line code within a single function.

combine_data now takes buffer_ptr by reference. This is necessary
because the previous assignment pattern:

buffer_ptr = combine_data(...)

is syntactically incompatible with a fold expression. The new pattern:

(combine_data(...), ...)

discards return values, so combine_data must update buffer_ptr
directly.

For readability, this patch does the bare minimum to use a fold
expression, leaving further cleanups to subsequent patches. For
example, buffer_ptr and buffer_end could become member variables, and
several comments that mention recursion still need updating.


Full diff: https://github.com/llvm/llvm-project/pull/159901.diff

1 Files Affected:

  • (modified) llvm/include/llvm/ADT/Hashing.h (+8-14)
diff --git a/llvm/include/llvm/ADT/Hashing.h b/llvm/include/llvm/ADT/Hashing.h
index 41a730e24a6b1..ffe231249a5c2 100644
--- a/llvm/include/llvm/ADT/Hashing.h
+++ b/llvm/include/llvm/ADT/Hashing.h
@@ -503,7 +503,8 @@ struct hash_combine_recursive_helper {
   /// hash_state, empties it, and then merges the new chunk in. This also
   /// handles cases where the data straddles the end of the buffer.
   template <typename T>
-  char *combine_data(size_t &length, char *buffer_ptr, char *buffer_end, T data) {
+  char *combine_data(size_t &length, char *&buffer_ptr, char *buffer_end,
+                     T data) {
     if (!store_and_advance(buffer_ptr, buffer_end, data)) {
       // Check for skew which prevents the buffer from being packed, and do
       // a partial store into the buffer to fill it. This is only a concern
@@ -541,21 +542,14 @@ struct hash_combine_recursive_helper {
   ///
   /// This function recurses through each argument, combining that argument
   /// into a single hash.
-  template <typename T, typename ...Ts>
+  template <typename... Ts>
   hash_code combine(size_t length, char *buffer_ptr, char *buffer_end,
-                    const T &arg, const Ts &...args) {
-    buffer_ptr = combine_data(length, buffer_ptr, buffer_end, get_hashable_data(arg));
+                    const Ts &...args) {
+    (combine_data(length, buffer_ptr, buffer_end, get_hashable_data(args)),
+     ...);
 
-    // Recurse to the next argument.
-    return combine(length, buffer_ptr, buffer_end, args...);
-  }
-
-  /// Base case for recursive, variadic combining.
-  ///
-  /// The base case when combining arguments recursively is reached when all
-  /// arguments have been handled. It flushes the remaining buffer and
-  /// constructs a hash_code.
-  hash_code combine(size_t length, char *buffer_ptr, char *buffer_end) {
+    // Finalize the hash by flushing any remaining data in the buffer.
+    //
     // Check whether the entire set of values fit in the buffer. If so, we'll
     // use the optimized short hashing routine and skip state entirely.
     if (length == 0)

Copy link
Member

@kuhar kuhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't read the PR description but the code looks good

@kazutakahirata kazutakahirata merged commit b3c7d25 into llvm:main Sep 20, 2025
9 checks passed
@kazutakahirata kazutakahirata deleted the cleanup_20250919_10TMP_ADT_hash_combiner branch September 20, 2025 16:51
kazutakahirata added a commit to kazutakahirata/llvm-project that referenced this pull request Sep 20, 2025
We just started using a C++17 fold expression inside combine
(llvm#159901).  We still carry the style that was suitable for recursive
calls of combine_data.  Specifically, we keep passing several state
variables as parameters of combine_data.

Now that we no longer use recursion, this patch simplifies
hash_combine_recursive_helper by making buffer_ptr, buffer_end, and
length member variables while dropping the return value from
hash_combine.  This patch also names hash_combine_recursive_helper to
hash_combine_helper.

I will follow up to update comments that still mention "recursion".
@nikic
Copy link
Contributor

nikic commented Sep 20, 2025

This appears to have a large negative effect when building with clang: https://llvm-compile-time-tracker.com/compare.php?from=507f394d03d8cb24c90ae4b2d28e8afae3b4088d&to=b3c7d25422355949227a12ded7515f5114db5bda&stat=instructions%3Au (Maybe related to making buffer_ptr a reference?)

kazutakahirata added a commit that referenced this pull request Sep 21, 2025
We just started using a C++17 fold expression inside combine
(#159901).  We still carry the style that was suitable for recursive
calls of combine_data.  Specifically, we keep passing several state
variables as parameters of combine_data.

Now that we no longer use recursion, this patch simplifies
hash_combine_recursive_helper by making buffer_ptr, buffer_end, and
length member variables while dropping the return value from
hash_combine.  This patch also names hash_combine_recursive_helper to
hash_combine_helper.

I will follow up to update comments that still mention "recursion".
kazutakahirata added a commit to kazutakahirata/llvm-project that referenced this pull request Sep 21, 2025
Now that we have removed recursion in llvm#159901 and llvm#159938, this patch
updates those comments that still mention recursion.
nikic added a commit that referenced this pull request Sep 21, 2025
)"

This has a negative impact on compile times.

This reverts commit a36a1ec.
This reverts commit b3c7d25.
@nikic
Copy link
Contributor

nikic commented Sep 21, 2025

Reverted in 5466211.

@kazutakahirata Can you please address review feedback before you layer more patches on top?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants