Add container and tuple optimization helpers #4290

Draft
assistant-librarian[bot] wants to merge 4 commits into develop from import/develop/ROCm_composable_kernel/pr-3590

assistant-librarian bot commented Feb 3, 2026

Summary

  • Replace lambdas with named functors in container_concat
  • Add make_uniform_tuple helper for repeated value patterns
  • Add container_product helper with O(1) depth fold expression
  • Reduce container_concat instantiations from 186 to 93 (a 50% reduction)
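
To illustrate the "O(1) depth fold expression" idea behind container_product, here is a minimal sketch. It assumes std::tuple as the container and uses the hypothetical names `sketch::container_product` and `sketch::container_product_impl`; the actual helper in this PR operates on the library's own container types and may differ in signature.

```cpp
#include <cstddef>
#include <tuple>
#include <utility>

namespace sketch { // hypothetical namespace, not part of composable_kernel

// Multiplies all elements with a single fold expression. The pack is expanded
// in one step, so no chain of recursive template instantiations is needed.
template <typename Tuple, std::size_t... Is>
constexpr auto container_product_impl(const Tuple& t, std::index_sequence<Is...>)
{
    // Binary right fold with 1 as the initial value (an empty tuple yields 1).
    return (std::get<Is>(t) * ... * 1);
}

template <typename... Ts>
constexpr auto container_product(const std::tuple<Ts...>& t)
{
    return container_product_impl(t, std::index_sequence_for<Ts...>{});
}

} // namespace sketch
```

A recursive formulation would instantiate one template per element, whereas the fold above needs one instantiation of each function per tuple type.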

Why It Works

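Every lambda expression has its own unique closure type, so the lambdas in container_concat created a distinct type at each call site, and each distinct type triggers its own template instantiations. The named struct make_tuple_functor has a single type shared across all uses, which halves the instantiation count (186 to 93).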
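
The effect can be sketched in standard C++17. The functor below is a hypothetical stand-in built on std::tuple, not the library's actual make_tuple_functor, and `apply_to_tuple` is an illustrative name for any template parameterized on the callable type.

```cpp
#include <tuple>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for a named functor: one type, reusable everywhere.
struct make_tuple_functor
{
    template <typename... Ts>
    constexpr auto operator()(Ts&&... xs) const
    {
        return std::make_tuple(std::forward<Ts>(xs)...);
    }
};

// Two textually identical lambdas still have two distinct closure types.
inline constexpr auto lambda_a = [](auto&&... xs) { return std::make_tuple(xs...); };
inline constexpr auto lambda_b = [](auto&&... xs) { return std::make_tuple(xs...); };

static_assert(!std::is_same_v<decltype(lambda_a), decltype(lambda_b)>,
              "each lambda expression yields a unique type");

// Illustrative template parameterized on the callable type F. It is
// instantiated once per distinct F: once for make_tuple_functor no matter how
// many call sites use it, but once per lambda expression otherwise.
template <typename F, typename... Ts>
constexpr auto apply_to_tuple(F f, const std::tuple<Ts...>& t)
{
    return std::apply(f, t);
}
```

Passing `make_tuple_functor{}` from many call sites reuses the same `apply_to_tuple` instantiation for a given tuple type, while passing `lambda_a` and `lambda_b` produces two.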

The make_uniform_tuple helper eliminates repeated lambda instantiations for creating tuples with the same value repeated N times.
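
A rough sketch of how such a helper can be written with a pack expansion instead of a per-call-site lambda follows. It assumes std::tuple and std::index_sequence; the names `sketch::make_uniform_tuple` and `sketch::make_uniform_tuple_impl` and their signatures are hypothetical and may not match the helper added in this PR.

```cpp
#include <cstddef>
#include <tuple>
#include <utility>

namespace sketch { // hypothetical namespace, not part of composable_kernel

// Expands the index pack and discards each index, so that `value` is
// emitted once per index.
template <typename T, std::size_t... Is>
constexpr auto make_uniform_tuple_impl(const T& value, std::index_sequence<Is...>)
{
    // (static_cast<void>(Is), value) evaluates to `value` for every index.
    return std::make_tuple((static_cast<void>(Is), value)...);
}

// Builds a tuple holding N copies of `value`.
template <std::size_t N, typename T>
constexpr auto make_uniform_tuple(const T& value)
{
    return make_uniform_tuple_impl(value, std::make_index_sequence<N>{});
}

} // namespace sketch
```

For example, `sketch::make_uniform_tuple<3>(7)` yields a `std::tuple<int, int, int>` with every element equal to 7, and no lambda type is created at the call site.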

Test Plan

  • Added 12 unit tests for container_concat and make_uniform_tuple helpers
  • Waiting for full CI

PR Stack

This PR is part of the build time optimization effort (issue #4229). All PRs now target develop independently:

| # | PR | Description | Status |
|---|----|-------------|--------|
| 1 | ROCm/composable_kernel#3585 | sequence_gen with __make_integer_seq | Independent |
| 2 | #4283 | generate_identity_sequences + named functors | New (replaces ROCm/composable_kernel#3588, ROCm/composable_kernel#3589) |
| 3 | #4290 | container_concat optimization | This PR |
| 4 | #4288 | O(1) pack expansion rewrites | Independent |
| 5 | #4287 | TensorDescriptor/TensorAdaptor lambda elimination | Independent |

Tracking issue: #4229


🔁 Imported from ROCm/composable_kernel#3590
🧑‍💻 Originally authored by @tenpercent

tenpercent and others added 4 commits January 22, 2026 01:47
- Replace lambdas with named functors in container_concat
- Add make_uniform_tuple helper for repeated value patterns
- Add container_product helper with O(1) depth fold expression
- Add merge_sequences_functor and unpack_and_merge_sequences
- Add 16 unit tests for container helpers

Co-Authored-By: Claude <noreply@anthropic.com>
Detailed comments explain:
- Why named functors reduce instantiations vs lambdas in container_concat
- Impact: 50% reduction in container_concat (186 → 93 instantiations)
- make_uniform_tuple optimization using pack expansion instead of lambda
- generate_identity_sequences optimization for identity permutations
- When to apply these patterns elsewhere

This documentation helps maintainers understand the build-time optimization
strategies and prevents reverting to less efficient patterns.