@LeiWang1999 LeiWang1999 commented Oct 27, 2025

This pull request updates the argument handling logic in the AtomicAddVectorizeRewriter class to correctly construct the argument list for atomic add operations based on the vector size. The main change ensures that the correct arguments are passed for vectorized and non-vectorized atomic add calls.

Atomic add argument handling improvements:

  • The construction of new_args now pushes address_of_dst and address_of_value only for vector sizes 2 and 4, while for the scalar case, it pushes dst_node and value_node directly. This ensures that the argument list matches the expected function signature for each case.
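As a rough illustration of the branch logic described above, the sketch below mimics the argument construction with plain strings standing in for TIR expressions. `Expr`, `AddressOf`, and `BuildAtomicAddArgs` are hypothetical names used only for this sketch and are not taken from the TileLang source; the memory-order argument is left out for brevity.

```cpp
#include <string>
#include <vector>

// Stand-in for a TIR expression. The real pass operates on TVM nodes,
// so this alias is an assumption made purely for illustration.
using Expr = std::string;

// Hypothetical helper: renders an "address_of" wrapper around an expression.
Expr AddressOf(const Expr& e) { return "address_of(" + e + ")"; }

// Hypothetical helper: builds the argument list for one atomic add call.
// Vector sizes 4 and 2 receive addresses; any other size takes the scalar path.
std::vector<Expr> BuildAtomicAddArgs(int vector_size, const Expr& dst_node,
                                     const Expr& value_node) {
  std::vector<Expr> new_args;
  if (vector_size == 4) {
    new_args.push_back(Expr("AtomicAddx4"));
    new_args.push_back(AddressOf(dst_node));
    new_args.push_back(AddressOf(value_node));
  } else if (vector_size == 2) {
    new_args.push_back(Expr("AtomicAddx2"));
    new_args.push_back(AddressOf(dst_node));
    new_args.push_back(AddressOf(value_node));
  } else {
    // Scalar case: pass the original nodes, not their addresses.
    new_args.push_back(Expr("AtomicAdd"));
    new_args.push_back(dst_node);
    new_args.push_back(value_node);
  }
  return new_args;
}
```

Each branch emits exactly one function name, one destination operand, and one value operand, so no branch can end up with a doubled operand pair.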

Summary by CodeRabbit

  • Bug Fixes
    • Fixed atomic addition vectorization to prevent duplicate argument emissions during compilation, improving code generation accuracy.

@github-actions

👋 Hi! Thank you for contributing to the TileLang project.

Please remember to run pre-commit run --all-files in the root directory of the project to ensure your changes are properly linted and formatted. This will help ensure your contribution passes the format check.

We appreciate you taking this step! Our team will review your contribution, and we look forward to your awesome work! 🚀

@coderabbitai

coderabbitai bot commented Oct 27, 2025

Walkthrough

Modified the atomic add vectorization rewriter to push the destination and value address arguments inside each vectorized branch (vector sizes 4 and 2) and to use the original nodes on the scalar path. Previously these arguments were added generically after the branching, which caused duplicate argument emissions.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Atomic Add Vectorization Rewriter: src/transform/atomicadd_vectorize.cc | Refactored argument construction for atomic add vectorization to inline address_of_dst/address_of_value operands per vector-size branch (4-wide, 2-wide, scalar) instead of adding them generically after branching, preventing duplicate argument emissions |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Focus areas: Verify per-branch argument inlining logic for each vectorization path (4, 2, scalar) matches intended semantics; confirm original node usage for scalar path is correct; ensure no regressions in argument ordering or operand selection

Poem

🐰 Branching paths now clearly marked,
Arguments nested, not scattered or harked,
Four lanes, two lanes, one scalar way—
No duplicates hop 'round today!

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title Check | ✅ Passed | The pull request title "[Bugfix] Correctly construct the argument list for atomic add based on the vector size" directly and accurately describes the main change in the changeset. According to both the raw summary and PR objectives, the core change involves fixing argument handling logic in the AtomicAddVectorizeRewriter to ensure different argument lists are used depending on vector size (addressing different paths for vectorized sizes 2 and 4 versus the scalar case). The title is clear, specific, and concise: it identifies what is being fixed (argument list construction) and the key factor that drives the fix (vector size), allowing a teammate to understand the primary purpose of the changes at a glance. |

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
src/transform/atomicadd_vectorize.cc (1)

239-251: Consider reducing code duplication between vector size branches.

The vector_size_ == 4 and vector_size_ == 2 branches are nearly identical except for the function name string. You could refactor to reduce duplication:

+      const char* func_name = nullptr;
+      bool use_addresses = false;
+      
       if (vector_size_ == 4) {
-        new_args.push_back(StringImm("AtomicAddx4"));
-        new_args.push_back(address_of_dst);
-        new_args.push_back(address_of_value);
+        func_name = "AtomicAddx4";
+        use_addresses = true;
       } else if (vector_size_ == 2) {
-        new_args.push_back(StringImm("AtomicAddx2"));
-        new_args.push_back(address_of_dst);
-        new_args.push_back(address_of_value);
+        func_name = "AtomicAddx2";
+        use_addresses = true;
       } else {
-        new_args.push_back(StringImm("AtomicAdd"));
-        new_args.push_back(dst_node);
-        new_args.push_back(value_node);
+        func_name = "AtomicAdd";
+        use_addresses = false;
       }
+
+      new_args.push_back(StringImm(func_name));
+      if (use_addresses) {
+        new_args.push_back(address_of_dst);
+        new_args.push_back(address_of_value);
+      } else {
+        new_args.push_back(dst_node);
+        new_args.push_back(value_node);
+      }
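
For readers who want to try the suggested refactor outside the compiler, here is a minimal standalone sketch of the same idea: select the function name and an address flag first, then emit the operands in one place. Strings stand in for TIR expressions, and `Expr`, `AtomicAddVariant`, `SelectVariant`, and `BuildArgs` are hypothetical names introduced for this sketch, not TileLang APIs.

```cpp
#include <string>
#include <vector>

// Stand-in for a TIR expression (assumption for illustration only).
using Expr = std::string;

// Hypothetical record of the two facts that differ between branches.
struct AtomicAddVariant {
  const char* func_name;
  bool use_addresses;
};

// Hypothetical helper: maps a vector size to the call variant.
AtomicAddVariant SelectVariant(int vector_size) {
  if (vector_size == 4) return {"AtomicAddx4", true};
  if (vector_size == 2) return {"AtomicAddx2", true};
  return {"AtomicAdd", false};
}

// Hypothetical helper: emits the name once, then exactly one operand pair.
std::vector<Expr> BuildArgs(int vector_size, const Expr& dst_node,
                            const Expr& value_node,
                            const Expr& address_of_dst,
                            const Expr& address_of_value) {
  const AtomicAddVariant variant = SelectVariant(vector_size);
  std::vector<Expr> new_args;
  new_args.push_back(Expr(variant.func_name));
  if (variant.use_addresses) {
    new_args.push_back(address_of_dst);
    new_args.push_back(address_of_value);
  } else {
    new_args.push_back(dst_node);
    new_args.push_back(value_node);
  }
  return new_args;
}
```

Keeping the operand emission in a single place means a future vector width would only need a new entry in `SelectVariant`, at the cost of one extra level of indirection compared with the merged three-branch form.
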
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 95e7bc3 and 7e7eea0.

📒 Files selected for processing (1)
  • src/transform/atomicadd_vectorize.cc (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/transform/atomicadd_vectorize.cc (1)
tilelang/language/tir/op.py (1)
  • address_of (464-480)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Test for Python 3.12 with Metal (on macos-latest)
  • GitHub Check: Test for Python 3.12 with ROCm-6.3 (on self-hosted-amd)
🔇 Additional comments (1)
src/transform/atomicadd_vectorize.cc (1)

234-253: LGTM! Argument construction now correctly distinguishes vectorized and scalar cases.

The fix properly ensures that:

  • Vectorized atomic adds (size 4 and 2) receive address pointers via address_of()
  • Scalar atomic adds (else branch) receive the BufferLoad nodes directly
  • The memory_order argument is appended consistently for all cases

This aligns perfectly with the PR objectives and resolves the argument list mismatch issue.
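
To see why the two paths take differently shaped operands, consider the following host-side analogy. It is only a sketch under stated assumptions: `AtomicAddScalar` and `AtomicAddVec` are hypothetical functions written with `std::atomic`, not the device functions that TileLang generates calls to, and the lane-by-lane loop here is not a single wide atomic instruction.

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical scalar form: takes the destination element and the value itself.
inline void AtomicAddScalar(std::atomic<float>& dst, float value,
                            std::memory_order order = std::memory_order_relaxed) {
  float expected = dst.load(std::memory_order_relaxed);
  // Retry until no other thread modified dst between the load and the store.
  // On failure, compare_exchange_weak refreshes `expected` with the current value.
  while (!dst.compare_exchange_weak(expected, expected + value, order,
                                    std::memory_order_relaxed)) {
  }
}

// Hypothetical vector form: takes base addresses, because it must reach
// N consecutive lanes starting at dst and value.
template <std::size_t N>
inline void AtomicAddVec(std::atomic<float>* dst, const float* value,
                         std::memory_order order = std::memory_order_relaxed) {
  for (std::size_t i = 0; i < N; ++i) {
    AtomicAddScalar(dst[i], value[i], order);
  }
}
```

The vector form cannot be handed a single loaded element because it needs the starting address of the whole lane group, while the scalar form only needs one element and one value. In both forms the memory order is passed as the trailing argument.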

@LeiWang1999 LeiWang1999 merged commit 7d389a4 into tile-ai:main Oct 27, 2025
6 of 7 checks passed