[Refactor]:Move device_assert from extern_call to intrin_call #1134

yyttt6 · 2025-10-27T11:45:02Z

Summary by CodeRabbit

New Features
- Device-side assertion support: two intrinsics (assert and assert-with-message) enable runtime checks inside kernels with optional messages.
- CUDA codegen emits these device assertions so failures are reported from device execution.
- TileLang printing now uses the new intrinsics to produce device-side asserts.

coderabbitai · 2025-10-27T11:45:13Z

Walkthrough

Adds two TL intrinsics (tl.device_assert, tl.device_assert_with_msg), registers them in the TL builtin registry, emits device-side assert/printf code in CUDA codegen, removes prior device-side utility implementations, and updates tilelang printing to call the new intrinsics.

Changes

Cohort / File(s)	Summary
TL Builtin Declarations `src/op/builtin.h`	Added declarations `TVM_DLL const Op &device_assert()` and `TVM_DLL const Op &device_assert_with_msg()` in `tvm::tl`.
TL Builtin Implementations & Registrations `src/op/builtin.cc`	Implemented and registered `tl.device_assert` (arity 1) and `tl.device_assert_with_msg` (arity 2) as opaque intrinsics in the TL builtin registry.
CUDA Codegen: Declaration `src/target/codegen_cuda.h`	Added private method declaration `void PrintDeviceAssert(const CallNode *call)` to `CodeGenTileLangCUDA`.
CUDA Codegen: Implementation `src/target/codegen_cuda.cc`	Added `PrintDeviceAssert(const CallNode*)` and integrated handling in the EvaluateNode visitor to emit optional printf and device `assert(cond)` for the new intrinsics, short-circuiting default CodeGenC emission.
Removed Device Utilities `src/tl_templates/cuda/debug.h`	Removed prior `TL_DEVICE void device_assert(bool)` and `TL_DEVICE void device_assert_with_msg(bool, const char*)` definitions.
TileLang Printing `tilelang/language/print.py`	Replaced `tir.call_extern(...)` usage with `T.call_intrin('tl.device_assert', ...)` and `T.call_intrin('tl.device_assert_with_msg', ...)` for assertion emission.

Sequence Diagram(s)

sequenceDiagram
    participant TLPrint as tilelang/print.py
    participant IR as TVM IR
    participant Codegen as CodeGenTileLangCUDA
    participant GPU as CUDA device

    alt no message
        TLPrint->>IR: T.call_intrin('tl.device_assert', cond)
    else with message
        TLPrint->>IR: T.call_intrin('tl.device_assert_with_msg', cond, msg)
    end

    IR->>IR: builtin ops registered (src/op/builtin.cc)
    IR->>Codegen: EvaluateNode visits CallNode
    Note over Codegen: matches tl.device_assert / tl.device_assert_with_msg
    Codegen->>Codegen: PrintDeviceAssert(call)  %%# (fill:#f7fbff)
    Codegen->>GPU: emit optional printf(msg) and assert(cond)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Files needing extra attention:
- src/target/codegen_cuda.cc — correctness and escaping of emitted CUDA code, message handling, and visitor short-circuit.
- src/op/builtin.cc — registration attributes (opaque effect, arity).
- tilelang/language/print.py — correct intrinsic call formation and argument ordering.

Possibly related PRs

[Feature]:Add device assert #1116 — Likely overlaps in implementing device-side assertions and related template/util changes.

Suggested reviewers

LeiWang1999

Poem

🐰 I hopped through ops and CUDA night,
Tiny asserts now spring to light.
From Python call to GPU's shout,
Messages printed, bugs called out—
A carrot-coded cheer, so bright! 🥕

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 60.00% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check	✅ Passed	The PR title "[Refactor]:Move device_assert from extern_call to intrin_call" accurately captures the main change across the changeset. The PR refactors device assertion functionality by converting from extern function calls to intrinsic operations, which is evidenced by: (1) adding new intrinsic definitions in builtin files, (2) removing old extern-based implementations from debug.h, (3) updating call sites in print.py to use intrinsics instead of extern calls, and (4) implementing CUDA codegen support for the new intrinsics. The title is concise, clear, and specific enough that a teammate reviewing history would immediately understand the refactoring's purpose, though it represents both device_assert and device_assert_with_msg which are handled as a cohesive change.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

github-actions · 2025-10-27T11:45:16Z

👋 Hi! Thank you for contributing to the TileLang project.

Please remember to run pre-commit run --all-files in the root directory of the project to ensure your changes are properly linted and formatted. This will help ensure your contribution passes the format check.

We appreciate you taking this step! Our team will review your contribution, and we look forward to your awesome work! 🚀

coderabbitai

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 307778e and 2f93c5a.

📒 Files selected for processing (6)

src/op/builtin.cc (1 hunks)
src/op/builtin.h (2 hunks)
src/target/codegen_cuda.cc (2 hunks)
src/target/codegen_cuda.h (1 hunks)
src/tl_templates/cuda/debug.h (0 hunks)
tilelang/language/print.py (2 hunks)

💤 Files with no reviewable changes (1)

src/tl_templates/cuda/debug.h

🚧 Files skipped from review as they are similar to previous changes (1)

src/target/codegen_cuda.h

🧰 Additional context used

🧬 Code graph analysis (3)

src/op/builtin.h (1)

tilelang/language/print.py (1)

device_assert (144-155)

tilelang/language/print.py (1)

tilelang/language/tir/op.py (1)

call_intrin (120-145)

src/op/builtin.cc (1)

tilelang/language/print.py (1)

device_assert (144-155)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: Test for Python 3.12 with Metal (on macos-latest)
GitHub Check: Test for Python 3.12 with ROCm-6.3 (on self-hosted-amd)

🔇 Additional comments (5)

tilelang/language/print.py (2)

8-8: LGTM: Import added for intrinsic call support.

The import of tilelang.language as T is necessary for the intrinsic-based device assertion calls.

151-155: LGTM: Correct migration to intrinsic-based assertions.

The switch from tir.call_extern to T.call_intrin with registered intrinsics (tl.device_assert and tl.device_assert_with_msg) is correct and aligns with the PR objective.

src/op/builtin.h (1)

517-529: LGTM: Intrinsic declarations are correct.

The declarations for device_assert() and device_assert_with_msg() follow the established pattern for TL builtin intrinsics. Documentation is clear and concise.

src/target/codegen_cuda.cc (1)

2313-2320: LGTM: Correct interception of device assert intrinsics.

The modification to VisitStmt_ correctly identifies device assertion intrinsics by name and routes them to the specialized PrintDeviceAssert handler, preventing them from being processed by the standard codegen path.

src/op/builtin.cc (1)

304-312: LGTM: Intrinsic registrations are correct.

Both device_assert and device_assert_with_msg are properly registered with the correct number of inputs (1 and 2 respectively) and appropriate kOpaque call effect kind, which is correct for operations with observable side effects.

coderabbitai · 2025-10-28T05:42:48Z

src/target/codegen_cuda.cc

+void CodeGenTileLangCUDA::PrintDeviceAssert(const CallNode *call) {
+  std::string cond = PrintExpr(call->args[0]);
+  if (call->args.size() == 2) {
+    if (auto str = call->args[1].as<StringImmNode>()) {
+      this->PrintIndent();
+      stream << "if (!(" << cond << ")) printf(\"" << str->value << "\\n\");\n";
+    }
+  }
+  this->PrintIndent();
+  stream << "assert(" << cond << ");\n";
+}


⚠️ Potential issue | 🟡 Minor

Verify printf format string safety.

The implementation correctly emits device-side assertions. However, on line 2292, the message is passed directly to printf without format specifiers. If the message contains % characters, this could cause undefined behavior or crashes.

Consider using a format specifier to safely print the message:

- stream << "if (!(" << cond << ")) printf(\"" << str->value << "\\n\");\n"; + stream << "if (!(" << cond << ")) printf(\"%s\\n\", \"" << str->value << "\");\n";

Or escape any % characters in the message string before emitting it.

🤖 Prompt for AI Agents

In src/target/codegen_cuda.cc around lines 2287-2297, the emitted device printf uses the message string directly which is unsafe if the message contains '%' characters; change the emission to use a format specifier (e.g. printf("%s\n", ...)) or escape any '%' characters in the StringImmNode value before embedding it so the runtime printf never interprets stray '%' sequences; update the branch that handles call->args[1] to either print with "%s\\n" and pass the literal as the second argument or preprocess the literal to replace each '%' with "%%" prior to emitting the printf call.

yyttt6 force-pushed the main branch from c61354b to 307778e Compare October 28, 2025 05:17

update

2f93c5a

yyttt6 force-pushed the main branch from 307778e to 2f93c5a Compare October 28, 2025 05:39

coderabbitai bot reviewed Oct 28, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

[Refactor]:Move device_assert from extern_call to intrin_call #1134

[Refactor]:Move device_assert from extern_call to intrin_call #1134

Uh oh!

yyttt6 commented Oct 27, 2025 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Oct 27, 2025 •

edited

Loading

Uh oh!

github-actions bot commented Oct 27, 2025

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Oct 28, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

[Refactor]:Move device_assert from extern_call to intrin_call #1134

Are you sure you want to change the base?

[Refactor]:Move device_assert from extern_call to intrin_call #1134

Uh oh!

Conversation

yyttt6 commented Oct 27, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Oct 27, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

Pre-merge checks and finishing touches

Uh oh!

github-actions bot commented Oct 27, 2025

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Oct 28, 2025

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

yyttt6 commented Oct 27, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Oct 27, 2025 •

edited

Loading