[Nvidia][SM121] Add intrin.h include to gemm_mma.h for sm120+ #785

HaoKang-Timmy · 2025-09-04T03:03:25Z

gemm_sm120.h is based on gemm_mma.h.
However, gemm_mma.h did not include intrin.h, which is required to compile TileLang code for sm120 hardware.

Summary by CodeRabbit

Chores
- Updated CUDA GEMM template to include required intrinsics header for improved build compatibility across environments. No changes to functionality, performance, or public APIs.
- Enhances maintainability and reduces potential compiler warnings, with no impact on user-facing behavior or configuration.

To make sm120 arch runnable.

github-actions · 2025-09-04T03:03:34Z

👋 Hi! Thank you for contributing to the TileLang project.

Please remember to run bash format.sh in the root directory of the project to ensure your changes are properly linted and formatted. This will help ensure your contribution passes the format check.

We appreciate you taking this step! Our team will review your contribution, and we look forward to your awesome work!

🚀

gemini-code-assist

Summary of Changes

Hello @HaoKang-Timmy, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical dependency by incorporating a missing header file into an existing template. This ensures that downstream components, specifically those targeting sm120 hardware, can compile correctly without errors related to undeclared intrinsics.

Highlights

Missing Header Inclusion: The pull request resolves a compilation issue for tilelang code targeting sm120 hardware by adding the necessary #include "intrin.h" to gemm_mma.h. This is crucial because gemm_sm120.h relies on gemm_mma.h, and intrin.h provides essential intrinsics for sm120 compilation.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

coderabbitai · 2025-09-04T03:03:46Z

Walkthrough

Added a single include directive for "intrin.h" in src/tl_templates/cuda/gemm_mma.h, placed after the existing include of "cuda_fp8.h". No other code, API, or control-flow changes.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Header include update `src/tl_templates/cuda/gemm_mma.h` | Inserted `#include "intrin.h"` following `#include "cuda_fp8.h"`. No other modifications. |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

Possibly related PRs

[Bugfix] Add missing FP8 header include #752 — Introduces intrin.h (and FP8 support headers), directly corresponding to the new include added in gemm_mma.h.

Poem

A nibble of code, a header to bring,
I twitch my nose at the simplest thing—
“intrin.h,” hop in the line,
Next to FP8, looking fine.
With gentle paws, I merge and grin,
Small includes make big wins. 🐇✨

gemini-code-assist

Code Review

This pull request adds the intrin.h header to gemm_mma.h to support compilation for sm120 hardware. The change is functionally correct and safe, as the contents of intrin.h are guarded by a preprocessor check for newer architectures. I have added one comment with a suggestion to potentially improve modularity by including the header in a more specific file, which could enhance long-term maintainability.

gemini-code-assist · 2025-09-04T03:04:34Z

src/tl_templates/cuda/gemm_mma.h


 #include "common.h"
 #include "cuda_fp8.h"
+#include "intrin.h"


For better modularity, consider including intrin.h directly in the files that require it (e.g., gemm_sm120.h, as mentioned in the PR description) rather than in this more general header. This approach helps to minimize header dependencies and makes it explicit which components rely on the SM90+ intrinsics. If gemm_mma.h itself doesn't use any symbols from intrin.h, moving this include would lead to a cleaner design.

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

src/tl_templates/cuda/gemm_mma.h (1)
12-12: Include LGTM; consider guarding for SM120 and avoiding Windows intrin.h collision

The addition unblocks SM120 builds. Two optional tweaks:

- Gate it to SM120 to reduce surface area on other arches.
- Ensure we always include the local header (not MSVC’s intrin.h on Windows) by guarding or making the path explicit.

-#include "intrin.h"
+#if defined(__CUDA_ARCH_LIST__) && (__CUDA_ARCH_LIST__ >= 1200)
+#  include "intrin.h"  // SM120 intrinsics
+#endif
Please confirm Windows CI resolves this to the project’s intrin.h (and not the toolchain one). If collision is observed, switch to an explicit relative path.
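
A caveat on the guard suggested above: according to the nvcc documentation, `__CUDA_ARCH_LIST__` expands to a comma-separated list of the architectures being compiled for (for example `800,900,1200`). Comparing it against a single number in `#if` may therefore not behave as intended once more than one architecture is targeted. Below is a minimal sketch, in plain C++14, of how such a list could be checked. `TL_ARCH_LIST`, `any_arch_at_least`, `kTargetsSm120`, and the fallback values are hypothetical names and numbers chosen for illustration. They are not part of TileLang or of this PR.

```cpp
#include <initializer_list>

// TL_ARCH_LIST is a hypothetical wrapper macro, not part of TileLang.
// Under nvcc it forwards to __CUDA_ARCH_LIST__; the fallback list below is
// an assumed example so the sketch also builds with a plain host compiler.
#if defined(__CUDA_ARCH_LIST__)
#define TL_ARCH_LIST __CUDA_ARCH_LIST__
#else
#define TL_ARCH_LIST 800, 900, 1200
#endif

// Returns true if any entry in the list is at least min_arch.
constexpr bool any_arch_at_least(std::initializer_list<int> archs,
                                 int min_arch) {
  for (int arch : archs) {
    if (arch >= min_arch) {
      return true;
    }
  }
  return false;
}

// True when at least one compilation target is sm_120 or newer.
constexpr bool kTargetsSm120 = any_arch_at_least({TL_ARCH_LIST}, 1200);
```

Because `kTargetsSm120` is a constexpr value rather than a preprocessor condition, it can feed a `static_assert` or select a template specialization, but it cannot wrap an `#include`. Since the Gemini review notes that the contents of intrin.h are already guarded by a preprocessor check, keeping the include unconditional, as merged, may well be the simpler option.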

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 141e01f and 70694f1.

📒 Files selected for processing (1)

src/tl_templates/cuda/gemm_mma.h (1 hunks)

Add intrin.h include to gemm_mma.h

70694f1

To make sm120 arch runnable.

gemini-code-assist bot reviewed Sep 4, 2025

coderabbitai bot reviewed Sep 4, 2025

LeiWang1999 changed the title ~~Add intrin.h include to gemm_mma.h~~ [Nvidia][SM121] Add intrin.h include to gemm_mma.h for sm120+ Sep 4, 2025

LeiWang1999 merged commit 6e0c350 into tile-ai:main Sep 4, 2025
7 of 8 checks passed
