Skip to content

Conversation

@HaoKang-Timmy
Copy link
Contributor

@HaoKang-Timmy HaoKang-Timmy commented Sep 4, 2025

gemm_sm120.h is based on gemm_mma.h.
However, gemm_mma.h did not include intrin.h, which is essential to compile tilelang code on sm120 hardware.

Summary by CodeRabbit

  • Chores
    • Updated CUDA GEMM template to include required intrinsics header for improved build compatibility across environments. No changes to functionality, performance, or public APIs.
    • Enhances maintainability and reduces potential compiler warnings, with no impact on user-facing behavior or configuration.

To make sm120 arch runnable.
@github-actions
Copy link

github-actions bot commented Sep 4, 2025

👋 Hi! Thank you for contributing to the TileLang project.

Please remember to run bash format.sh in the root directory of the project to ensure your changes are properly linted and formatted. This will help ensure your contribution passes the format check.

We appreciate you taking this step! Our team will review your contribution, and we look forward to your awesome work!

🚀

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary of Changes

Hello @HaoKang-Timmy, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical dependency by incorporating a missing header file into an existing template. This ensures that downstream components, specifically those targeting sm120 hardware, can compile correctly without errors related to undeclared intrinsics.

Highlights

  • Missing Header Inclusion: The pull request resolves a compilation issue for tilelang code targeting sm120 hardware by adding the necessary #include "intrin.h" to gemm_mma.h. This is crucial because gemm_sm120.h relies on gemm_mma.h, and intrin.h provides essential intrinsics for sm120 compilation.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in issue comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Sep 4, 2025

Walkthrough

Added a single include directive for "intrin.h" in src/tl_templates/cuda/gemm_mma.h, placed after the existing include of "cuda_fp8.h". No other code, API, or control-flow changes.

Changes

Cohort / File(s) Summary
Header include update
src/tl_templates/cuda/gemm_mma.h
Inserted #include "intrin.h" following #include "cuda_fp8.h". No other modifications.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

Possibly related PRs

Poem

A nibble of code, a header to bring,
I twitch my nose at the simplest thing—
“intrin.h,” hop in the line,
Next to FP8, looking fine.
With gentle paws, I merge and grin,
Small includes make big wins. 🐇✨

✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

  • Add @coderabbitai ignore or @coderabbit ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

  • Visit our Status Page to check the current availability of CodeRabbit.
  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request adds the intrin.h header to gemm_mma.h to support compilation for sm120 hardware. The change is functionally correct and safe, as the contents of intrin.h are guarded by a preprocessor check for newer architectures. I have added one comment with a suggestion to potentially improve modularity by including the header in a more specific file, which could enhance long-term maintainability.


#include "common.h"
#include "cuda_fp8.h"
#include "intrin.h"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

For better modularity, consider including intrin.h directly in the files that require it (e.g., gemm_sm120.h, as mentioned in the PR description) rather than in this more general header. This approach helps to minimize header dependencies and makes it explicit which components rely on the SM90+ intrinsics. If gemm_mma.h itself doesn't use any symbols from intrin.h, moving this include would lead to a cleaner design.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
src/tl_templates/cuda/gemm_mma.h (1)

12-12: Include LGTM; consider guarding for SM120 and avoiding Windows intrin.h collision

The addition unblocks SM120 builds. Two optional tweaks:

  • Gate it to SM120 to reduce surface area on other arches.
  • Ensure we always include the local header (not MSVC’s intrin.h on Windows) by guarding or making the path explicit.
-#include "intrin.h"
+#if defined(__CUDA_ARCH_LIST__) && (__CUDA_ARCH_LIST__ >= 1200)
+#  include "intrin.h"  // SM120 intrinsics
+#endif

Please confirm Windows CI resolves this to the project’s intrin.h (and not the toolchain one). If collision is observed, switch to an explicit relative path.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 141e01f and 70694f1.

📒 Files selected for processing (1)
  • src/tl_templates/cuda/gemm_mma.h (1 hunks)

@LeiWang1999 LeiWang1999 changed the title Add intrin.h include to gemm_mma.h [Nvidia][SM121] Add intrin.h include to gemm_mma.h for sm120+ Sep 4, 2025
@LeiWang1999 LeiWang1999 merged commit 6e0c350 into tile-ai:main Sep 4, 2025
7 of 8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants