3 changes: 3 additions & 0 deletions .pre-commit-config.yaml
@@ -28,6 +28,9 @@ repos:
       - id: check-ast
         fail_fast: true
       - id: debug-statements
+      - id: file-contents-sorter
+        args: [--ignore-case]
+        files: ^docs/spelling_wordlist\.txt$
   - repo: https://github.com/pre-commit/mirrors-clang-format
     rev: v15.0.7 # sync with requirements-lint.txt
     hooks:
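
The hook added in this hunk keeps `docs/spelling_wordlist.txt` sorted without regard to case. As a rough illustration of what that ordering means, the eight words this PR adds to the file can be checked in plain Python. This is only a sketch, not the hook's implementation, and the helper name `is_sorted_ignoring_case` is hypothetical.

```python
# Words added to docs/spelling_wordlist.txt in this PR, in file order.
WORDS = ["cancelled", "hsa", "ist", "LOD", "nd", "NotIn", "offen", "te"]


def is_sorted_ignoring_case(lines):
    """Return True if `lines` are in ascending order when compared case-insensitively."""
    lowered = [line.lower() for line in lines]
    return lowered == sorted(lowered)


if __name__ == "__main__":
    print(is_sorted_ignoring_case(WORDS))  # True
    # A plain case-sensitive sort would place the capitalized entries first.
    print(sorted(WORDS)[:2])  # ['LOD', 'NotIn']
```

Without `--ignore-case`, entries such as `LOD` and `NotIn` would have to move to the top of the file, because uppercase letters compare lower than lowercase ones.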
2 changes: 1 addition & 1 deletion README.md
@@ -242,6 +242,6 @@ Welcome to join our Discord community for discussions, support, and collaboratio

 [![Join our Discord](https://img.shields.io/badge/Discord-Join%20Us-blue?logo=discord&style=for-the-badge)](https://discord.gg/TUrHyJnKPG)

-## Acknowledgements
+## Acknowledgments

 We would like to express our gratitude to the [TVM](https://github.com/apache/tvm) community for their invaluable contributions. The initial version of this project was mainly developed by [LeiWang1999](https://github.com/LeiWang1999), [chengyupku](https://github.com/chengyupku) and [nox-410](https://github.com/nox-410) with supervision from Prof. [Zhi Yang](https://yangzhihome.github.io) at Peking University. Part of this work was carried out during an internship at Microsoft Research, where Dr. Lingxiao Ma, Dr. Yuqing Xia, Dr. Jilong Xue, and Dr. Fan Yang offered valuable advice and support. We deeply appreciate their mentorship and contributions.
2 changes: 1 addition & 1 deletion docs/compiler_internals/inject_fence_proxy.md
@@ -4,7 +4,7 @@

 ## Why Fences Are Needed

-Hopper separates memory instructions into generic and asynchronous proxy paths. When an asynchronous instruction (for example, `cp.async` or `tma.load`) issues after generic traffic (like `ldmatrix` or plain buffer stores), the hardware requires a `fence.proxy.async` to guarantee ordering. Missing fences can lead to race conditions or undefined behaviour.
+Hopper separates memory instructions into generic and asynchronous proxy paths. When an asynchronous instruction (for example, `cp.async` or `tma.load`) issues after generic traffic (like `ldmatrix` or plain buffer stores), the hardware requires a `fence.proxy.async` to guarantee ordering. Missing fences can lead to race conditions or undefined behavior.

 ## What the Pass Does

8 changes: 8 additions & 0 deletions docs/spelling_wordlist.txt
@@ -0,0 +1,8 @@
+cancelled

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Verify the inclusion of "cancelled" in the ignore list.

The PR aims to standardize on American English spelling, yet "cancelled" (British spelling) is added to the ignore list instead of being changed to "canceled" (American spelling). Please verify whether "cancelled" appears in contexts that cannot be changed (e.g., external dependencies, quoted text, API names).


🏁 Script executed:

#!/bin/bash
# Description: Verify usage of "cancelled" and "canceled" in the codebase.
# Expected: Understand why "cancelled" is in the ignore list.

echo "=== Searching for 'cancelled' (British spelling) ==="
rg -n -C3 --type-add 'code:*.{py,md,rst,txt,yaml,toml}' --type code 'cancelled'

echo -e "\n=== Searching for 'canceled' (American spelling) ==="
rg -n -C3 --type-add 'code:*.{py,md,rst,txt,yaml,toml}' --type code 'canceled'

Length of output: 244


Replace ‘cancelled’ with American ‘canceled’ in docs/spelling_wordlist.txt to align with the project’s American English standard.

🤖 Prompt for AI Agents
In docs/spelling_wordlist.txt around lines 1 to 1, the word "cancelled" uses
British spelling; replace it with the American spelling "canceled" (single L)
and save the file so the project’s American English spelling standard is
enforced.

+hsa
+ist
+LOD
+nd
+NotIn
+offen
+te
4 changes: 2 additions & 2 deletions examples/bitnet-1.58b/modeling_bitnet.py
@@ -1718,11 +1718,11 @@ def forward(
     ) -> Union[Tuple, QuestionAnsweringModelOutput]:
         r"""
         start_positions (`torch.LongTensor` of shape `(batch_size,)`, *optional*):
-            Labels for position (index) of the start of the labelled span for computing the token classification loss.
+            Labels for position (index) of the start of the labeled span for computing the token classification loss.
             Positions are clamped to the length of the sequence (`sequence_length`). Position outside of the sequence
             are not taken into account for computing the loss.
         end_positions (`torch.LongTensor` of shape `(batch_size,)`, *optional*):
-            Labels for position (index) of the end of the labelled span for computing the token classification loss.
+            Labels for position (index) of the end of the labeled span for computing the token classification loss.
             Positions are clamped to the length of the sequence (`sequence_length`). Position outside of the sequence
             are not taken into account for computing the loss.
         """
6 changes: 3 additions & 3 deletions examples/bitnet-1.58b/tokenization_bitnet.py
@@ -170,9 +170,9 @@ def __init__(

         if legacy is None:
             logger.warning_once(
-                f"You are using the default legacy behaviour of the {self.__class__}. This is"
+                f"You are using the default legacy behavior of the {self.__class__}. This is"
                 " expected, and simply means that the `legacy` (previous) behavior will be used so nothing changes for you."
-                " If you want to use the new behaviour, set `legacy=False`. This should only be set if you understand what it"
+                " If you want to use the new behavior, set `legacy=False`. This should only be set if you understand what it"
                 " means, and thoroughly read the reason why this was added as explained in"
                 " https://github.com/huggingface/transformers/pull/24565")
             legacy = True
@@ -215,7 +215,7 @@ def get_spm_processor(self, from_slow=False):
         with open(self.vocab_file, "rb") as f:
             sp_model = f.read()
             model_pb2 = import_protobuf(
-                f"The new behaviour of {self.__class__.__name__} (with `self.legacy = False`)")
+                f"The new behavior of {self.__class__.__name__} (with `self.legacy = False`)")
             model = model_pb2.ModelProto.FromString(sp_model)
             normalizer_spec = model_pb2.NormalizerSpec()
             normalizer_spec.add_dummy_prefix = False
7 changes: 4 additions & 3 deletions examples/deepseek_mla/amd/README.md
@@ -15,7 +15,7 @@ Key implementation differences between Hopper and MI300X architectures include:
 # Original shared memory allocation
 Q_shared = T.alloc_shared([block_H, dim], dtype)
 Q_pe_shared = T.alloc_shared([block_H, pe_dim], dtype)

 # Optimized register allocation
 Q_local = T.alloc_fragment([block_H, dim], dtype)
 Q_pe_local = T.alloc_fragment([block_H, pe_dim], dtype)
@@ -47,5 +47,6 @@ Notably, TileLang achieves performance parity with hand-optimized assembly kerne
 - Improve compute-to-memory access ratios
 - Enhance parallelism through dimension-wise task distribution

-## Acknowledgement
-We would like to express our sincere gratitude to the AMD ROCm and Composable Kernel team for their outstanding contributions. We have learned a great deal from the ROCm software stack.
+## Acknowledgment
+
+We would like to express our sincere gratitude to the AMD ROCm and Composable Kernel team for their outstanding contributions. We have learned a great deal from the ROCm software stack.
5 changes: 3 additions & 2 deletions examples/gdn/README.md
@@ -10,5 +10,6 @@

 The [chunk_delta_h](common/chunk_delta_h.py) implements the most critical forward kernel of GDN. It's a good start to understand the GDN logic and the TileLang optimization.

-## Acknowledgements
-This kernel was developed by Yu Cheng and Zhengju Tang following in-depth discussions with Xiaomi's LLM-Core Team (MiMo).
+## Acknowledgments
+
+This kernel was developed by Yu Cheng and Zhengju Tang following in-depth discussions with Xiaomi's LLM-Core Team (MiMo).
3 changes: 2 additions & 1 deletion pyproject.toml
@@ -81,7 +81,8 @@ column_limit = 100
 indent_width = 4

 [tool.codespell]
-ignore-words-list = "nd, te, ist, LOD, offen, NotIn, HSA"
+builtin = "clear,rare,en-GB_to_en-US"
+ignore-words = "docs/spelling_wordlist.txt"
 skip = [
     "build",
     "3rdparty",
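
The interplay of the two new `[tool.codespell]` settings can be pictured with a toy model: the `en-GB_to_en-US` builtin dictionary supplies British-to-American corrections, and any word listed in the `ignore-words` file is exempt from them. The sketch below is not codespell's implementation. The three-entry `GB_TO_US` mapping and the `find_british_spellings` helper are hypothetical stand-ins, and matching is case-insensitive purely to keep the example short.

```python
import re

# Hypothetical three-entry stand-in for a British-to-American dictionary.
GB_TO_US = {
    "behaviour": "behavior",
    "labelled": "labeled",
    "cancelled": "canceled",
}

# Stand-in for the ignore-words file (one exempt word per line), lowercased here.
IGNORE_WORDS = {"cancelled", "hsa", "ist", "lod", "nd", "notin", "offen", "te"}


def find_british_spellings(text, ignore=frozenset()):
    """Return (word, suggestion) pairs for dictionary hits that are not ignored."""
    hits = []
    for word in re.findall(r"[A-Za-z]+", text):
        key = word.lower()
        if key in GB_TO_US and key not in ignore:
            hits.append((word, GB_TO_US[key]))
    return hits


if __name__ == "__main__":
    sample = "The labelled span keeps the legacy behaviour unless cancelled."
    print(find_british_spellings(sample, IGNORE_WORDS))
    # [('labelled', 'labeled'), ('behaviour', 'behavior')]
    print(find_british_spellings(sample))
```

In this model, `cancelled` passes unflagged as long as it stays in the ignore list, which is the situation the review comment on `docs/spelling_wordlist.txt` questions.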
2 changes: 1 addition & 1 deletion tilelang/language/overrides/__init__.py
@@ -1,7 +1,7 @@
 """TileLang-specific runtime overrides.

 Importing this package registers custom handlers that extend or override
-behaviour from upstream TVMScript for TileLang semantics.
+behavior from upstream TVMScript for TileLang semantics.
 """

 # Register parser overrides upon import.