Merged
8 changes: 8 additions & 0 deletions .clang-format
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
---
BasedOnStyle: LLVM
UseTab: Never
IndentWidth: 2
ColumnLimit: 80

Language: Cpp
Standard: c++17
5 changes: 4 additions & 1 deletion .editorconfig
@@ -14,7 +14,10 @@ insert_final_newline = true
indent_size = 4

[*.{cpp,hpp,cxx,cc,c,h,cu,cuh}]
-indent_size = 4
+indent_size = 2
+
+[{*.cmake,CMakeLists.txt}]
+indent_size = 2

[*.{yaml,yml}]
indent_size = 2
9 changes: 9 additions & 0 deletions .gitattributes
@@ -1 +1,10 @@
* text eol=lf
*.bat eol=crlf

*.svg binary
*.jpg binary
*.jpeg binary
*.png binary
*.gif binary

*.h linguist-language=C++
7 changes: 7 additions & 0 deletions .gitignore
@@ -26,7 +26,14 @@ nnfusion.tar.gz
# makeenv and test intermediate files
tmp/

.env
.envrc
.venv
env/
venv/
ENV/
env.bak/
venv.bak/
.vscode/
.vs/

60 changes: 60 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,60 @@
# See https://pre-commit.com for more information
# See https://pre-commit.com/hooks.html for more hooks
ci:
autofix_prs: true
autofix_commit_msg: "[Lint]: [pre-commit.ci] auto fixes [...]"
autoupdate_commit_msg: "[CI] [pre-commit.ci] autoupdate"
autoupdate_schedule: monthly
default_stages: [pre-commit, pre-push, manual]
exclude: '^(build|3rdparty)/.*$' # exclude build and 3rdparty directories
repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v6.0.0
hooks:
- id: check-symlinks
- id: destroyed-symlinks
# FIXME: enable these hooks
# - id: trailing-whitespace
# - id: end-of-file-fixer
- id: check-added-large-files
- id: check-merge-conflict
fail_fast: true
# FIXME: enable these hooks
# - id: check-executables-have-shebangs
# - id: check-shebang-scripts-are-executable
- id: detect-private-key
- id: check-yaml
- id: check-toml
- id: check-ast
fail_fast: true
- id: debug-statements
- repo: https://github.com/pre-commit/mirrors-clang-format
rev: v15.0.7 # sync with requirements-lint.txt
hooks:
- id: clang-format
exclude: |
(?ix)(
^.+\.(cu|cuh)$|
^.+\.json$
)
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.14.0 # sync with requirements-lint.txt
hooks:
- id: ruff-check
args: [--fix, --exit-non-zero-on-fix]
- repo: https://github.com/google/yapf
rev: v0.43.0 # sync with requirements-lint.txt
hooks:
- id: yapf
args: [--recursive, --in-place]
- repo: https://github.com/codespell-project/codespell
rev: v2.4.1 # sync with requirements-lint.txt
hooks:
- id: codespell
additional_dependencies: [".[toml]"]
exclude: |
(?x)(
^.+\.(cpp|hpp|cxx|cc|c|h|cu|cuh)$|
^.+\.svg$|
^.*\brequirements\b.*\.txt$
)
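
The `exclude` patterns in this config are easier to reason about with a few concrete paths. Below is a minimal sketch, assuming pre-commit applies each pattern to the repository-relative path with Python's `re.search`. The `is_excluded` helper and the sample paths are hypothetical and only serve as illustration.

```python
import re

# Patterns copied from the .pre-commit-config.yaml hunk above.
GLOBAL_EXCLUDE = r"^(build|3rdparty)/.*$"
CLANG_FORMAT_EXCLUDE = r"""(?ix)(
    ^.+\.(cu|cuh)$|
    ^.+\.json$
)"""
CODESPELL_EXCLUDE = r"""(?x)(
    ^.+\.(cpp|hpp|cxx|cc|c|h|cu|cuh)$|
    ^.+\.svg$|
    ^.*\brequirements\b.*\.txt$
)"""


def is_excluded(pattern: str, path: str) -> bool:
    """Hypothetical helper: True if `path` matches the exclude `pattern`."""
    return re.search(pattern, path) is not None


# Sample paths are illustrative and need not exist in the repository.
samples = [
    ("global", GLOBAL_EXCLUDE, "3rdparty/tvm/python/setup.py"),
    ("global", GLOBAL_EXCLUDE, "src/build/helper.cc"),
    ("clang-format", CLANG_FORMAT_EXCLUDE, "src/kernel.CU"),
    ("clang-format", CLANG_FORMAT_EXCLUDE, "src/op/parallel.cc"),
    ("codespell", CODESPELL_EXCLUDE, "src/op/parallel.cc"),
    ("codespell", CODESPELL_EXCLUDE, "requirements-lint.txt"),
    ("codespell", CODESPELL_EXCLUDE, "docs/deeplearning_operators/matmul.md"),
]

for hook, pattern, path in samples:
    state = "excluded" if is_excluded(pattern, path) else "checked"
    print(f"{hook:13s} {path:40s} {state}")
```

Under these assumptions, the `(?i)` flag makes the clang-format exclusion case-insensitive, so `src/kernel.CU` is skipped. The global pattern is anchored at the start of the path, so a nested `src/build/` directory is still checked.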
4 changes: 2 additions & 2 deletions CMakeLists.txt
@@ -56,7 +56,7 @@ else()

# Set default build type to RelWithDebInfo if not provided
if(NOT CMAKE_BUILD_TYPE)
-# Set default build type to Release if not provided
+# Set default build type to Release if not provided
set(CMAKE_BUILD_TYPE Release CACHE STRING "Build type" FORCE)
message(STATUS "Setting default build type to ${CMAKE_BUILD_TYPE}")
endif()
@@ -199,7 +199,7 @@ if(USE_CUDA)
set(CUDA_MAJOR_VERSION ${CUDAToolkit_VERSION_MAJOR})
message(STATUS "Setting CUDA_MAJOR_VERSION=${CUDA_MAJOR_VERSION}")
add_compile_definitions(CUDA_MAJOR_VERSION=${CUDA_MAJOR_VERSION})

list(APPEND TILE_LANG_INCLUDES ${CUDAToolkit_INCLUDE_DIRS})
endif(USE_CUDA)

94 changes: 75 additions & 19 deletions CONTRIBUTING.md
@@ -2,14 +2,19 @@

That would be awesome if you want to contribute something to TileLang!

- [Contributing](CONTRIBUTING.md#contributing)
- [Reporting Bugs](CONTRIBUTING.md#reporting-bugs)
- [Asking Questions](CONTRIBUTING.md#asking-questions)
- [Submitting Pull Requests](CONTRIBUTING.md#submitting-pull-requests)
- [Repository Setup](CONTRIBUTING.md#repository-setup)
- [Running Tests](CONTRIBUTING.md#running-tests)
### Table of Contents <!-- omit in toc --> <!-- markdownlint-disable heading-increment -->

## Reporting Bugs
- [Report Bugs](#report-bugs)
- [Ask Questions](#ask-questions)
- [Submit Pull Requests](#submit-pull-requests)
- [Setup Development Environment](#setup-development-environment)
- [Install Develop Version](#install-develop-version)
- [Lint Check](#lint-check)
- [Test Locally](#test-locally)
- [Build Wheels](#build-wheels)
- [Documentation](#documentation)

## Report Bugs

If you run into any weird behavior while using TileLang, feel free to open a new issue in this repository! Please run a **search before opening** a new issue, to make sure that someone else hasn't already reported or solved the bug you've found.

@@ -18,35 +23,86 @@ Any issue you open must include:
- Code snippet that reproduces the bug with a minimal setup.
- A clear explanation of what the issue is.


## Asking Questions
## Ask Questions

Please ask questions in issues.

## Submitting Pull Requests
## Submit Pull Requests

All pull requests are super welcomed and greatly appreciated! Issues in need of a solution are marked with a [`♥ help`](https://github.com/ianstormtaylor/TileLang/issues?q=is%3Aissue+is%3Aopen+label%3A%22%E2%99%A5+help%22) label if you're looking for somewhere to start.

Please run `./format.sh` before submitting a pull request to make sure that your code is formatted correctly.
If you're new to contributing to TileLang, you can follow the following guidelines before submitting a pull request.

> [!NOTE]
> Please include tests and docs with every pull request if applicable!

## Setup Development Environment

Before contributing to TileLang, please follow the instructions below to setup.

1. Fork TileLang ([fork](https://github.com/tile-ai/tilelang/fork)) on GitHub and clone the repository.

```bash
git clone --recurse-submodules git@github.com:<your username>/tilelang.git # use the SSH protocol
cd tilelang

git remote add upstream git@github.com:tile-ai/tilelang.git
```

2. Setup a development environment:

```bash
uv venv --seed .venv # use `python3 -m venv .venv` if you don't have `uv`

source .venv/bin/activate
python3 -m pip install --upgrade pip setuptools wheel "build[uv]"
uv pip install --requirements requirements-dev.txt
```

3. Setup the [`pre-commit`](https://pre-commit.com) hooks:

```bash
pre-commit install --install-hooks
```

Please include tests and docs with every pull request!
Then you are ready to rock. Thanks for contributing to TileLang!

## Repository Setup
## Install Develop Version

To run the build, you need to have the TileLang repository cloned to your computer. After that, you need to `cd` into the directory where you cloned it, and install the dependencies with `python`:
To install TileLang in an "editable" mode, run:

```bash
python setup.py install
python3 -m pip install --no-build-isolation --verbose --editable .
```

in the main directory. This installation is removable by:

## Running Tests
```bash
python3 -m pip uninstall tilelang
```

## Lint Check

To check the linting, run:

```bash
pre-commit run --all-files
```

## Test Locally

To run the tests, start by building the project as described in the [Repository Setup](CONTRIBUTING.md#repository-setup) section.
To run the tests, start by building the project as described in the [Setup Development Environment](#setup-development-environment) section.

Then you can rerun the tests with:

```text
python -m pytest testing
```bash
python3 -m pytest testing
```

## Build Wheels

_TBA_

## Documentation

_TBA_
4 changes: 2 additions & 2 deletions docs/deeplearning_operators/matmul.md
@@ -8,7 +8,7 @@
:class: myclass1 myclass2
:name: a-tip-reference

-This document is still **experimental** and may be incomplete.
+This document is still **experimental** and may be incomplete.
Suggestions and improvements are highly encouraged—please submit a PR!
:::

@@ -256,4 +256,4 @@ For more advanced usage—including partial lowering, explicitly controlling thr
* [BitBLAS](https://github.com/tile-ai/bitblas)
* [Triton](https://github.com/openai/triton)
* [Cutlass](https://github.com/NVIDIA/cutlass)
-* [PyCUDA](https://documen.tician.de/pycuda/)
+* [PyCUDA](https://documen.tician.de/pycuda/) <!-- codespell:ignore -->
2 changes: 2 additions & 0 deletions examples/deepseek_v32/fp8_lighting_indexer.py
@@ -258,6 +258,7 @@ def ref_fp8_mqa_logits(q: torch.Tensor, kv: torch.Tensor, weights: torch.Tensor,
cost = mask.sum()
return logits, cost


def test_fp8_lighting_indexer(S=4096, SKV=8192, H=32, HKV=1, D=64, kv_stride=1):
q = torch.randn(S, H, D, device="cuda", dtype=torch.bfloat16).to(torch.bfloat16)
kv = torch.randn(SKV, D, device="cuda", dtype=torch.bfloat16).to(torch.bfloat16)
@@ -302,5 +303,6 @@ def logits_fn():
print(f"logits_tflops: {logits_tflops}, logits_ms: {logits_ms}")
print(f"cost_ref: {cost_ref}")


if __name__ == "__main__":
test_fp8_lighting_indexer()
11 changes: 10 additions & 1 deletion pyproject.toml
@@ -24,6 +24,11 @@ skip = [
".venv"
]

[tool.ruff]
target-version = "py38"
line-length = 100
output-format = "full"

[tool.ruff.lint]
select = [
# pycodestyle
@@ -48,13 +53,17 @@ ignore = [
"E741",
# line too long
"E501",
# if-else-block instead of ternary
"SIM108",
# key in dict.keys()
"SIM118",
# memory leaks
"B019",
# zip without explicit strict
"B905",
# No such file or directory
"E902",
]
[tool.ruff.lint.per-file-ignores]
"3rdparty/**/*" = ["ALL"]
-"examples/deepseek_v32/inference/**/*" = ["ALL"]
+"examples/deepseek_v32/inference/**/*" = ["ALL"]
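
The rule codes in `ignore` are easier to judge next to the patterns they would flag. Below is a small sketch with one example per code, based on the rule descriptions in the comments above. The function and class names are hypothetical. The `strict=` keyword of `zip()` only exists from Python 3.10 on, which is presumably why B905 is ignored alongside `target-version = "py38"`; that reading is an assumption, not something the diff states.

```python
from functools import lru_cache


def sign_label(x):
    # SIM108 would suggest a ternary here:
    #   label = "negative" if x < 0 else "non-negative"
    if x < 0:
        label = "negative"
    else:
        label = "non-negative"
    return label


def has_key(table, key):
    # SIM118 would suggest `key in table` instead of `key in table.keys()`.
    return key in table.keys()


def pair_up(names, sizes):
    # B905 flags zip() without an explicit strict= argument.
    # Without strict=True, zip() silently stops at the shorter input.
    return list(zip(names, sizes))


class Shape:
    """Hypothetical class used only to illustrate B019."""

    def __init__(self, dims):
        self.dims = tuple(dims)

    # B019 flags lru_cache on methods: the cache holds a reference to
    # `self`, so instances stay alive as long as the cache does.
    @lru_cache(maxsize=None)
    def volume(self):
        total = 1
        for d in self.dims:
            total *= d
        return total


print(sign_label(-1))
print(has_key({"M": 128}, "M"))
print(pair_up(["M", "N"], [128, 256, 512]))
print(Shape([2, 3, 4]).volume())
```

With these codes ignored, all four patterns pass `ruff check` unchanged. E902 is not shown because it reports a file that could not be read rather than a code pattern.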
9 changes: 4 additions & 5 deletions requirements-lint.txt
@@ -1,8 +1,7 @@
# formatting
-yapf==0.40.2
-toml==0.10.2
-tomli==2.0.1
-ruff==0.6.5
-codespell==2.3.0
+pre-commit
+yapf==0.43.0
+ruff==0.14.0
+codespell[toml]==2.4.1
clang-format==15.0.7
clang-tidy==18.1.8
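
The `# sync with requirements-lint.txt` comments in `.pre-commit-config.yaml` describe a manual invariant: each hook `rev` should match the pin in this file. Below is a minimal sketch of how that could be checked, using excerpts copied from the diff instead of reading the real files. The helper names and the repository-to-package mapping are assumptions made for illustration and are not part of this PR.

```python
import re

# Excerpts copied from the diff above; a real check would read the files.
PRE_COMMIT_EXCERPT = """\
  - repo: https://github.com/pre-commit/mirrors-clang-format
    rev: v15.0.7 # sync with requirements-lint.txt
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.14.0 # sync with requirements-lint.txt
  - repo: https://github.com/google/yapf
    rev: v0.43.0 # sync with requirements-lint.txt
  - repo: https://github.com/codespell-project/codespell
    rev: v2.4.1 # sync with requirements-lint.txt
"""

REQUIREMENTS_EXCERPT = """\
# formatting
pre-commit
yapf==0.43.0
ruff==0.14.0
codespell[toml]==2.4.1
clang-format==15.0.7
clang-tidy==18.1.8
"""

# Hypothetical mapping from hook repository name to PyPI package name.
REPO_TO_PACKAGE = {
    "mirrors-clang-format": "clang-format",
    "ruff-pre-commit": "ruff",
    "yapf": "yapf",
    "codespell": "codespell",
}


def hook_revs(text):
    """Return {package: version} for each known repo/rev pair."""
    pairs = re.findall(r"repo:\s*\S+/([\w.-]+)\s*\n\s*rev:\s*v?([\w.]+)", text)
    return {REPO_TO_PACKAGE[name]: rev for name, rev in pairs if name in REPO_TO_PACKAGE}


def requirement_pins(text):
    """Return {package: version} for lines of the form `name[extras]==version`."""
    pins = {}
    for line in text.splitlines():
        match = re.match(r"^([A-Za-z0-9_.-]+)(?:\[[^\]]*\])?==(\S+)$", line.strip())
        if match:
            pins[match.group(1)] = match.group(2)
    return pins


def mismatches(hooks, pins):
    """Return {package: (hook_rev, pinned_version)} where the two disagree."""
    return {pkg: (rev, pins.get(pkg)) for pkg, rev in hooks.items() if pins.get(pkg) != rev}


hooks = hook_revs(PRE_COMMIT_EXCERPT)
pins = requirement_pins(REQUIREMENTS_EXCERPT)
print(mismatches(hooks, pins))
```

For the versions in this PR the sketch reports no mismatches. Unpinned entries such as `pre-commit`, and packages without a hook such as `clang-tidy`, are ignored by design.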
8 changes: 4 additions & 4 deletions setup.py
@@ -417,7 +417,7 @@ def patch_libs(libpath):
subprocess.run([patchelf_path, '--set-rpath', '$ORIGIN', libpath])


-class TileLangBuilPydCommand(build_py):
+class TileLangBuildPyCommand(build_py):
"""Customized setuptools install command - builds TVM after setting up LLVM."""

def run(self):
@@ -643,7 +643,7 @@ def __init__(self, name, sourcedir=""):
self.sourcedir = os.path.abspath(sourcedir)


-class TilelangExtensionBuild(build_ext):
+class TileLangExtensionBuild(build_ext):
"""
Custom build_ext command for CMake-based projects.

@@ -929,8 +929,8 @@ def build_cmake(self, ext):
CythonExtension("TileLangCython", sourcedir="."),
],
cmdclass={
-"build_py": TileLangBuilPydCommand,
+"build_py": TileLangBuildPyCommand,
 "sdist": TileLangSdistCommand,
-"build_ext": TilelangExtensionBuild,
+"build_ext": TileLangExtensionBuild,
},
)
2 changes: 1 addition & 1 deletion src/layout/gemm_layouts.cc
@@ -588,7 +588,7 @@ Layout makeGemmVoltaABLayout(int stride, int continuous, bool is_a,

// ref:
// https://github.com/nvidia/cutlass/blob/ad7b2f5e84fcfa124cb02b91d5bd26d238c0459e/include/cutlass/layout/tensor_op_multiplicand_sm75.h#L54
-// Althought the four settings (T or NT) used distinct layouts in CUTLASS, they
+// Although the four settings (T or NT) used distinct layouts in CUTLASS, they
// appeared to result in the same mem layout
Layout makeTensorOpMultiplicand(int mat_stride, int mat_continuous,
int elementsize, int crosswise) {
4 changes: 2 additions & 2 deletions src/op/parallel.cc
@@ -215,9 +215,9 @@ LayoutMap ParallelOpNode::InferLayout(const LayoutInferArgs &T,
return {};
if (level == InferLevel::kStrict) {
LayoutMap results;
-// Deduce buffers that shoule be complicated replicated.
+// Deduce buffers that should be complicated replicated.
 // For example:
-// for i in T.Parllel(m):
+// for i in T.Parallel(m):
// fragment[0] = x[i]
// then fragment[0] must be replicated on all threads.
for (const auto &[buffer, indices] : indice_map_) {
2 changes: 1 addition & 1 deletion src/target/codegen_cuda.cc
@@ -2210,7 +2210,7 @@ void CodeGenTileLangCUDA::VisitExpr_(const BufferLoadNode *op,
DataType element_dtype = op->buffer->dtype;

int lanes = op->dtype.lanes();
-// delcare type.
+// declare type.
if (value_dtype.lanes() == element_dtype.lanes()) {
std::string ref = GetBufferRef(op->dtype, op->buffer.get(), index);
HandleVolatileLoads(ref, op, os);
2 changes: 1 addition & 1 deletion src/target/ptx.h
@@ -258,7 +258,7 @@ std::string PrintArriveBarrierAsm(const std::string &barrier);
* \brief Print ptx barrier arrival with expect tx operation using
* mbarrier.arrive.expect_tx \param barrier: The name of the barrier in shared
* memory. \param byte_count: Increases the tx count of the mbarrier object to
-* track completion of addtional async transactions.
+* track completion of additional async transactions.
*/
std::string PrintArriveBarrierExpectTxAsm(const std::string &barrier,
const std::string &byte_count);