@doraemonmj doraemonmj commented Feb 12, 2026

Add Multi-Kernel Fuzzing Framework for PyPTO

Summary

Adds an automated fuzzing framework for generating and testing multi-kernel PyPTO programs with random operator combinations.

What's New

  • Fuzzing Framework (tests/st/fuzz/)

    • OpFuzzer: Generates random operation chains with 15+ operators (add, mul, log, exp, etc.)
    • KernelGenerator: Creates InCore kernel functions with configurable inputs
    • OrchestratorGenerator: Supports 3 composition modes (sequential, branching, mixed)
    • MultiKernelTestGenerator: Generates complete test cases with PyTorch golden reference
  • Configuration-Based Generation

    • Define test configs in example_multi_kernel.py
    • Control: kernel count, op count, tensor shapes, init types, random seeds
    • Generate multiple test instances from single config
  • Enhanced Error Reporting

    • Shows first 10 mismatched values with actual/expected/diff
    • Displays error statistics (max/mean absolute/relative errors)
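
The mismatch report described above can be sketched in a few lines. This is a dependency-free illustration, not the framework's actual code: the function name `format_mismatch_report`, its parameters, and the output wording are all assumptions. It applies the same closeness rule as `torch.allclose`, `|actual - expected| <= atol + rtol * |expected|`, to flat sequences of floats.

```python
from typing import List, Sequence


def format_mismatch_report(
    actual: Sequence[float],
    expected: Sequence[float],
    atol: float = 1e-5,
    rtol: float = 1e-5,
    max_shown: int = 10,
) -> List[str]:
    """Return report lines for elements outside the atol/rtol tolerance.

    Hypothetical helper: the name and output format are illustrative only.
    """
    if len(actual) != len(expected):
        raise ValueError("actual and expected must have the same length")

    abs_errs = [abs(a - e) for a, e in zip(actual, expected)]
    # Guard against division by zero when the expected value is 0.
    rel_errs = [d / max(abs(e), 1e-12) for d, e in zip(abs_errs, expected)]

    # An element mismatches when |a - e| > atol + rtol * |e|.
    bad = [
        i
        for i, (d, e) in enumerate(zip(abs_errs, expected))
        if d > atol + rtol * abs(e)
    ]

    lines = [f"{len(bad)} of {len(actual)} values mismatched"]
    # Show only the first `max_shown` mismatches to keep the report readable.
    for i in bad[:max_shown]:
        lines.append(
            f"[{i}] actual={actual[i]:.6g} expected={expected[i]:.6g} "
            f"diff={abs_errs[i]:.3g}"
        )
    if abs_errs:
        mean_abs = sum(abs_errs) / len(abs_errs)
        mean_rel = sum(rel_errs) / len(rel_errs)
        lines.append(f"abs error: max={max(abs_errs):.3g} mean={mean_abs:.3g}")
        lines.append(f"rel error: max={max(rel_errs):.3g} mean={mean_rel:.3g}")
    return lines
```

Capping the listing at ten entries keeps a failing tensor from flooding the pytest output, while the max/mean statistics still summarize how far off the whole result is.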

Usage

# Generate tests
python tests/st/fuzz/example_multi_kernel.py

# Run tests
pytest tests/st/fuzz/generated/test_fuzz_multi_kernel.py -v

0213

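This PR adds a standalone fuzzing framework under tests/st/fuzz/ that automatically generates and validates PyPTO programs composed of multiple InCore kernels.
Key features:
  • Randomly generates kernel functions that satisfy the hardware constraints (32-byte alignment, a single pipeline type).
  • Supports three kernel composition modes: sequential, branching, and mixed.
  • Automatically generates a PyTorch reference implementation, with configurable atol/rtol tolerances.
  • All logic is self-contained and does not depend on external fuzzing tools.
Test cases are configured in example_multi_kernel.py. The generated code is written to the generated/ directory and can be run directly with pytest.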
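
As a rough illustration of the 32-byte alignment constraint, the sketch below checks a 2D FP32 shape. It assumes the constraint applies to the byte size of one row (the innermost dimension), which the PR description does not spell out. The helper names `is_row_aligned` and `round_up_cols` are hypothetical and are not part of the framework.

```python
from typing import Tuple

ALIGN_BYTES = 32  # alignment stated in the PR description
FP32_BYTES = 4    # size of one FP32 element


def is_row_aligned(shape: Tuple[int, int], elem_bytes: int = FP32_BYTES) -> bool:
    """True if one row of the tile occupies a whole number of 32-byte units.

    Assumption: alignment is checked on the innermost (column) dimension.
    """
    rows, cols = shape
    return rows > 0 and cols > 0 and (cols * elem_bytes) % ALIGN_BYTES == 0


def round_up_cols(cols: int, elem_bytes: int = FP32_BYTES) -> int:
    """Smallest column count >= cols whose row size is 32-byte aligned.

    Assumes elem_bytes divides ALIGN_BYTES evenly (true for 1, 2, 4, 8).
    """
    per_unit = ALIGN_BYTES // elem_bytes  # 8 FP32 elements per 32 bytes
    return -(-cols // per_unit) * per_unit  # ceiling division, then scale back
```

Under this assumption a (128, 128) FP32 tile is aligned, while a (128, 100) tile is not and would need 104 columns.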

@gemini-code-assist
Contributor

Summary of Changes

Hello @doraemonmj, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

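This pull request introduces a comprehensive multi-kernel fuzzing framework that automates the generation and validation of PyPTO programs. The framework creates InCore kernels from random operator combinations and orchestrates them in sequential, branching, or mixed mode. With configuration-driven test generation and PyTorch as the golden reference implementation, it significantly improves the coverage and reliability of PyPTO testing.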

Highlights

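  • New multi-kernel fuzzing framework: Introduces an automated framework for generating and testing multi-kernel PyPTO programs, with support for random operator combinations.
  • Core components: The framework comprises OpFuzzer (operator chain generation), KernelGenerator (InCore kernel generation), OrchestratorGenerator (orchestration function generation), and MultiKernelTestGenerator (complete test case generation).
  • Configuration-based generation: Test case generation can now be controlled in detail through configuration, including kernel count, operation range, tensor shapes, and initialization types.
  • Enhanced error reporting: Error reports now show the first 10 mismatched values along with detailed error statistics (max/mean absolute/relative error).
  • PyTorch golden reference: All golden data generation has moved from NumPy to PyTorch to improve the accuracy and compatibility of the reference implementation.
  • Simplified command-line arguments: The old command-line arguments were removed. All configuration is now managed through a unified all_configs structure, and only the output path and error tolerance arguments remain.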

Changelog
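  • tests/st/fuzz/README.md
    • Added detailed documentation for the multi-kernel fuzzing framework, covering quick start, code structure, core concepts, configuration guide, operator rules, usage examples, changelog, and caveats.
  • tests/st/fuzz/__init__.py
    • Added the package initialization file for the fuzzing framework, which exports the core generator classes.
  • tests/st/fuzz/conftest.py
    • Added a pytest configuration file that inherits the parent test configuration.
  • tests/st/fuzz/example_multi_kernel.py
    • Added an example script that generates multi-kernel fuzz test cases and accepts command-line arguments.
  • tests/st/fuzz/src/__init__.py
    • Added the initialization file for the framework's internal implementation module.
  • tests/st/fuzz/src/fuzzer.py
    • Added the OpFuzzer class, which generates random operator chains and includes shape alignment checks and NumPy reference implementation logic.
  • tests/st/fuzz/src/kernel_generator.py
    • Added the KernelGenerator class, which generates InCore kernel functions with random operation chains.
  • tests/st/fuzz/src/mutile_kernel_test_generator.py
    • Added the MultiKernelTestGenerator class, which generates complete PyPTO test cases, including kernels, orchestration functions, and a PyTorch golden reference implementation.
  • tests/st/fuzz/src/orchestrator_generator.py
    • Added the OrchestratorGenerator class, which generates Orchestration functions in sequential, branching, and mixed modes.
  • tests/st/harness/adapters/golden_generator.py
    • Updated the golden reference generator so that it conditionally imports torch based on the contents of the compute_expected method.
Activity
  • There has been no additional comment or review activity since this pull request was created.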

@doraemonmj doraemonmj changed the title from "[WIP] Fuzz" to "[WIP] test: Fuzz" Feb 12, 2026

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

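This PR introduces a full-featured multi-kernel fuzzing framework, a welcome addition to the project's testing capabilities. The code is clearly structured, with the generator classes separated by responsibility. I found several issues: a critical typo in a file name, a bug in how the merge kernel's shape is handled, and some dead code and inconsistencies that can be cleaned up. Once these are fixed, the framework will be more robust and easier to maintain.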

if str(_TESTS_ST_DIR) not in sys.path:
    sys.path.insert(0, str(_TESTS_ST_DIR))

from fuzz.src.mutile_kernel_test_generator import MultiKernelTestGenerator

high

This import statement stems from a typo in the file name: mutile_kernel_test_generator.py should be multi_kernel_test_generator.py. Please update the import path here after renaming the file.

Suggested change
- from fuzz.src.mutile_kernel_test_generator import MultiKernelTestGenerator
+ from fuzz.src.multi_kernel_test_generator import MultiKernelTestGenerator


from .fuzzer import OpFuzzer, OpSpec
from .kernel_generator import KernelGenerator
from .mutile_kernel_test_generator import MultiKernelTestGenerator

high

This import statement stems from a typo in the file name: mutile_kernel_test_generator.py should be multi_kernel_test_generator.py. Please update the import path here after renaming the file.

Suggested change
- from .mutile_kernel_test_generator import MultiKernelTestGenerator
+ from .multi_kernel_test_generator import MultiKernelTestGenerator

@@ -0,0 +1,788 @@
"""

high

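The file name mutile_kernel_test_generator.py contains a typo and should be multi_kernel_test_generator.py. The problem affects import statements in several files (fuzz/src/__init__.py, example_multi_kernel.py) and the documentation (README.md). A global fix is recommended to keep the project consistent.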


# Add the merge kernel (if needed)
if orch_info.get("needs_merge_kernel", False):
    merge_code = self.orch_gen.generate_merge_kernel(shape)

high

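In branching and mixed modes, when a merge kernel is needed, generate_merge_kernel is called with the default shape argument rather than the unified output_shape computed in orch_info. This can cause shape-mismatch errors in the merge kernel. The orch_info dictionary already contains the correct output shape, which should be used to generate the merge kernel.

Suggested change
- merge_code = self.orch_gen.generate_merge_kernel(shape)
+ merge_code = self.orch_gen.generate_merge_kernel(orch_info["output_shape"])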

│ ├── fuzzer.py # OpFuzzer core logic
│ ├── kernel_generator.py # InCore kernel generator
│ ├── orchestrator_generator.py # Orchestration function generator
│ └── mutile_kernel_test_generator.py # Complete test case generator

medium

The file name contains a typo: mutile_kernel_test_generator.py should be multi_kernel_test_generator.py.

Suggested change
- │ └── mutile_kernel_test_generator.py # Complete test case generator
+ │ └── multi_kernel_test_generator.py # Complete test case generator

"input_shapes_list": [
    [(128, 128), (128, 128)],  # kernel_0: two inputs with identical shapes
],
"description": "Simple sequential execution: 2 kernels, identically shaped inputs",

medium

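The description of this configuration reads "Simple sequential execution: 2 kernels, identically shaped inputs", but num_kernels in the configuration is 1 (line 82). Please make the configuration value and the description consistent.

Suggested change
- "description": "Simple sequential execution: 2 kernels, identically shaped inputs",
+ "description": "Simple sequential execution: 1 kernel, identically shaped inputs",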

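One way to keep descriptions from drifting away from the values is to derive them, and to check structural consistency before generating anything. The sketch below is only an illustration: `check_config` and `describe` are hypothetical helpers, the config name `seq_simple` is made up, and the rule that input_shapes_list holds one entry per kernel is an assumption inferred from the `kernel_0` comment above. The defaults (3 kernels, sequential mode) mirror those used in generate_test_file.

```python
from typing import Any, Dict, List

VALID_MODES = {"sequential", "branching", "mixed"}


def check_config(config: Dict[str, Any]) -> List[str]:
    """Return human-readable problems; an empty list means no issue was found."""
    problems = []
    num_kernels = config.get("num_kernels", 3)
    mode = config.get("mode", "sequential")
    if mode not in VALID_MODES:
        problems.append(f"unknown mode {mode!r}")
    shapes = config.get("input_shapes_list")
    # Assumption: input_shapes_list holds exactly one entry per kernel.
    if shapes is not None and len(shapes) != num_kernels:
        problems.append(
            f"num_kernels={num_kernels} but input_shapes_list has {len(shapes)} entries"
        )
    return problems


def describe(config: Dict[str, Any]) -> str:
    """Build the description from the config values so the two cannot disagree."""
    mode = config.get("mode", "sequential")
    num_kernels = config.get("num_kernels", 3)
    return f"{mode}: {num_kernels} kernel(s)"


# Hypothetical config modeled on the snippet under review.
example = {
    "name": "seq_simple",
    "num_kernels": 1,
    "mode": "sequential",
    "input_shapes_list": [[(128, 128), (128, 128)]],
}
```

Running `check_config(example)` before generation would surface a mismatched kernel count early, instead of leaving it to a reviewer to spot.
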
Comment on lines +501 to +556
def generate_numpy_reference(
    self,
    op_chain: List[Dict[str, Any]],
    input_tensors: Dict[str, Any],
) -> Any:
    """Generate NumPy golden reference from operation chain."""
    import numpy as np

    # Create variable environment
    env = {}
    for name, tensor in input_tensors.items():
        env[f"tile_{name}"] = tensor.copy()

    # Execute operations
    for op_dict in op_chain:
        op = op_dict["op"]
        inputs = op_dict["inputs"]
        output = op_dict["output"]
        params = op_dict.get("params")

        # Get input values
        input_vals = []
        for inp in inputs:
            if inp in env:
                val = env[inp]
            else:
                val = float(inp)
            input_vals.append(val)

        # Apply constraints
        if "avoid_zero" in op.constraints and op.constraints["avoid_zero"]:
            for i, val in enumerate(input_vals):
                if isinstance(val, np.ndarray):
                    input_vals[i] = np.where(np.abs(val) < 0.01, 1.0, val)

        if "positive_only" in op.constraints and op.constraints["positive_only"]:
            for i, val in enumerate(input_vals):
                if isinstance(val, np.ndarray):
                    input_vals[i] = np.abs(val) + 1e-6

        # Execute operation
        if op.np_equivalent:
            import inspect

            sig = inspect.signature(op.np_equivalent)
            if params and len(sig.parameters) > len(input_vals):
                result = op.np_equivalent(*input_vals, params)
            else:
                result = op.np_equivalent(*input_vals)
            env[output] = result

    # Return final result
    if op_chain:
        return env[op_chain[-1]["output"]]
    else:
        return input_tensors[list(input_tensors.keys())[0]]

medium

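The generate_numpy_reference method does not appear to be called anywhere in the current framework. Since the reference implementation has moved to PyTorch, consider removing it to simplify the codebase if it is indeed no longer used.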

This module generates complete test cases, including:
- Multiple InCore kernels
- Orchestration functions
- NumPy reference implementation

medium

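The class docstring mentions a "NumPy reference implementation", which no longer matches the code's actual behavior because the reference implementation now uses PyTorch. Please update the docstring to reflect the current implementation.

Suggested change
-   - NumPy reference implementation
+   - PyTorch reference implementation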

Comment on lines +172 to +242
def _regenerate_kernel_code_with_unified_shapes(
    self,
    kernel: Dict[str, Any],
    input_shapes_map: Dict[str, Tuple[int, int]],
) -> str:
    """Regenerate the kernel code using the unified input shapes.

    Args:
        kernel: Kernel info dictionary
        input_shapes_map: Unified input shape mapping

    Returns:
        The regenerated kernel code
    """
    kernel_name = kernel["name"]
    output_shape = kernel["output_shape"]
    op_chain = kernel["op_chain"]
    rows, cols = output_shape

    # Build the function signature from the unified input shapes
    params = []
    for inp_name, _ in kernel["inputs"]:
        unified_shape = input_shapes_map[inp_name]
        params.append(f"{inp_name}: pl.Tensor[[{unified_shape[0]}, {unified_shape[1]}], pl.FP32]")
    # Add the output tensor parameter
    params.append(f"output: pl.Tensor[[{rows}, {cols}], pl.FP32]")

    code_lines = [
        "    @pl.function(type=pl.FunctionType.InCore)",
        f"    def {kernel_name}(self, {', '.join(params)}) -> pl.Tensor[[{rows}, {cols}], pl.FP32]:",
    ]

    # Load the input tensors, using each input's actual defined shape
    for inp_name, _ in kernel["inputs"]:
        inp_shape = input_shapes_map[inp_name]
        code_lines.append(
            f"        tile_{inp_name} = pl.load({inp_name}, offsets=[0, 0], shapes=[{inp_shape[0]}, {inp_shape[1]}])"
        )

    # Generate the operation chain
    for op_dict in op_chain:
        op = op_dict["op"]
        inputs_str = ", ".join(op_dict["inputs"])
        output = op_dict["output"]
        params_dict = op_dict.get("params")

        # Strip the "block." prefix and call pl.xxx directly
        op_name = op.name.replace("block.", "")

        if params_dict:
            params_str = ", ".join(f"{k}={v}" for k, v in params_dict.items())
            code_lines.append(f"        {output} = pl.{op_name}({inputs_str}, {params_str})")
        else:
            code_lines.append(f"        {output} = pl.{op_name}({inputs_str})")

    # Store the result and return it
    if op_chain:
        last_output = op_chain[-1]["output"]
        code_lines.append(
            f"        result = pl.store({last_output}, offsets=[0, 0], shapes=[{rows}, {cols}], output_tensor=output)"
        )
        code_lines.append("        return result")
    else:
        # With no operations, store the first input directly
        first_input = kernel["inputs"][0][0]
        code_lines.append(
            f"        result = pl.store(tile_{first_input}, offsets=[0, 0], shapes=[{rows}, {cols}], output_tensor=output)"
        )
        code_lines.append("        return result")

    return "\n".join(code_lines)

medium

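The _regenerate_kernel_code_with_unified_shapes method duplicates a large amount of code from the _generate_kernel_code method in kernel_generator.py. To improve maintainability, consider refactoring this logic, for example by extracting the common parts into a shared helper function, to avoid maintenance problems later.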
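
A possible shape for such a shared helper is sketched below. It covers only the operation-chain loop, which the quoted method spells out in full. The function name `render_op_chain` is hypothetical, and the ops in the demo (`block.add`, `block.scale`, the `factor` parameter) are made-up stand-ins, not real PyPTO operators.

```python
from types import SimpleNamespace
from typing import Any, Dict, List


def render_op_chain(op_chain: List[Dict[str, Any]], indent: str = "        ") -> List[str]:
    """Render one `out = pl.op(...)` source line per entry in op_chain.

    Each entry is expected to carry "op" (an object with a `.name` such as
    "block.add"), "inputs", "output", and optionally "params".
    """
    lines = []
    for op_dict in op_chain:
        # Strip the "block." prefix so the call becomes pl.<op_name>(...)
        op_name = op_dict["op"].name.replace("block.", "")
        args = ", ".join(op_dict["inputs"])
        params = op_dict.get("params")
        if params:
            args += ", " + ", ".join(f"{k}={v}" for k, v in params.items())
        lines.append(f"{indent}{op_dict['output']} = pl.{op_name}({args})")
    return lines


# Stand-in ops for illustration; real entries would come from OpFuzzer.
demo_chain = [
    {"op": SimpleNamespace(name="block.add"), "inputs": ["tile_a", "tile_b"], "output": "tmp_0"},
    {
        "op": SimpleNamespace(name="block.scale"),
        "inputs": ["tmp_0"],
        "output": "tmp_1",
        "params": {"factor": 2.0},
    },
]
```

Both code generators could then call `code_lines.extend(render_op_chain(op_chain))`, so a change to how operator calls are rendered would need to be made in one place only.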

Comment on lines 710 to 788
def generate_test_file(
    self,
    output_path: str,
    test_configs: List[Dict[str, Any]],
) -> None:
    """Generate a complete test file.

    Args:
        output_path: Output file path
        test_configs: List of test configurations, each containing:
            - name: Test name
            - num_kernels: Number of kernels
            - mode: Composition mode
            - shape: Tensor shape
            - num_ops_range: Range for the number of operations
            - tensor_init_type: Tensor initialization type (optional)
    """
    # Generate the file header
    header = '''"""
Auto-generated multi-kernel fuzz test cases

This file is generated automatically by MultiKernelTestGenerator.
It contains multiple test cases, each with several InCore kernels and one Orchestration function.
"""

import sys
from pathlib import Path
from typing import Any, List

import torch
import pytest

from harness.core.harness import DataType, PTOTestCase, TensorSpec


'''

    # Generate all test cases
    test_cases = []
    for config in test_configs:
        test_code = self.generate_test_case(
            test_name=config["name"],
            num_kernels=config.get("num_kernels", 3),
            orchestration_mode=config.get("mode", "sequential"),
            shape=config.get("shape", (128, 128)),
            num_ops_range=config.get("num_ops_range", (3, 7)),
            input_shapes_list=config.get("input_shapes_list"),
            tensor_init_type=config.get("tensor_init_type"),
        )
        test_cases.append(test_code)

    # Generate the test suite
    test_suite = '''

class TestMultiKernelFuzzing:
    """Multi-kernel fuzzing test suite"""

'''

    for config in test_configs:
        test_name = config["name"]
        class_name = f"Test{test_name.replace('_', ' ').title().replace(' ', '')}"
        test_suite += f'''    def test_{test_name}(self, test_runner):
        """Test {test_name}"""
        test_case = {class_name}()
        result = test_runner.run(test_case)
        assert result.passed, f"Test failed: {{result.error}}"

'''

    # Assemble the complete file
    full_content = header + "\n\n".join(test_cases) + test_suite

    # Write the file
    output_file = Path(output_path)
    output_file.parent.mkdir(parents=True, exist_ok=True)
    output_file.write_text(full_content, encoding="utf-8")

    print(f"Test file generated: {output_path}")

medium

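The generate_test_file method does not appear to be used. The example_multi_kernel.py script implements its own logic for generating and writing the file contents. If this method is redundant, consider removing it to keep the code tidy.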


1 participant