[WIP] test: Fuzz #182
base: flow
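Summary of Changes
Hello @doraemonmj, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request introduces a comprehensive multi-kernel fuzzing framework that automates the generation and validation of PyPTO programs. The framework creates InCore kernels containing random combinations of operations and orchestrates them in sequential, branching, or mixed mode. With configuration-driven test generation and PyTorch as the golden reference implementation, it significantly improves the coverage and reliability of PyPTO testing.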
Code Review
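This PR introduces a full-featured multi-kernel fuzzing framework, which is a good addition to the project's testing capabilities. The code is clearly structured, with the different generator classes kept separate. I found several issues, including a critical filename typo, a bug in how the merge kernel's shape is handled, and some dead code and inconsistencies that can be cleaned up. Once these are fixed, the framework will be more robust and easier to maintain.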
if str(_TESTS_ST_DIR) not in sys.path:
    sys.path.insert(0, str(_TESTS_ST_DIR))

from fuzz.src.mutile_kernel_test_generator import MultiKernelTestGenerator
from .fuzzer import OpFuzzer, OpSpec
from .kernel_generator import KernelGenerator
from .mutile_kernel_test_generator import MultiKernelTestGenerator
@@ -0,0 +1,788 @@
"""
# Add the merge kernel (if needed)
if orch_info.get("needs_merge_kernel", False):
    merge_code = self.orch_gen.generate_merge_kernel(shape)
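In branching or mixed mode, when a merge kernel is needed, generate_merge_kernel is called with the default shape parameter rather than the unified output_shape computed in orch_info. This can lead to shape-mismatch errors in the merge kernel. The orch_info dictionary already contains the correct output shape, and it should be used to generate the merge kernel.
Suggested change:
- merge_code = self.orch_gen.generate_merge_kernel(shape)
+ merge_code = self.orch_gen.generate_merge_kernel(orch_info["output_shape"])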
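The difference between the two calls can be seen in a minimal sketch, assuming orch_info is a plain dict that carries needs_merge_kernel and output_shape as the comment describes. The helper pick_merge_shape is hypothetical and not part of the PR; it only illustrates preferring the unified shape over the default.

```python
from typing import Any, Dict, Optional, Tuple

Shape = Tuple[int, int]


def pick_merge_shape(orch_info: Dict[str, Any], default_shape: Shape) -> Optional[Shape]:
    """Return the shape the merge kernel should use, or None if no merge is needed.

    Hypothetical helper: prefers the unified output shape recorded in
    orch_info and falls back to the default only when that key is absent.
    """
    if not orch_info.get("needs_merge_kernel", False):
        return None
    return tuple(orch_info.get("output_shape", default_shape))


# The branches were unified to (64, 128), while the test default is (128, 128).
orch_info = {"needs_merge_kernel": True, "output_shape": (64, 128)}
merge_shape = pick_merge_shape(orch_info, default_shape=(128, 128))
print(merge_shape)  # (64, 128), not the default (128, 128)
```

Passing the default shape in this situation would generate a merge kernel for (128, 128) tensors while the branch outputs are (64, 128), which is the mismatch the comment warns about.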
│   ├── fuzzer.py                        # OpFuzzer core logic
│   ├── kernel_generator.py              # InCore kernel generator
│   ├── orchestrator_generator.py        # Orchestration function generator
│   └── mutile_kernel_test_generator.py  # Complete test case generator
"input_shapes_list": [
    [(128, 128), (128, 128)],  # kernel_0: two inputs with identical dimensions
],
"description": "Simple sequential execution: 2 kernels, inputs with identical dimensions",
def generate_numpy_reference(
    self,
    op_chain: List[Dict[str, Any]],
    input_tensors: Dict[str, Any],
) -> Any:
    """Generate NumPy golden reference from operation chain."""
    import numpy as np

    # Create variable environment
    env = {}
    for name, tensor in input_tensors.items():
        env[f"tile_{name}"] = tensor.copy()

    # Execute operations
    for op_dict in op_chain:
        op = op_dict["op"]
        inputs = op_dict["inputs"]
        output = op_dict["output"]
        params = op_dict.get("params")

        # Get input values
        input_vals = []
        for inp in inputs:
            if inp in env:
                val = env[inp]
            else:
                val = float(inp)
            input_vals.append(val)

        # Apply constraints
        if "avoid_zero" in op.constraints and op.constraints["avoid_zero"]:
            for i, val in enumerate(input_vals):
                if isinstance(val, np.ndarray):
                    input_vals[i] = np.where(np.abs(val) < 0.01, 1.0, val)

        if "positive_only" in op.constraints and op.constraints["positive_only"]:
            for i, val in enumerate(input_vals):
                if isinstance(val, np.ndarray):
                    input_vals[i] = np.abs(val) + 1e-6

        # Execute operation
        if op.np_equivalent:
            import inspect

            sig = inspect.signature(op.np_equivalent)
            if params and len(sig.parameters) > len(input_vals):
                result = op.np_equivalent(*input_vals, params)
            else:
                result = op.np_equivalent(*input_vals)
            env[output] = result

    # Return final result
    if op_chain:
        return env[op_chain[-1]["output"]]
    else:
        return input_tensors[list(input_tensors.keys())[0]]
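The dispatch step in the snippet above (pass params only when the reference function has room for one more argument) can be isolated in a small sketch. The function names call_with_optional_params, add, and scale are hypothetical stand-ins for op.np_equivalent and are not taken from the PR.

```python
import inspect
from typing import Any, Callable, Dict, List, Optional


def call_with_optional_params(
    fn: Callable[..., Any],
    input_vals: List[Any],
    params: Optional[Dict[str, Any]],
) -> Any:
    """Pass params as one extra positional argument only if fn declares room for it."""
    sig = inspect.signature(fn)
    if params and len(sig.parameters) > len(input_vals):
        return fn(*input_vals, params)
    return fn(*input_vals)


def add(a, b):
    # Two-input op that takes no params.
    return a + b


def scale(a, params):
    # One-input op that reads its factor from the params dict.
    return a * params["factor"]


print(call_with_optional_params(add, [2.0, 3.0], None))           # 5.0
print(call_with_optional_params(scale, [2.0], {"factor": 4.0}))   # 8.0
print(call_with_optional_params(add, [2.0, 3.0], {"unused": 1}))  # 5.0, params dropped
```

Two caveats apply to this pattern: a function declared with *args counts as a single parameter, so the length comparison can come out differently than expected, and inspect.signature raises ValueError for callables that do not expose a signature.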
This module is responsible for generating complete test cases, including:
- Multiple InCore kernels
- The Orchestration function
- A NumPy reference implementation
def _regenerate_kernel_code_with_unified_shapes(
    self,
    kernel: Dict[str, Any],
    input_shapes_map: Dict[str, Tuple[int, int]],
) -> str:
    """Regenerate the kernel code using the unified input shapes.

    Args:
        kernel: Kernel info dictionary
        input_shapes_map: Mapping of unified input shapes

    Returns:
        The regenerated kernel code
    """
    kernel_name = kernel["name"]
    output_shape = kernel["output_shape"]
    op_chain = kernel["op_chain"]
    rows, cols = output_shape

    # Build the function signature from the unified input shapes
    params = []
    for inp_name, _ in kernel["inputs"]:
        unified_shape = input_shapes_map[inp_name]
        params.append(f"{inp_name}: pl.Tensor[[{unified_shape[0]}, {unified_shape[1]}], pl.FP32]")
    # Add the output_tensor parameter
    params.append(f"output: pl.Tensor[[{rows}, {cols}], pl.FP32]")

    code_lines = [
        "    @pl.function(type=pl.FunctionType.InCore)",
        f"    def {kernel_name}(self, {', '.join(params)}) -> pl.Tensor[[{rows}, {cols}], pl.FP32]:",
    ]

    # Load the input tensors, using each input's actual defined shape
    for inp_name, _ in kernel["inputs"]:
        inp_shape = input_shapes_map[inp_name]
        code_lines.append(
            f"        tile_{inp_name} = pl.load({inp_name}, offsets=[0, 0], shapes=[{inp_shape[0]}, {inp_shape[1]}])"
        )

    # Generate the operation chain
    for op_dict in op_chain:
        op = op_dict["op"]
        inputs_str = ", ".join(op_dict["inputs"])
        output = op_dict["output"]
        params_dict = op_dict.get("params")

        # Strip the "block." prefix and call pl.xxx directly
        op_name = op.name.replace("block.", "")

        if params_dict:
            params_str = ", ".join(f"{k}={v}" for k, v in params_dict.items())
            code_lines.append(f"        {output} = pl.{op_name}({inputs_str}, {params_str})")
        else:
            code_lines.append(f"        {output} = pl.{op_name}({inputs_str})")

    # Store the result and return it
    if op_chain:
        last_output = op_chain[-1]["output"]
        code_lines.append(
            f"        result = pl.store({last_output}, offsets=[0, 0], shapes=[{rows}, {cols}], output_tensor=output)"
        )
        code_lines.append("        return result")
    else:
        # With no operations, store the first input directly
        first_input = kernel["inputs"][0][0]
        code_lines.append(
            f"        result = pl.store(tile_{first_input}, offsets=[0, 0], shapes=[{rows}, {cols}], output_tensor=output)"
        )
        code_lines.append("        return result")

    return "\n".join(code_lines)
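The signature-building part of the method above can be tried in isolation. The following is a minimal sketch under the assumption that inputs are identified by name and shapes are (rows, cols) tuples; tensor_annotation and build_signature are hypothetical helpers that only format strings and do not import or call the pl module.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, int]


def tensor_annotation(shape: Shape, dtype: str = "pl.FP32") -> str:
    """Format a tensor type annotation string for a 2D shape."""
    return f"pl.Tensor[[{shape[0]}, {shape[1]}], {dtype}]"


def build_signature(
    kernel_name: str,
    input_names: List[str],
    input_shapes_map: Dict[str, Shape],
    output_shape: Shape,
) -> str:
    """Build the def line: one parameter per input, then the output tensor."""
    params = [f"{name}: {tensor_annotation(input_shapes_map[name])}" for name in input_names]
    params.append(f"output: {tensor_annotation(output_shape)}")
    return f"def {kernel_name}(self, {', '.join(params)}) -> {tensor_annotation(output_shape)}:"


sig = build_signature(
    "kernel_0",
    ["a", "b"],
    {"a": (128, 128), "b": (64, 128)},
    (128, 128),
)
print(sig)
```

Each input keeps its own unified shape in the annotation, while the return type always follows the output shape, which mirrors how the method treats inputs and output separately.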
def generate_test_file(
    self,
    output_path: str,
    test_configs: List[Dict[str, Any]],
) -> None:
    """Generate the complete test file.

    Args:
        output_path: Output file path
        test_configs: List of test configurations; each configuration contains:
            - name: Test name
            - num_kernels: Number of kernels
            - mode: Composition mode
            - shape: Tensor shape
            - num_ops_range: Range for the number of operations
            - tensor_init_type: Tensor initialization type (optional)
    """
    # Generate the file header
    header = '''"""
Auto-generated multi-kernel fuzz test cases

This file is generated automatically by MultiKernelTestGenerator.
It contains multiple test cases, each consisting of several InCore kernels and one Orchestration function.
"""

import sys
from pathlib import Path
from typing import Any, List

import torch
import pytest

from harness.core.harness import DataType, PTOTestCase, TensorSpec


'''

    # Generate all test cases
    test_cases = []
    for config in test_configs:
        test_code = self.generate_test_case(
            test_name=config["name"],
            num_kernels=config.get("num_kernels", 3),
            orchestration_mode=config.get("mode", "sequential"),
            shape=config.get("shape", (128, 128)),
            num_ops_range=config.get("num_ops_range", (3, 7)),
            input_shapes_list=config.get("input_shapes_list"),
            tensor_init_type=config.get("tensor_init_type"),
        )
        test_cases.append(test_code)

    # Generate the test suite
    test_suite = '''

class TestMultiKernelFuzzing:
    """Multi-kernel fuzz test suite"""

'''

    for config in test_configs:
        test_name = config["name"]
        class_name = f"Test{test_name.replace('_', ' ').title().replace(' ', '')}"
        test_suite += f'''    def test_{test_name}(self, test_runner):
        """Test {test_name}"""
        test_case = {class_name}()
        result = test_runner.run(test_case)
        assert result.passed, f"Test failed: {{result.error}}"

'''

    # Assemble the complete file
    full_content = header + "\n\n".join(test_cases) + test_suite

    # Write the file
    output_file = Path(output_path)
    output_file.parent.mkdir(parents=True, exist_ok=True)
    output_file.write_text(full_content, encoding="utf-8")

    print(f"Test file generated: {output_path}")
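The class name that the suite instantiates is derived from the test name with str.title(). A short sketch of that derivation, using the same expression as the snippet, shows what it produces for a few hypothetical test names; class_name_for is an illustrative wrapper, not a function from the PR.

```python
def class_name_for(test_name: str) -> str:
    """Derive the test class name the same way the generator snippet does."""
    return f"Test{test_name.replace('_', ' ').title().replace(' ', '')}"


# Hypothetical test names chosen to show how str.title() treats digits and capitals.
for name in ["simple_sequential", "branch_mixed_3", "conv_2d", "FP32_add"]:
    print(name, "->", class_name_for(name))

# simple_sequential -> TestSimpleSequential
# branch_mixed_3 -> TestBranchMixed3
# conv_2d -> TestConv2D
# FP32_add -> TestFp32Add
```

Note that str.title() uppercases a letter that follows a digit and lowercases the rest of each word, so conv_2d becomes TestConv2D and FP32_add becomes TestFp32Add. The class definitions emitted for each test case presumably need to use the same derivation, otherwise the generated suite would reference a class name that does not exist.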
Add Multi-Kernel Fuzzing Framework for PyPTO
Summary
Adds an automated fuzzing framework for generating and testing multi-kernel PyPTO programs with random operator combinations.
What's New
Fuzzing Framework (tests/st/fuzz/)
- OpFuzzer: Generates random operation chains with 15+ operators (add, mul, log, exp, etc.)
- KernelGenerator: Creates InCore kernel functions with configurable inputs
- OrchestratorGenerator: Supports 3 composition modes (sequential, branching, mixed)
- MultiKernelTestGenerator: Generates complete test cases with PyTorch golden reference
Configuration-Based Generation
example_multi_kernel.py
Enhanced Error Reporting
Usage
0213
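This PR introduces a standalone fuzzing framework under tests/st/fuzz/ for automatically generating and validating PyPTO programs composed of multiple InCore kernels.
Main features:
- Randomly generates kernel functions that satisfy the hardware constraints (32-byte alignment, a single pipeline type);
- Supports three kernel composition modes: sequential, branching, and mixed;
- Automatically generates a PyTorch reference implementation, with configurable atol/rtol tolerances;
- All logic is self-contained and does not depend on external fuzz tools.
Test cases are configured through example_multi_kernel.py; the generated code is placed in the generated/ directory and can be run directly with pytest.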
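For readers unfamiliar with atol/rtol, the elementwise rule that torch.allclose documents is |actual - expected| <= atol + rtol * |expected|. The sketch below restates that rule in plain Python so it runs without PyTorch; is_close and all_close are hypothetical helpers, and how the framework itself applies the configured tolerances is not shown in this PR excerpt.

```python
from typing import Sequence


def is_close(actual: float, expected: float, rtol: float = 1e-5, atol: float = 1e-8) -> bool:
    """Elementwise closeness check: |actual - expected| <= atol + rtol * |expected|."""
    return abs(actual - expected) <= atol + rtol * abs(expected)


def all_close(
    actual: Sequence[float],
    expected: Sequence[float],
    rtol: float = 1e-5,
    atol: float = 1e-8,
) -> bool:
    """True if both sequences have equal length and every pair is close."""
    return len(actual) == len(expected) and all(
        is_close(a, e, rtol, atol) for a, e in zip(actual, expected)
    )


device_out = [1.0, 2.0]      # stand-in for values computed by the kernels
golden_out = [1.0005, 2.0]   # stand-in for the golden reference

print(all_close(device_out, golden_out, rtol=1e-3, atol=1e-5))  # True
print(all_close(device_out, golden_out, rtol=1e-5, atol=1e-5))  # False
```

The relative term scales with the magnitude of the reference value, so a looser rtol mainly helps for large outputs, while atol sets the floor for values near zero.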