
[WIP] Introduce benchgc, a benchmark tool for graph compiler #60


Closed
wants to merge 3 commits

Conversation

WangJialei-A
Contributor

@WangJialei-A WangJialei-A commented May 10, 2024

Try to resolve #58

Tasks

  • Build MLIR GC library
  • Integrate MLIR Python binding
  • Pipeline Execution
  • Upload MLIR Cases
  • Pytorch correctness check references
  • Data filling for benchmark
  • Compare threshold & strategy

@kurapov-peter
Contributor

What is this? Why do we introduce yet another graph representation, and how is it related to correctness?

@ZhennanQin
Contributor

What is this? Why do we introduce yet another graph representation, and how is it related to correctness?

It's not another graph representation. This benchmark tool accepts an .mlir file written in the oneDNN graph dialect, compiles it, and validates it against a PyTorch reference. The tool will validate correctness in the nightly and weekly tests.
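The compile-and-validate flow described above can be sketched roughly as follows. This is an illustrative sketch, not benchgc's actual API: `validate`, `allclose`, and the stub kernels are hypothetical names, and plain Python lists stand in for the compiled MLIR module's output and the PyTorch reference tensors.

```python
# Hypothetical sketch of the validate-against-reference step. In the real
# tool, `compiled_fn` would be the kernel compiled from the .mlir case and
# `reference_fn` the PyTorch eager reference; both are stubbed here.

def allclose(out, ref, rtol=1e-5, atol=1e-8):
    """Elementwise |out - ref| <= atol + rtol * |ref| (torch.allclose-style)."""
    return all(abs(o - r) <= atol + rtol * abs(r) for o, r in zip(out, ref))

def validate(compiled_fn, reference_fn, inputs, rtol=1e-5):
    out = compiled_fn(*inputs)       # result from the compiled module
    ref = reference_fn(*inputs)      # reference result
    return allclose(out, ref, rtol=rtol)

# usage: an "add" op whose compiled side carries small rounding noise
compiled = lambda a, b: [x + y + 1e-9 for x, y in zip(a, b)]
reference = lambda a, b: [x + y for x, y in zip(a, b)]
assert validate(compiled, reference, ([1.0, 2.0], [3.0, 4.0]))
```

The per-dtype/per-op threshold strategy mentioned in the task list would plug into the `rtol`/`atol` choice here.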

@WangJialei-A WangJialei-A added this to the Functional llama2 milestone May 11, 2024
@WangJialei-A WangJialei-A force-pushed the wangjial/benchgc branch 10 times, most recently from a638a37 to deff617 on May 14, 2024 05:43
@kurapov-peter
Contributor

It's not another graph representation.

It is; there's a graph definition and all the ops. I also don't see why we would need one for compilation and validation.

@WangJialei-A
Contributor Author

It's not another graph representation.

It is; there's a graph definition and all the ops. I also don't see why we would need one for compilation and validation.

Hi, @kurapov-peter
The graphs and operator definitions you see all originate from the existing Graph API's JSON format; we have not introduced any new definitions.

To conduct correctness tests, the testing tool needs to understand the entire graph's topology and the operators involved in order to populate appropriate data and set correct thresholds. Since the test cases are in MLIR format, which is quite low-level and makes it hard to discern the topology directly, we need to translate the MLIR format into the Graph API's JSON format so that the testing tool can behave correctly.

Considering that complex MLIR files will be exported from the framework side, and that MLIR files lack readability, BenchGC will provide a feature to convert MLIR format into Graph API JSON format. This will help developers use visualization tools like Netron to understand the MLIR files and see what kind of kernels the framework is actually running.
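To illustrate what such an export could look like, here is a hand-written example loosely modeled on the oneDNN Graph serialized JSON layout (a list of ops with input/output logical tensors, where edges are implied by shared tensor ids). The field names and values below are illustrative assumptions, not the tool's actual output format.

```python
# Hypothetical MLIR-to-JSON export result for a tiny matmul+relu graph.
# Ops reference logical tensors by id; op 0's output (id 2) feeds op 1.
import json

graph = {
    "version": "3.0.0",
    "graph": [
        {
            "id": 0,
            "name": "matmul_0",
            "kind": "MatMul",
            "inputs": [{"id": 0, "dtype": "f32", "shape": [128, 512]},
                       {"id": 1, "dtype": "f32", "shape": [512, 64]}],
            "outputs": [{"id": 2, "dtype": "f32", "shape": [128, 64]}],
        },
        {
            "id": 1,
            "name": "relu_0",
            "kind": "ReLU",
            "inputs": [{"id": 2, "dtype": "f32", "shape": [128, 64]}],
            "outputs": [{"id": 3, "dtype": "f32", "shape": [128, 64]}],
        },
    ],
}

# Walking this structure is what would let the tool choose data fillers
# and thresholds per op kind, and what a viewer could render as a graph.
print(json.dumps(graph, indent=2))
```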

@kurapov-peter
Contributor

Considering that complex MLIR files will be exported from the framework side, and that MLIR files lack readability, BenchGC will provide a feature to convert MLIR format into Graph API JSON format.

Why not use JSON as input in the first place then?

@WangJialei-A
Contributor Author

Considering that complex MLIR files will be exported from the framework side, and that MLIR files lack readability, BenchGC will provide a feature to convert MLIR format into Graph API JSON format.

Why not use JSON as input in the first place then?

@kurapov-peter
Although it is possible to obtain a JSON-formatted graph when calling Graph Compiler through the Graph API, if Graph Compiler integrates with other components in the future, such as Torch MLIR, it would no longer be possible to export the graph in JSON format.

@kurapov-peter
Contributor

@kurapov-peter Although it is possible to obtain a JSON-formatted graph when calling Graph Compiler through the Graph API, if Graph Compiler integrates with other components in the future, such as Torch MLIR, it would no longer be possible to export the graph in JSON format.

You wouldn't be able to use the same flow, since something like torch-mlir would not map to the representation you are introducing. This is not an argument in favor of having a Python graph API.

@WangJialei-A
Contributor Author

@kurapov-peter
I guess I've figured out where our point of disagreement lies. The current goal is only to verify the correctness of a single operator, so you might feel that introducing a Python-expressed Graph API is unnecessary. With a single operator, many things can be simplified. However, BenchGC is not designed for a single operator; single-operator verification is just a special case. Its future goal is to support the verification of complex partitions containing many ops. At that time, the corresponding MLIR test cases will also be very complex, and without the Graph API Json as a bridge, it would be difficult to translate them into PyTorch execution.

Using JSON as the input for test cases is unacceptable. Our library is an MLIR library, and the envisioned goal of BenchGC is to verify any MLIR case based on the OneDNN Graph Dialect. If JSON were used as the input and then translated into MLIR format, it could not express all the possibilities that the MLIR format can represent.

@kurapov-peter
Contributor

At that time, the corresponding MLIR test cases will also be very complex, and without the Graph API Json as a bridge, it would be difficult to translate them into PyTorch execution.

I don't think I follow. You either test PyTorch end-to-end or test a specific optimization/pattern/whatever via MLIR input (or one of the existing representations like onednn graph api or json). Introducing another representation is redundant.

Using JSON as the input for test cases is unacceptable. Our library is an MLIR library, and the envisioned goal of BenchGC is to verify any MLIR case based on the OneDNN Graph Dialect.

This path is specific to oneDNN. If we are to target generic cases, the proposed Python interface should not rely on oneDNN semantics (which it does now). It should also not be part of a benchmarking tool.

@WangJialei-A
Contributor Author

WangJialei-A commented May 28, 2024

At that time, the corresponding MLIR test cases will also be very complex, and without the Graph API Json as a bridge, it would be difficult to translate them into PyTorch execution.

I don't think I follow. You either test PyTorch end-to-end or test a specific optimization/pattern/whatever via MLIR input (or one of the existing representations like onednn graph api or json). Introducing another representation is redundant.

Using JSON as the input for test cases is unacceptable. Our library is an MLIR library, and the envisioned goal of BenchGC is to verify any MLIR case based on the OneDNN Graph Dialect.

This path is specific to oneDNN. If we are to target generic cases, the proposed Python interface should not rely on oneDNN semantics (which it does now). It should also not be part of a benchmarking tool.

@kurapov-peter
I am completely confused by your reply. Can you present your plan for the validation work in #58? That will help us understand where we differ.

@aregm

aregm commented Jul 2, 2024

Why are you adding benchgc into the sources? Why do we need it vendored here rather than as a third-party dependency?

@WangJialei-A
Contributor Author

New implementation at #161.

Development

Successfully merging this pull request may close these issues.

Correctness check for single op