Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add optimization patterns fusing ops into custom ops #1636

Draft
wants to merge 20 commits into
base: main
Choose a base branch
from

Conversation

xadupre
Copy link
Member

@xadupre xadupre commented Jun 19, 2024

No description provided.

Signed-off-by: Xavier Dupre <xadupre@microsoft.com>
Copy link

codecov bot commented Jun 19, 2024

Codecov Report

Attention: Patch coverage is 84.83607% with 111 lines in your changes missing coverage. Please review.

Project coverage is 74.53%. Comparing base (c06e7ab) to head (3b4d700).

Files Patch % Lines
...nxscript/rewriter/custom_ops/llm_rule_sets_cuda.py 80.63% 42 Missing and 25 partials ⚠️
...ipt/rewriter/custom_ops/llm_rule_sets_cuda_test.py 90.10% 29 Missing and 8 partials ⚠️
onnxscript/rewriter/generic_pattern.py 30.00% 5 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1636      +/-   ##
==========================================
- Coverage   74.88%   74.53%   -0.36%     
==========================================
  Files         242      245       +3     
  Lines       25869    26592     +723     
  Branches     4665     4827     +162     
==========================================
+ Hits        19373    19821     +448     
- Misses       5627     5851     +224     
- Partials      869      920      +51     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Signed-off-by: Xavier Dupre <xadupre@microsoft.com>
Signed-off-by: Xavier Dupre <xadupre@microsoft.com>
Signed-off-by: Xavier Dupre <xadupre@microsoft.com>
Signed-off-by: Xavier Dupre <xadupre@microsoft.com>
Signed-off-by: Xavier Dupre <xadupre@microsoft.com>
@xadupre xadupre marked this pull request as ready for review June 21, 2024 09:12
Signed-off-by: Xavier Dupre <xadupre@microsoft.com>
@xadupre xadupre self-assigned this Jun 21, 2024
xadupre and others added 5 commits June 24, 2024 10:04
Co-authored-by: Justin Chu <justinchuby@users.noreply.github.com>
Co-authored-by: Justin Chu <justinchuby@users.noreply.github.com>
Signed-off-by: Xavier Dupre <xadupre@microsoft.com>
Signed-off-by: Xavier Dupre <xadupre@microsoft.com>
Signed-off-by: Xavier Dupre <xadupre@microsoft.com>
@justinchuby
Copy link
Collaborator

Helpful to create a Pr description? Thanks! https://google.github.io/eng-practices/review/developer/cl-descriptions.html

pattern_value.producer() is not None for pattern_value in pattern_node.inputs
)

if has_predecessor_in_pattern:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this condition meant to be an optimization (with no effect in behavior)? Trying to understand

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this fix might be related to an issue Jingyan is running into. Does this fix allow pattern-inputs to have other uses outside of the pattern?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so. I think we should change this slightly, by applying the condition for each pattern_value separately, as mentioned in comment below.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fill free to rewrite the code. If the unit test are passing, it means it is good.

orp.make_rewrite_rule_from_class(TransposeFusedMatMul1, True),
orp.make_rewrite_rule_from_class(TransposeMatMul2, True),
orp.make_rewrite_rule_from_class(TransposeFusedMatMul2, True),
orp.make_rewrite_rule_from_class(FusedMatMulDiv1, generic=True),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trying to understand why you have this generic=True? Is it for debugging?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No it is using the generic pattern matcher (multi output pattern)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the pattern has multiple outputs, the implementation should automatically choose the generic matcher ... user doesn't have to specify it.


if has_predecessor_in_pattern:
for graph_input, pattern_input in zip(graph_node.inputs, pattern_node.inputs):
if len(graph_input.uses()) != len(pattern_input.uses()):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of the has_predecessor_in_pattern check above, I think we should do the check here: add if pattern_input.producer() is not None here ... that will handle the case where some inputs have producer and others don't.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed

pattern_pred = pattern_value.producer()
if pattern_pred is not None and len(graph_value.uses()) != len(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am confused ... isn't this already handled above? Why this check on the number of uses again?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rest of the loop can be executed for the first input if it satisfies the conditions and there is a case where it should not be done.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fill free to find a better solution. The unit test should catch any weird behaviour.

Signed-off-by: Xavier Dupre <xadupre@microsoft.com>
@@ -41,6 +41,7 @@
prog = torch.onnx.export(model, args, dynamo=True) # pylint: disable=no-value-for-parameter
else:
prog = torch.onnx.dynamo_export(model, *args)

Check warning

Code scanning / lintrunner

RUFF/W293 Warning

@@ -41,6 +41,7 @@
prog = torch.onnx.export(model, args, dynamo=True) # pylint: disable=no-value-for-parameter
else:
prog = torch.onnx.dynamo_export(model, *args)

Check warning

Code scanning / lintrunner

EDITORCONFIG-CHECKER/editorconfig Warning

Trailing whitespace
gramalingam added a commit that referenced this pull request Jul 16, 2024
…ttern matcher (#1731)

The multi-output pattern does not support a useful case, where the
pattern's input variables are reused elsewhere.

This PR adds a test-case illustrating the scenario. I was hoping that
some changes in the in-progress PR #1636 by Xavier might fix this. But I
don't think it does. Adding the test-case for now. Need to figure out
how to support this.

---------

Co-authored-by: Justin Chu <justinchuby@users.noreply.github.com>
@justinchuby justinchuby marked this pull request as draft September 17, 2024 00:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Development

Successfully merging this pull request may close these issues.

3 participants