Support ONNX Runtime optimizations in exporters.onnx #807
Conversation
The documentation is not available anymore as the PR was closed or merged.
@@ -30,11 +30,11 @@ jobs:
       - name: Test with unittest
         working-directory: tests
         run: |
-          RUN_SLOW=1 pytest exporters -s -m "not tensorflow_test" --durations=0
+          RUN_SLOW=1 pytest exporters -s -m "not tensorflow_test and run_slow" --durations=0
Do we need to have both `RUN_SLOW` and `run_slow`? I guess that `run_slow` is enough (meaning that we would not mark tests with `@slow`).
I think it is better to keep both, otherwise we would have to pass `-m "not run_slow"` in all the other tests, which is a bit painful.
Having the mark allows running only the slow tests in the slow workflows (no need to rerun the others, since they already run on each commit).
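For context, the division of labour between the two gates can be sketched like this (a hypothetical illustration, not the actual optimum test code; the `slow` helper mimics a transformers-style env-var decorator):

```python
import os
import unittest

import pytest


def slow(test_case):
    """Skip a test unless the RUN_SLOW environment variable is set (env-var gate)."""
    return unittest.skipUnless(os.environ.get("RUN_SLOW"), "test is slow")(test_case)


@slow                   # RUN_SLOW=1 actually enables the test body
@pytest.mark.run_slow   # -m "run_slow" selects it; -m "not run_slow" deselects it
def test_export_large_model():
    # Placeholder body standing in for a real slow export test.
    assert True
```

With both gates, `RUN_SLOW=1 pytest -m "not tensorflow_test and run_slow"` collects only the marked tests, while the fast workflow needs no `-m "not run_slow"` filter: unmarked selection still collects the slow tests, but they are skipped because `RUN_SLOW` is unset.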
@@ -65,6 +65,7 @@ class TextDecoderOnnxConfig(OnnxConfigWithPast):

     PAD_ATTENTION_MASK_TO_PAST = True
     DUMMY_INPUT_GENERATOR_CLASSES = (DummyTextInputGenerator, DummyPastKeyValuesGenerator)
+    DUMMY_PKV_GENERATOR_CLASS = DummyPastKeyValuesGenerator
Why?
Because Bloom uses its own custom past key values generator, and accessing it through `self.DUMMY_INPUT_GENERATOR_CLASSES[1]` in the parent class grabs the wrong input generator for Bloom. WDYT? Is it ok?
Yes, so I see it is in `TextDecoderOnnxConfig`, which is nice. The whole idea with this 3-level hierarchy is to also keep flexibility. This makes sense here IMO.
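A minimal sketch of why the dedicated attribute is more robust than indexing into the tuple (all class bodies here are simplified stand-ins for the optimum ones, kept only to show the lookup):

```python
class DummyTextInputGenerator:
    pass


class DummyPastKeyValuesGenerator:
    pass


class BloomDummyPastKeyValuesGenerator(DummyPastKeyValuesGenerator):
    """Bloom needs its own past-key-values layout."""


class TextDecoderOnnxConfig:
    DUMMY_INPUT_GENERATOR_CLASSES = (DummyTextInputGenerator, DummyPastKeyValuesGenerator)
    DUMMY_PKV_GENERATOR_CLASS = DummyPastKeyValuesGenerator

    def pkv_generator_class(self):
        # Indexing DUMMY_INPUT_GENERATOR_CLASSES[1] breaks as soon as a
        # subclass reorders or extends the tuple; the named attribute
        # always resolves to the intended generator.
        return self.DUMMY_PKV_GENERATOR_CLASS


class BloomOnnxConfig(TextDecoderOnnxConfig):
    # The custom generator is not at index 1 here, so [1] would grab the
    # wrong class; the named attribute still points at the right one.
    DUMMY_INPUT_GENERATOR_CLASSES = (
        DummyTextInputGenerator,
        DummyTextInputGenerator,
        BloomDummyPastKeyValuesGenerator,
    )
    DUMMY_PKV_GENERATOR_CLASS = BloomDummyPastKeyValuesGenerator
```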
Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
All slow tests (both CPU and GPU) pass for this PR. Let me know if it is good to merge or if you'd like me to change anything.
Co-authored-by: regisss <15324346+regisss@users.noreply.github.com>
"--optimize", | ||
type=str, | ||
default=None, | ||
choices=["O1", "O2", "O3", "O4"], |
So no possibility of providing an `ORTConfig`?
I will in a follow-up PR.
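As a runnable sketch of the flag defined in the hunk above (the parser itself is a hypothetical stand-in for the export CLI, only the `--optimize` argument mirrors the diff):

```python
import argparse

# Hypothetical stand-in for the ONNX export command-line parser.
parser = argparse.ArgumentParser(description="ONNX export (sketch)")
parser.add_argument(
    "--optimize",
    type=str,
    default=None,  # no ONNX Runtime optimization unless requested
    choices=["O1", "O2", "O3", "O4"],
    help="ONNX Runtime optimization level to apply after the export.",
)

# Example invocation: request level-2 optimization.
args = parser.parse_args(["--optimize", "O2"])
```

`choices` makes argparse reject any value outside the four predefined levels, which is why a free-form `ORTConfig` path would need a separate mechanism.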
Add a `--ort-optimize Ox` flag to perform ONNX Runtime optimizations through `ORTOptimizer` directly in the export. This is especially needed since ORT optimizations cannot be applied to ONNX models with subgraphs.

Left to do: