
Support ONNX Runtime optimizations in exporters.onnx #807

Merged

Conversation


@fxmarty (Contributor) commented Feb 22, 2023

Add a --ort-optimize Ox flag to perform ONNX Runtime optimizations through ORTOptimizer directly during the export. This is especially needed since ORT optimizations cannot be applied to ONNX models that contain subgraphs.

Left to do:

  • Tests
  • Doc
  • Fix the bug with the gptj/bloom merge; not sure why it was not spotted earlier


HuggingFaceDocBuilderDev commented Feb 22, 2023

The documentation is not available anymore as the PR was closed or merged.

@JingyaHuang JingyaHuang added gpu-test trigger GPU tests and removed gpu-test trigger GPU tests labels Feb 23, 2023
@@ -30,11 +30,11 @@ jobs:
       - name: Test with unittest
         working-directory: tests
         run: |
-          RUN_SLOW=1 pytest exporters -s -m "not tensorflow_test" --durations=0
+          RUN_SLOW=1 pytest exporters -s -m "not tensorflow_test and run_slow" --durations=0
Member:
Do we need to have both RUN_SLOW and run_slow?
I guess that run_slow is enough (meaning that we would not mark tests with @slow)

@fxmarty (Contributor, Author) commented Feb 23, 2023:

I think it is better to keep both; otherwise we would have to pass -m "not run_slow" in all the other tests, which is a bit painful.

Having the marker allows running only the slow tests in the slow workflows (no need to run the others, since they already run on each commit).
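For readers unfamiliar with the setup being discussed: RUN_SLOW is an environment variable read by a skip decorator, while run_slow is a pytest marker selected via -m. A minimal standalone sketch of the environment-variable half (hypothetical helper names, not optimum's actual implementation):

```python
import os
import unittest

def should_run_slow() -> bool:
    # Slow tests run only when RUN_SLOW=1 is exported, as in the workflow above.
    return os.environ.get("RUN_SLOW", "0") == "1"

def slow(test_case):
    # Hypothetical decorator: skip the decorated test unless slow tests are enabled.
    return unittest.skipUnless(should_run_slow(), "slow test; set RUN_SLOW=1 to run")(test_case)

class ExportTests(unittest.TestCase):
    @slow
    def test_large_model_export(self):
        # Placeholder for an expensive export test.
        self.assertTrue(True)
```

Combined with a run_slow pytest marker, the slow workflow can then select only these tests via -m "run_slow", while the per-commit runs skip them without needing any extra -m filter.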

docs/source/exporters/onnx/usage_guides/export_a_model.mdx (outdated; resolved)
optimum/commands/export/onnx.py (resolved)
@@ -65,6 +65,7 @@ class TextDecoderOnnxConfig(OnnxConfigWithPast):

     PAD_ATTENTION_MASK_TO_PAST = True
     DUMMY_INPUT_GENERATOR_CLASSES = (DummyTextInputGenerator, DummyPastKeyValuesGenerator)
+    DUMMY_PKV_GENERATOR_CLASS = DummyPastKeyValuesGenerator
Member:
Why?

@fxmarty (Contributor, Author):

Because Bloom uses its own custom past key values generator, and accessing it through self.DUMMY_INPUT_GENERATOR_CLASSES[1] in the parent class grabs the wrong input generator for Bloom. WDYT, is it OK?

Member:
Yes, and I see it is in TextDecoderOnnxConfig, which is nice. The whole idea of this 3-level hierarchy is also to keep flexibility. This makes sense here IMO.
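The problem and the fix can be condensed into a small standalone sketch (class names mirror the ones above, but the bodies are stand-ins, not optimum's real classes): indexing into the tuple from the parent class ignores a subclass override, while a dedicated class attribute picks it up.

```python
class DummyTextInputGenerator:
    pass

class DummyPastKeyValuesGenerator:
    pass

class BloomDummyPastKeyValuesGenerator(DummyPastKeyValuesGenerator):
    # Stand-in for Bloom's custom past-key-values generator.
    pass

class TextDecoderOnnxConfig:
    DUMMY_INPUT_GENERATOR_CLASSES = (DummyTextInputGenerator, DummyPastKeyValuesGenerator)
    DUMMY_PKV_GENERATOR_CLASS = DummyPastKeyValuesGenerator

    def pkv_generator_class(self):
        # Resolving the named attribute honors subclass overrides, whereas
        # self.DUMMY_INPUT_GENERATOR_CLASSES[1] would still return the base generator.
        return self.DUMMY_PKV_GENERATOR_CLASS

class BloomOnnxConfig(TextDecoderOnnxConfig):
    # Override only the PKV generator; the tuple is left untouched.
    DUMMY_PKV_GENERATOR_CLASS = BloomDummyPastKeyValuesGenerator

print(BloomOnnxConfig().pkv_generator_class().__name__)  # prints: BloomDummyPastKeyValuesGenerator
```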

optimum/onnx/graph_transformations.py (outdated; resolved)
tests/exporters/onnx/test_exporters_onnx_cli.py (outdated; resolved)

fxmarty commented Feb 23, 2023

All slow tests (both CPU and GPU) pass for this PR. Let me know if it is good to merge or if you'd like me to change anything.

Co-authored-by: regisss <15324346+regisss@users.noreply.github.com>
"--optimize",
type=str,
default=None,
choices=["O1", "O2", "O3", "O4"],
Member:
So no possibility of providing an ORTConfig?

@fxmarty (Contributor, Author):
I will in a follow-up PR.
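For reference, the choices= constraint in the snippet above means argparse itself rejects anything outside the four standard levels; a minimal standalone sketch of that validation behavior (not the actual optimum-cli parser):

```python
import argparse

# Standalone sketch mirroring the --optimize option shown in the diff above.
parser = argparse.ArgumentParser(prog="sketch")
parser.add_argument(
    "--optimize",
    type=str,
    default=None,
    choices=["O1", "O2", "O3", "O4"],
    help="ONNX Runtime optimization level to apply during export.",
)

args = parser.parse_args(["--optimize", "O2"])
print(args.optimize)  # prints: O2
```

Passing an unknown level such as O9 makes argparse exit with an error listing the valid choices, which keeps the CLI surface small until an ORTConfig-based path is added.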

optimum/exporters/onnx/convert.py (outdated; resolved)
optimum/onnx/graph_transformations.py (outdated; resolved)
optimum/onnx/graph_transformations.py (outdated; resolved)
fxmarty and others added 3 commits February 24, 2023 10:35
Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
5 participants