Support ONNX Runtime optimizations in exporters.onnx #807
Conversation
The documentation is not available anymore as the PR was closed or merged.
@@ -30,11 +30,11 @@ jobs:
       - name: Test with unittest
         working-directory: tests
         run: |
-          RUN_SLOW=1 pytest exporters -s -m "not tensorflow_test" --durations=0
+          RUN_SLOW=1 pytest exporters -s -m "not tensorflow_test and run_slow" --durations=0
Do we need to have both `RUN_SLOW` and `run_slow`? I guess that `run_slow` is enough (meaning that we would not mark tests with `@slow`).
I think it is better to keep both, otherwise we would have to pass `-m "not run_slow"` in all the other tests, which is a bit painful.
Having the mark allows running only the slow tests in the slow workflows (no need to rerun the others, since they already run on each commit).
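For context, the division of labour between the two gates can be sketched like this (a hypothetical illustration, not the actual optimum test code; the `slow` helper mimics a transformers-style env-var decorator):

```python
import os
import unittest

import pytest


def slow(test_case):
    """Skip a test unless the RUN_SLOW environment variable is set (env-var gate)."""
    return unittest.skipUnless(os.environ.get("RUN_SLOW"), "test is slow")(test_case)


@slow                   # RUN_SLOW=1 actually enables the test body
@pytest.mark.run_slow   # -m "run_slow" selects it; -m "not run_slow" deselects it
def test_export_large_model():
    # Placeholder body standing in for a real slow export test.
    assert True
```

With both gates, `RUN_SLOW=1 pytest -m "not tensorflow_test and run_slow"` collects only the marked tests, while the fast workflow needs no `-m "not run_slow"` filter: unmarked selection still collects the slow tests, but they are skipped because `RUN_SLOW` is unset.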
@@ -65,6 +65,7 @@ class TextDecoderOnnxConfig(OnnxConfigWithPast):

     PAD_ATTENTION_MASK_TO_PAST = True
     DUMMY_INPUT_GENERATOR_CLASSES = (DummyTextInputGenerator, DummyPastKeyValuesGenerator)
+    DUMMY_PKV_GENERATOR_CLASS = DummyPastKeyValuesGenerator
Why?
Because Bloom uses its own custom past key values generator, and accessing it through `self.DUMMY_INPUT_GENERATOR_CLASSES[1]` in the parent class grabs the wrong input generator for Bloom. WDYT? Is it ok?
Yes, so I see it is in `TextDecoderOnnxConfig`, which is nice. The whole idea with this 3-level hierarchy is to also keep flexibility. This makes sense here IMO.
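A minimal sketch of why the dedicated attribute is more robust than indexing into the tuple (all class bodies here are simplified stand-ins for the optimum ones, kept only to show the lookup):

```python
class DummyTextInputGenerator:
    pass


class DummyPastKeyValuesGenerator:
    pass


class BloomDummyPastKeyValuesGenerator(DummyPastKeyValuesGenerator):
    """Bloom needs its own past-key-values layout."""


class TextDecoderOnnxConfig:
    DUMMY_INPUT_GENERATOR_CLASSES = (DummyTextInputGenerator, DummyPastKeyValuesGenerator)
    DUMMY_PKV_GENERATOR_CLASS = DummyPastKeyValuesGenerator

    def pkv_generator_class(self):
        # Indexing DUMMY_INPUT_GENERATOR_CLASSES[1] breaks as soon as a
        # subclass reorders or extends the tuple; the named attribute
        # always resolves to the intended generator.
        return self.DUMMY_PKV_GENERATOR_CLASS


class BloomOnnxConfig(TextDecoderOnnxConfig):
    # The custom generator is not at index 1 here, so [1] would grab the
    # wrong class; the named attribute still points at the right one.
    DUMMY_INPUT_GENERATOR_CLASSES = (
        DummyTextInputGenerator,
        DummyTextInputGenerator,
        BloomDummyPastKeyValuesGenerator,
    )
    DUMMY_PKV_GENERATOR_CLASS = BloomDummyPastKeyValuesGenerator
```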
Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
All slow tests (both CPU and GPU) pass for this PR. Let me know if it is good to merge or if you'd like me to change anything.
Co-authored-by: regisss <15324346+regisss@users.noreply.github.com>
"--optimize", | ||
type=str, | ||
default=None, | ||
choices=["O1", "O2", "O3", "O4"], |
So no possibility of providing an `ORTConfig`?
I will in a follow-up PR.
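As a runnable sketch of the flag defined in the hunk above (the parser itself is a hypothetical stand-in for the export CLI, only the `--optimize` argument mirrors the diff):

```python
import argparse

# Hypothetical stand-in for the ONNX export command-line parser.
parser = argparse.ArgumentParser(description="ONNX export (sketch)")
parser.add_argument(
    "--optimize",
    type=str,
    default=None,  # no ONNX Runtime optimization unless requested
    choices=["O1", "O2", "O3", "O4"],
    help="ONNX Runtime optimization level to apply after the export.",
)

# Example invocation: request level-2 optimization.
args = parser.parse_args(["--optimize", "O2"])
```

`choices` makes argparse reject any value outside the four predefined levels, which is why a free-form `ORTConfig` path would need a separate mechanism.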
Add a `--ort-optimize Ox` flag to perform ONNX Runtime optimizations through `ORTOptimizer` directly in the export. This is especially needed since ORT optimizations cannot be applied to ONNX models with subgraphs.

Left to do: