Support ONNX Runtime optimizations in exporters.onnx #807
Merged: fxmarty merged 24 commits into huggingface:main from fxmarty:support-ort-optimizations-in-exporters on Feb 24, 2023
67e0c75
test support
fxmarty 5dc2b1d
fix decoder merge to work with ORT optimizer
fxmarty 25bd3fe
add tests
fxmarty 4546ad7
fix test
fxmarty 378f3ae
fix dockerfile
fxmarty 286bf09
fix
fxmarty 8d6e805
fix doc
fxmarty 60c5768
fix
fxmarty 3bcde84
properly raise an error at validation
fxmarty d151e9a
raise error
fxmarty 7852d31
fix
fxmarty dd87386
fix some test
fxmarty 62f0958
fix bloom validation
fxmarty ade10e9
Merge branch 'master' into support-ort-optimizations-in-exporters
fxmarty 79d86f3
fix broken initializer deduplication in case of name collision
fxmarty 4fd433b
Update optimum/onnx/graph_transformations.py
fxmarty 986e378
Update tests/exporters/onnx/test_exporters_onnx_cli.py
fxmarty b19612c
remove print
fxmarty 50c0a76
fix on suggestions
fxmarty d7f5b27
fix remaining tests
fxmarty f3ef447
Update docs/source/onnxruntime/usage_guides/optimization.mdx
fxmarty a1da572
Update optimum/onnx/graph_transformations.py
fxmarty 966df67
Update optimum/exporters/onnx/convert.py
fxmarty 7f0006e
Update optimum/onnx/graph_transformations.py
fxmarty
@@ -0,0 +1,89 @@
name: Exporters / Test GPU

on:
  workflow_dispatch:
  schedule:
    - cron: 0 1 */3 * * # at 1am every 3 days
  pull_request:
    types: [labeled]
  # uncomment to enable on PR merge on main branch:
  #push:
  #  branches:
  #    - main

jobs:
  start-runner:
    name: Start self-hosted EC2 runner
    runs-on: ubuntu-latest
    env:
      AWS_REGION: us-east-1
      EC2_AMI_ID: ami-0dc1c26161f869ed1
      EC2_INSTANCE_TYPE: g4dn.xlarge
      EC2_SUBNET_ID: subnet-859322b4,subnet-b7533b96,subnet-47cfad21,subnet-a396b2ad,subnet-06576a4b,subnet-df0f6180
      EC2_SECURITY_GROUP: sg-0bb210cd3ec725a13
      EC2_IAM_ROLE: optimum-ec2-github-actions-role
    outputs:
      label: ${{ steps.start-ec2-runner.outputs.label }}
      ec2-instance-id: ${{ steps.start-ec2-runner.outputs.ec2-instance-id }}
    steps:
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v1
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: ${{ env.AWS_REGION }}
      - name: Start EC2 runner
        id: start-ec2-runner
        uses: philschmid/philschmid-ec2-github-runner@main
        with:
          mode: start
          github-token: ${{ secrets.GH_PERSONAL_ACCESS_TOKEN }}
          ec2-image-id: ${{ env.EC2_AMI_ID }}
          ec2-instance-type: ${{ env.EC2_INSTANCE_TYPE }}
          subnet-id: ${{ env.EC2_SUBNET_ID }}
          security-group-id: ${{ env.EC2_SECURITY_GROUP }}
          iam-role-name: ${{ env.EC2_IAM_ROLE }}
          aws-resource-tags: > # optional, requires additional permissions
            [
              {"Key": "Name", "Value": "ec2-optimum-github-runner"},
              {"Key": "GitHubRepository", "Value": "${{ github.repository }}"}
            ]
  do-the-job:
    name: Setup
    needs: start-runner # required to start the main job when the runner is ready
    runs-on: ${{ needs.start-runner.outputs.label }} # run the job on the newly created runner
    env:
      AWS_REGION: us-east-1
    steps:
      - name: Checkout
        uses: actions/checkout@v2
      - name: Build image
        run: |
          docker build -f tests/exporters/Dockerfile_exporters_gpu -t exporters-gpu .
      - name: Test with unittest within docker container
        run: |
          docker run --rm --gpus all -v $(pwd)/hf_cache:/root/.cache/huggingface --workdir=/workspace/optimum/tests exporters-gpu:latest

  stop-runner:
    name: Stop self-hosted EC2 runner
    needs:
      - start-runner # required to get output from the start-runner job
      - do-the-job # required to wait when the main job is done
    runs-on: ubuntu-latest
    env:
      AWS_REGION: us-east-1
    if: ${{ always() }} # required to stop the runner even if the error happened in the previous jobs
    steps:
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v1
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: ${{ env.AWS_REGION }}
      - name: Stop EC2 runner
        uses: philschmid/philschmid-ec2-github-runner@main
        with:
          mode: stop
          github-token: ${{ secrets.GH_PERSONAL_ACCESS_TOKEN }}
          label: ${{ needs.start-runner.outputs.label }}
          ec2-instance-id: ${{ needs.start-runner.outputs.ec2-instance-id }}
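
The schedule comment in the workflow above reads "at 1am every 3 days". As a rough illustration of what the `*/3` day-of-month field selects, here is a minimal Python sketch. It assumes standard cron step semantics (the step starts from day 1 of each month, so days 1, 4, 7, ..., 31 match); the helper names `matches_day_of_month_step` and `run_dates` are hypothetical and not part of the PR.

```python
from datetime import date, timedelta
from typing import List


def matches_day_of_month_step(day: date, step: int = 3) -> bool:
    """Return True when `day` matches a `*/step` day-of-month cron field.

    Assumption: the step counts from day 1 of each month (1, 1 + step, ...).
    """
    return (day.day - 1) % step == 0


def run_dates(start: date, days: int, step: int = 3) -> List[date]:
    """List the dates in [start, start + days) on which the schedule would fire."""
    candidates = (start + timedelta(days=n) for n in range(days))
    return [d for d in candidates if matches_day_of_month_step(d, step)]


if __name__ == "__main__":
    # Look at the end of January 2023 to see what happens at a month boundary.
    for d in run_dates(date(2023, 1, 28), days=6):
        print(d.isoformat())
```

Under that assumption, the six days starting on 2023-01-28 contain runs on January 28, January 31 and February 1, so the interval is three days within a month but can be shorter across a month boundary.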
@@ -100,6 +100,19 @@ def parse_args_onnx(parser):
            " and decoder-with-past models into a single ONNX model file to reduce memory usage."
        ),
    )
    optional_group.add_argument(
        "--optimize",
        type=str,
        default=None,
        choices=["O1", "O2", "O3", "O4"],
        help=(
            "Allows to run ONNX Runtime optimizations directly during the export. Some of these optimizations are specific to ONNX Runtime, and the resulting ONNX will not be usable with other runtime as OpenVINO or TensorRT. Possible options:\n"
            " - O1: Basic general optimizations\n"
            " - O2: Basic and extended general optimizations, transformers-specific fusions\n"
            " - O3: Same as O2 with GELU approximation\n"
            " - O4: Same as O3 with mixed precision (fp16, GPU-only, requires `--device cuda`)"
        ),
    )

    input_group = parser.add_argument_group(
        "Input shapes (if necessary, this allows to override the shapes of the input given to the ONNX exporter, that requires an example input.)"

Review comment on the `choices=["O1", "O2", "O3", "O4"]` line:
So no possibility of providing an [...]

Reply:
I will in a next PR.

On the O4 help line: fxmarty marked this conversation as resolved.
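
To make the effect of the new `--optimize` flag more concrete, here is a minimal, self-contained sketch of how such an argument and the "O4 requires `--device cuda`" rule from the help text could be wired together. It is an illustration only, not the PR's implementation: the toy parser, the `validate_args` helper, the `--device` default of `cpu` and the error message are all assumptions.

```python
import argparse
from typing import List, Optional

# Level descriptions paraphrased from the help text in the diff.
OPTIMIZATION_LEVELS = {
    "O1": "basic general optimizations",
    "O2": "basic and extended general optimizations, transformers-specific fusions",
    "O3": "same as O2 with GELU approximation",
    "O4": "same as O3 with mixed precision (fp16, GPU-only)",
}


def build_parser() -> argparse.ArgumentParser:
    """Build a toy parser exposing only the two flags relevant here."""
    parser = argparse.ArgumentParser(description="Toy export CLI (illustration only)")
    parser.add_argument(
        "--optimize",
        type=str,
        default=None,  # no ONNX Runtime optimization unless requested
        choices=list(OPTIMIZATION_LEVELS),
    )
    # Assumption: the device defaults to "cpu" in this sketch.
    parser.add_argument("--device", type=str, default="cpu")
    return parser


def validate_args(args: argparse.Namespace) -> None:
    """Hypothetical check: reject O4 unless the export targets CUDA."""
    if args.optimize == "O4" and args.device != "cuda":
        raise ValueError("--optimize O4 requires --device cuda")


def parse(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse and validate the arguments in one step."""
    args = build_parser().parse_args(argv)
    validate_args(args)
    return args


if __name__ == "__main__":
    print(parse(["--optimize", "O2"]))
```

Using `choices` lets `argparse` reject unknown levels before any export work starts, while the cross-argument rule for O4 needs a separate check because `argparse` validates each option in isolation.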
Review comment:
Do we need to have both `RUN_SLOW` and `run_slow`? I guess that `run_slow` is enough (meaning that we would not mark tests with `@slow`).

Reply:
I think it is better to keep both, otherwise we would have to pass `-m "not run_slow"` in the other tests, which is a bit painful. Having the mark allows running only the slow tests in the slow workflows (no need to run the others, since they are already run on each commit).
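
The environment-variable half of this discussion can be sketched as follows. This is a minimal illustration under assumptions, not the repository's actual helper: `parse_flag_from_env`, the accepted truthy values and the `ExampleTests` class are hypothetical, and only the names `RUN_SLOW` and `@slow` come from the thread.

```python
import os
import unittest


def parse_flag_from_env(key: str, default: bool = False) -> bool:
    """Read a boolean flag such as RUN_SLOW from the environment."""
    value = os.environ.get(key)
    if value is None:
        return default
    # Assumption: these spellings count as "enabled".
    return value.strip().lower() in {"1", "true", "yes", "on"}


def slow(test_case):
    """Skip the decorated test unless RUN_SLOW is set to a truthy value.

    The flag is read when the decorator is applied, i.e. at import time.
    """
    return unittest.skipUnless(parse_flag_from_env("RUN_SLOW"), "test is slow")(test_case)


class ExampleTests(unittest.TestCase):
    def test_fast(self):
        # Always runs, including on every commit.
        self.assertEqual(1 + 1, 2)

    @slow
    def test_slow(self):
        # Runs only when RUN_SLOW is enabled; reported as skipped otherwise.
        self.assertTrue(True)


if __name__ == "__main__":
    unittest.main()
```

With a decorator like this, the regular test runs skip slow tests by default without any extra `-m` filter. The marker half would additionally tag the same tests with `@pytest.mark.run_slow`, so that a slow workflow can select only them with `pytest -m run_slow`.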