Qualcomm AI Engine Direct - add cli tool for QNN artifacts #4731
Merged
# CLI Tool for Compiling / Deploying Pre-Built QNN Artifacts

An easy-to-use tool for generating / executing a .pte program from pre-built model libraries / context binaries produced by Qualcomm AI Engine Direct. The tool is verified with this [host environment](../../../../docs/source/build-run-qualcomm-ai-engine-direct-backend.md#host-os).

## Description

This tool is aimed at users who want to leverage the ExecuTorch runtime framework with their existing artifacts generated by QNN, making it possible to produce a .pte program in a few steps.<br/>
If you are interested in well-known applications, [Qualcomm AI HUB](https://aihub.qualcomm.com/) is a great resource that provides tons of optimized, state-of-the-art models ready for deployment. All of them can be downloaded in model library or context binary format.

* Model libraries (.so) generated by `qnn-model-lib-generator` / AI HUB, or context binaries (.bin) generated by `qnn-context-binary-generator` / AI HUB, can be fed to the tool directly:
  - To produce a .pte program:
    ```bash
    $ python export.py compile
    ```
  - To perform inference with the generated .pte program:
    ```bash
    $ python export.py execute
    ```

### Dependencies

* Register for Qualcomm AI HUB.
* Download the QNN SDK that your favorite model was compiled with via this [link](https://www.qualcomm.com/developer/software/qualcomm-ai-engine-direct-sdk). The link automatically downloads the latest version at this moment (users should be able to specify a version soon; please refer to [this](../../../../docs/source/build-run-qualcomm-ai-engine-direct-backend.md#software) for earlier releases). A sketch of the expected environment setup follows this list.
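
The authoritative setup steps live in the host-environment document linked above; the following is only a minimal sketch, and every path is a placeholder to adjust:

```bash
# placeholder paths -- adjust to your local layout
export QNN_SDK_ROOT=/path/to/qnn-sdk          # root of the unpacked QNN SDK
export EXECUTORCH_ROOT=/path/to/executorch    # ExecuTorch checkout, used by the commands below
# make the QNN x86 host libraries visible for the compile step
export LD_LIBRARY_PATH=$QNN_SDK_ROOT/lib/x86_64-linux-clang:$LD_LIBRARY_PATH
```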

### Target Model

* Consider using a [virtual environment](https://app.aihub.qualcomm.com/docs/hub/getting_started.html) for the AI HUB scripts to prevent package conflicts with ExecuTorch (a setup sketch follows this list). Please finish the [installation section](https://app.aihub.qualcomm.com/docs/hub/getting_started.html#installation) before proceeding with the following steps.
* Taking [QuickSRNetLarge-Quantized](https://aihub.qualcomm.com/models/quicksrnetlarge_quantized?searchTerm=quantized) as an example, please [install](https://huggingface.co/qualcomm/QuickSRNetLarge-Quantized#installation) the package as instructed.
* Create a workspace and export the pre-built model library:
  ```bash
  mkdir $MY_WS && cd $MY_WS
  # target chipset is `SM8650`
  python -m qai_hub_models.models.quicksrnetlarge_quantized.export --target-runtime qnn --chipset qualcomm-snapdragon-8gen3
  ```
* The compiled model library will be located at `$MY_WS/build/quicksrnetlarge_quantized/quicksrnetlarge_quantized.so`. This model library maps to the artifacts generated by the SDK tools mentioned in the `Integration workflow` section of the [Qualcomm AI Engine Direct document](https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-50/overview.html).
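
A minimal virtual-environment sketch for the steps above; the `qai-hub configure` call follows the AI HUB getting-started guide, and the token value is a placeholder:

```bash
# hypothetical setup -- model-specific package extras may also be required
python -m venv aihub-env && source aihub-env/bin/activate
pip install qai-hub-models                    # AI HUB models package
qai-hub configure --api_token $MY_API_TOKEN   # token from your AI HUB account settings
# after running the export above, verify the artifact exists
ls $MY_WS/build/quicksrnetlarge_quantized/quicksrnetlarge_quantized.so
```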

### Compiling Program

* Compile the .pte program:
  ```bash
  # `pip install pydot` if the package is missing
  # Note that device serial & hostname might not be required if the given artifact is in context binary format
  PYTHONPATH=$EXECUTORCH_ROOT/.. python $EXECUTORCH_ROOT/examples/qualcomm/qaihub_scripts/utils/export.py compile -a $MY_WS/build/quicksrnetlarge_quantized/quicksrnetlarge_quantized.so -m SM8650 -s $DEVICE_SERIAL -b $EXECUTORCH_ROOT/build-android
  ```
* Artifacts for checking IO information (see the snippet after this list):
  - `output_pte/quicksrnetlarge_quantized/quicksrnetlarge_quantized.json`
  - `output_pte/quicksrnetlarge_quantized/quicksrnetlarge_quantized.svg`
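
If you prefer the terminal over opening the .svg, the JSON can be pretty-printed; this is a generic sketch, since the exact schema depends on what export.py emits:

```python
import json

# pretty-print the IO information produced by the compile step
with open('output_pte/quicksrnetlarge_quantized/quicksrnetlarge_quantized.json') as f:
    io_info = json.load(f)
# look for the input / output tensor names, shapes & dtypes
print(json.dumps(io_info, indent=2))
```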

### Executing Program

* Prepare the test image:
  ```bash
  cd $MY_WS
  wget https://user-images.githubusercontent.com/12981474/40157448-eff91f06-5953-11e8-9a37-f6b5693fa03f.png -O baboon.png
  ```
  Execute the following Python script to generate the input data:
  ```python
  import torch
  import torchvision.transforms as transforms
  from PIL import Image

  img = Image.open('baboon.png').resize((128, 128))
  transform = transforms.Compose([transforms.PILToTensor()])
  # convert (C, H, W) to (N, H, W, C)
  # IO tensor info can be checked with quicksrnetlarge_quantized.json / .svg
  img = transform(img).permute(1, 2, 0).unsqueeze(0)
  torch.save(img, 'baboon.pt')
  ```
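  To sanity-check the saved input before running on device (the expected shape follows the 128x128 NHWC layout prepared above):
  ```python
  import torch

  img = torch.load('baboon.pt')
  assert img.shape == (1, 128, 128, 3)  # NHWC, as prepared above
  assert img.dtype == torch.uint8       # PILToTensor keeps 8-bit pixel values
  ```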
* Execute the .pte program:
  ```bash
  PYTHONPATH=$EXECUTORCH_ROOT/.. python $EXECUTORCH_ROOT/examples/qualcomm/qaihub_scripts/utils/export.py execute -p output_pte/quicksrnetlarge_quantized -i baboon.pt -s $DEVICE_SERIAL -b $EXECUTORCH_ROOT/build-android
  ```
* Post-process the generated data:
  ```bash
  cd output_data
  ```
  Execute the following Python script to generate the output image:
  ```python
  import io
  import torch
  import torchvision.transforms as transforms

  # IO tensor info can be checked with quicksrnetlarge_quantized.json / .svg
  # generally the input / output tensors share the same layout, e.g. either NHWC or NCHW,
  # but this might not hold under different converter configurations;
  # learn more from the converter tool section of the Qualcomm AI Engine Direct documentation:
  # https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-50/tools.html#model-conversion
  with open('output__142.pt', 'rb') as f:
      buffer = io.BytesIO(f.read())
  img = torch.load(buffer, weights_only=False)
  transform = transforms.Compose([transforms.ToPILImage()])
  img_pil = transform(img.squeeze(0))
  img_pil.save('baboon_upscaled.png')
  ```
  You can check the upscaled result now!
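  For a quick check without an image viewer (the 4x upscale factor is an assumption based on the model card):
  ```python
  from PIL import Image

  # if the model performs 4x super-resolution, 128x128 should become 512x512
  print(Image.open('baboon_upscaled.png').size)
  ```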

## Help

Please check the help messages for more information:
```bash
PYTHONPATH=$EXECUTORCH_ROOT/.. python $EXECUTORCH_ROOT/examples/qualcomm/qaihub_scripts/utils/export.py -h
PYTHONPATH=$EXECUTORCH_ROOT/.. python $EXECUTORCH_ROOT/examples/qualcomm/qaihub_scripts/utils/export.py compile -h
PYTHONPATH=$EXECUTORCH_ROOT/.. python $EXECUTORCH_ROOT/examples/qualcomm/qaihub_scripts/utils/export.py execute -h
```

**Q:** Is it generic for all models from AI HUB?
**A:** Yes, artifacts from AI HUB related to QNN are delivered in `.so` format. Only large generative AI models are shipped with context binaries. Both of them can be transformed into a `.pte` program with this tool.