Qualcomm AI Engine Direct - add CLI tool for QNN artifacts (#4731)
Summary: - CLI tool for deploying precompiled model libraries / context binaries onto the ExecuTorch runtime - refactor & minor fixes. Resolved #4731
# CLI Tool for Compile / Deploy Pre-Built QNN Artifacts

An easy-to-use tool for generating / executing .pte programs from pre-built model libraries / context binaries produced by Qualcomm AI Engine Direct. The tool is verified with this [host environment](../../../../docs/source/build-run-qualcomm-ai-engine-direct-backend.md#host-os).

## Description

This tool is aimed at users who want to leverage the ExecuTorch runtime framework with their existing artifacts generated by QNN. It lets them produce a .pte program in just a few steps.<br/>
If you are interested in well-known applications, [Qualcomm AI HUB](https://aihub.qualcomm.com/) is a great resource that provides many optimized, state-of-the-art models ready for deployment. All of them can be downloaded in model library or context binary format.

* Model libraries (.so) generated by `qnn-model-lib-generator` / AI HUB, or context binaries (.bin) generated by `qnn-context-binary-generator` / AI HUB, can be fed to the tool directly:
  - To produce a .pte program:
    ```bash
    $ python export.py compile
    ```
  - To perform inference with the generated .pte program:
    ```bash
    $ python export.py execute
    ```

### Dependencies

* Register for Qualcomm AI HUB.
* Download the QNN SDK that your favorite model was compiled with via this [link](https://www.qualcomm.com/developer/software/qualcomm-ai-engine-direct-sdk). The link automatically downloads the latest version at this moment (users should be able to specify a version soon; please refer to [this guide](../../../../docs/source/build-run-qualcomm-ai-engine-direct-backend.md#software) for earlier releases). A minimal environment setup sketch follows this list.

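The following is only a sketch: it assumes the SDK archive has been extracted to a local folder of your choice, and uses `QNN_SDK_ROOT` / `LD_LIBRARY_PATH` as in the backend build instructions linked above. Adjust the placeholder path to match your download:
```bash
# Placeholder path -- point it at your extracted QNN SDK
export QNN_SDK_ROOT=/path/to/qnn-sdk
# Make the host libraries visible to the tools
export LD_LIBRARY_PATH=$QNN_SDK_ROOT/lib/x86_64-linux-clang:$LD_LIBRARY_PATH
```
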
### Target Model

* Consider using a [virtual environment](https://app.aihub.qualcomm.com/docs/hub/getting_started.html) for the AI HUB scripts to prevent package conflicts with ExecuTorch. Please finish the [installation section](https://app.aihub.qualcomm.com/docs/hub/getting_started.html#installation) before proceeding with the following steps.
* Taking [QuickSRNetLarge-Quantized](https://aihub.qualcomm.com/models/quicksrnetlarge_quantized?searchTerm=quantized) as an example, please [install](https://huggingface.co/qualcomm/QuickSRNetLarge-Quantized#installation) the package as instructed.
* Create a workspace and export the pre-built model library:
  ```bash
  mkdir $MY_WS && cd $MY_WS
  # target chipset is `SM8650`
  python -m qai_hub_models.models.quicksrnetlarge_quantized.export --target-runtime qnn --chipset qualcomm-snapdragon-8gen3
  ```
* The compiled model library will be located under `$MY_WS/build/quicksrnetlarge_quantized/quicksrnetlarge_quantized.so` (see the quick check below). This model library maps to the artifacts generated by the SDK tools mentioned in the `Integration workflow` section of the [Qualcomm AI Engine Direct documentation](https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-50/overview.html).

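As an optional sanity check (a sketch only; the path assumes the default output location of the export command above), confirm that the model library was produced and is a shared object:
```bash
ls -lh $MY_WS/build/quicksrnetlarge_quantized/quicksrnetlarge_quantized.so
# Should report an ELF shared object (target architecture depends on the export settings)
file $MY_WS/build/quicksrnetlarge_quantized/quicksrnetlarge_quantized.so
```
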
### Compiling Program

* Compile the .pte program:
  ```bash
  # `pip install pydot` if the package is missing
  # Note that the device serial & hostname might not be required if the given artifact is in context binary format
  PYTHONPATH=$EXECUTORCH_ROOT/.. python $EXECUTORCH_ROOT/examples/qualcomm/qaihub_scripts/utils/export.py compile -a $MY_WS/build/quicksrnetlarge_quantized/quicksrnetlarge_quantized.so -m SM8650 -s $DEVICE_SERIAL -b $EXECUTORCH_ROOT/build-android
  ```
* Artifacts for checking IO information (a quick inspection sketch follows this list):
  - `output_pte/quicksrnetlarge_quantized/quicksrnetlarge_quantized.json`
  - `output_pte/quicksrnetlarge_quantized/quicksrnetlarge_quantized.svg`

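If you just want a quick look at the IO information, the following minimal sketch pretty-prints the generated JSON; it only assumes the file is plain JSON, and the exact schema is whatever the compile step emitted:
```python
import json

# Quick inspection of the generated IO description
with open('output_pte/quicksrnetlarge_quantized/quicksrnetlarge_quantized.json') as f:
    io_info = json.load(f)

# Dump the whole structure; look for tensor names, shapes, and layouts
print(json.dumps(io_info, indent=2))
```
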
### Executing Program

* Prepare a test image:
  ```bash
  cd $MY_WS
  wget https://user-images.githubusercontent.com/12981474/40157448-eff91f06-5953-11e8-9a37-f6b5693fa03f.png -O baboon.png
  ```
  Execute the following Python script to generate the input data (an optional sanity check follows the script):
  ```python
  import torch
  import torchvision.transforms as transforms
  from PIL import Image
  img = Image.open('baboon.png').resize((128, 128))
  transform = transforms.Compose([transforms.PILToTensor()])
  # convert (C, H, W) to (N, H, W, C)
  # IO tensor info can be checked with quicksrnetlarge_quantized.json / .svg
  img = transform(img).permute(1, 2, 0).unsqueeze(0)
  torch.save(img, 'baboon.pt')
  ```
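
  As an optional sanity check before running on device, you can verify the saved tensor; this is only a sketch, and the expected shape follows from the 128x128 resize above (the channel count may differ if the source PNG carries an alpha channel):
  ```python
  import torch

  t = torch.load('baboon.pt', weights_only=False)
  # Expected for an RGB input: torch.Size([1, 128, 128, 3]) torch.uint8
  print(t.shape, t.dtype)
  ```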
* Execute the .pte program:
  ```bash
  PYTHONPATH=$EXECUTORCH_ROOT/.. python $EXECUTORCH_ROOT/examples/qualcomm/qaihub_scripts/utils/export.py execute -p output_pte/quicksrnetlarge_quantized -i baboon.pt -s $DEVICE_SERIAL -b $EXECUTORCH_ROOT/build-android
  ```
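
  After execution, the results are collected under `output_data`; the following is just a quick sketch to see what was produced (exact file names depend on the graph's output tensors):
  ```bash
  ls -lh output_data
  ```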
* Post-process the generated data:
  ```bash
  cd output_data
  ```
  Execute the following Python script to generate the output image:
  ```python
  import io
  import torch
  import torchvision.transforms as transforms
  # IO tensor info can be checked with quicksrnetlarge_quantized.json / .svg
  # generally the input / output tensors share the same layout: e.g. either NHWC or NCHW
  # this might not be true under different converter configurations
  # learn more with the converter tool from the Qualcomm AI Engine Direct documentation
  # https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-50/tools.html#model-conversion
  with open('output__142.pt', 'rb') as f:
      buffer = io.BytesIO(f.read())
  img = torch.load(buffer, weights_only=False)
  transform = transforms.Compose([transforms.ToPILImage()])
  img_pil = transform(img.squeeze(0))
  img_pil.save('baboon_upscaled.png')
  ```
  You can check the upscaled result now!

## Help

Please check the help messages for more information:
```bash
PYTHONPATH=$EXECUTORCH_ROOT/.. python $EXECUTORCH_ROOT/examples/qualcomm/qaihub_scripts/utils/export.py -h
PYTHONPATH=$EXECUTORCH_ROOT/.. python $EXECUTORCH_ROOT/examples/qualcomm/qaihub_scripts/utils/export.py compile -h
PYTHONPATH=$EXECUTORCH_ROOT/.. python $EXECUTORCH_ROOT/examples/qualcomm/qaihub_scripts/utils/export.py execute -h
```