Make it clear that users need to wait for the coming release if they want to use the Nuget package (microsoft#326)
HectorSVC authored Nov 8, 2023
1 parent b6dd747 commit b2be687
Showing 2 changed files with 20 additions and 14 deletions.
20 changes: 11 additions & 9 deletions c_cxx/QNN_EP/mobilenetv2_classification/README.md
@@ -3,12 +3,12 @@
 - The sample uses the QNN EP to:
   a. run the float32 model on the QNN CPU backend.
   b. run the QDQ model on the HTP backend with qnn_context_cache_enable=1, and generate the ONNX model which has the QNN context binary embedded.
      Model inputs & outputs will be float32
   c. run the QNN context binary model generated from ONNX Runtime (previous step) on the HTP backend, to improve the model initialization time and reduce memory overhead (see the C++ sketch below).
   d. run the QNN context binary model generated from the QNN tool chain on the HTP backend, to support models generated from the native QNN tool chain.
      Model inputs & outputs will be quantized INT8, with kitten_input_nhwc.raw as input, per the qnn-onnx-converter.exe options used
      E.g. qnn-onnx-converter.exe --input_dtype uint8 --input_layout NHWC
-     See QNN doc - docs/QNN/general/tools.html#qnn-onnx-converter
+     See QNN doc - [docs/QNN/general/tools.html#qnn-onnx-converter](https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-50/tools.html#qnn-onnx-converter)
 - The sample downloads the mobilenetv2 model from the ONNX model zoo and uses mobilenetv2_helper.py to quantize the float32 model to a QDQ model, which is required for the HTP backend
 - The sample is targeted to run on a QC ARM64 device.
 - There are 2 ways to improve the session creation time by using the QNN context binary:
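To make steps b and c above concrete, here is a minimal C++ sketch of generating the context-binary model with the QNN EP. This is illustrative only, not the sample's actual source: it assumes an ORT build with QNN support, and the provider option names (`backend_path`, `qnn_context_cache_enable`) follow this README and may be exposed differently in other ORT versions.

```cpp
// Minimal sketch only - not the sample's real code. Assumes the ORT C++ API
// from a QNN-enabled build; option names follow this README and may vary by
// ORT version.
#include <onnxruntime_cxx_api.h>
#include <string>
#include <unordered_map>

int main() {
  Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "qnn_ep_sample");
  Ort::SessionOptions session_options;

  std::unordered_map<std::string, std::string> qnn_options;
  // "QnnCpu.dll" selects the CPU backend used for the float32 model (step a);
  // "QnnHtp.dll" selects the HTP backend used for the QDQ model (step b).
  qnn_options["backend_path"] = "QnnHtp.dll";
  // Step b: embed the compiled QNN context binary into an ONNX model so that
  // later sessions (step c) can skip graph compilation.
  qnn_options["qnn_context_cache_enable"] = "1";
  session_options.AppendExecutionProvider("QNN", qnn_options);

  // Creating the session produces mobilenetv2-12_quant_shape.onnx_qnn_ctx.onnx
  // alongside the input model.
  Ort::Session session(env, L"mobilenetv2-12_quant_shape.onnx", session_options);
  return 0;
}
```

Re-opening the generated mobilenetv2-12_quant_shape.onnx_qnn_ctx.onnx with the same `backend_path` (step c) then skips QNN graph compilation, which is where the faster session creation comes from.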
@@ -46,13 +46,15 @@
 - Visual Studio 2022
 - Python (needed to quantize the model)
 - Qualcomm AI Engine Direct SDK (QNN SDK) from https://qpm.qualcomm.com/main/tools/details/qualcomm_ai_engine_direct
-  - Last known working QNN version: 2.14.1.230828
+  - Last known working QNN versions (when building ORT from source): 2.14.1, 2.15.0, 2.15.1, 2.15.3
 - OnnxRuntime ARM build with QNN support, such as ONNX Runtime (ORT) Microsoft.ML.OnnxRuntime.QNN 1.17+
   - Download from https://www.nuget.org/packages/Microsoft.ML.OnnxRuntime.QNN and unzip
+  - Please wait for the ONNX Runtime 1.17+ release if you want to use the ORT Nuget package. This example requires a recent change in the main branch ([PR 17757](https://github.com/microsoft/onnxruntime/pull/17757)).
   - The ORT drop DOES NOT INCLUDE QNN, so the QNN binaries must be copied from the QC SDK, e.g.:
-    - robocopy C:\Qualcomm\AIStack\QNN\2.14.1.230828\lib\aarch64-windows-msvc %USERPROFILE%\Downloads\microsoft.ml.onnxruntime.qnn.1.17.0\runtimes\win-arm64\native
-    - copy C:\Qualcomm\AIStack\QNN\2.14.1.230828\lib\hexagon-v68\unsigned\libQnnHtpV68Skel.so %USERPROFILE%\Downloads\microsoft.ml.onnxruntime.qnn.1.17.0\runtimes\win-arm64\native
-    - copy C:\Qualcomm\AIStack\QNN\2.14.1.230828\lib\hexagon-v73\unsigned\libQnnHtpV73Skel.so %USERPROFILE%\Downloads\microsoft.ml.onnxruntime.qnn.1.17.0\runtimes\win-arm64\native
+    - robocopy C:\Qualcomm\AIStack\QNN\2.xx.x\lib\aarch64-windows-msvc %USERPROFILE%\Downloads\microsoft.ml.onnxruntime.qnn.1.17.0\runtimes\win-arm64\native
+    - copy C:\Qualcomm\AIStack\QNN\2.xx.x\lib\hexagon-v68\unsigned\libQnnHtpV68Skel.so %USERPROFILE%\Downloads\microsoft.ml.onnxruntime.qnn.1.17.0\runtimes\win-arm64\native
+    - copy C:\Qualcomm\AIStack\QNN\2.xx.x\lib\hexagon-v73\unsigned\libQnnHtpV73Skel.so %USERPROFILE%\Downloads\microsoft.ml.onnxruntime.qnn.1.17.0\runtimes\win-arm64\native
+  - Note: QNN 2.14.1, 2.15.1, and 2.15.3 are preview releases. The v73 libraries are available in the preview releases only; please skip them if you don't have them.
 - (OR) compiled from onnxruntime source with QNN support - https://onnxruntime.ai/docs/build/eps.html#qnn

## How to run the application
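The steps under this heading drive everything through run_qnn_ep_sample.bat, whose changes follow below. As a hedged aside on what qnn_ep_sample.exe presumably does with a *.raw file at the API level: read the raw tensor bytes and pass them to Session::Run. In this sketch the tensor names ("input", "output") and the 1x3x224x224 float32 shape are assumptions based on the mobilenetv2-12 model from the ONNX model zoo, not code taken from the sample.

```cpp
// Hedged sketch of running a *.raw input through an existing session.
// The tensor names and shape are assumptions based on the mobilenetv2-12
// model zoo model, not the sample's actual code.
#include <onnxruntime_cxx_api.h>
#include <cstdint>
#include <fstream>
#include <vector>

void RunRawInput(Ort::Session& session, const char* raw_path) {
  // kitten_input.raw is assumed to hold 1x3x224x224 float32 values, back to back.
  std::vector<int64_t> shape{1, 3, 224, 224};
  std::vector<float> input(1 * 3 * 224 * 224);
  std::ifstream file(raw_path, std::ios::binary);
  file.read(reinterpret_cast<char*>(input.data()),
            input.size() * sizeof(float));

  Ort::MemoryInfo mem_info =
      Ort::MemoryInfo::CreateCpu(OrtArenaAllocator, OrtMemTypeDefault);
  Ort::Value tensor = Ort::Value::CreateTensor<float>(
      mem_info, input.data(), input.size(), shape.data(), shape.size());

  const char* input_names[] = {"input"};    // assumed graph input name
  const char* output_names[] = {"output"};  // assumed graph output name
  auto outputs = session.Run(Ort::RunOptions{nullptr}, input_names, &tensor, 1,
                             output_names, 1);
  // outputs[0] holds the class scores; a classifier would take the argmax here.
}
```

The float32 interface shown here matches the README note that model inputs & outputs stay float32 for the ORT-generated context-binary flow.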
14 changes: 9 additions & 5 deletions c_cxx/QNN_EP/mobilenetv2_classification/run_qnn_ep_sample.bat
@@ -98,10 +98,14 @@ copy /y %ORT_BIN%\qnncpu.dll .
 copy /y %ORT_BIN%\QnnHtp.dll .
 copy /y %ORT_BIN%\QnnHtpPrepare.dll .
 copy /y %ORT_BIN%\QnnHtpV68Stub.dll .
-copy /y %ORT_BIN%\QnnHtpV73Stub.dll
+IF EXIST %ORT_BIN%\QnnHtpV73Stub.dll (
+  copy /y %ORT_BIN%\QnnHtpV73Stub.dll .
+)
 copy /y %ORT_BIN%\QnnSystem.dll .
 copy /y %ORT_BIN%\libQnnHtpV68Skel.so .
-copy /y %ORT_BIN%\libQnnHtpV73Skel.so
+IF EXIST %ORT_BIN%\libQnnHtpV73Skel.so (
+  copy /y %ORT_BIN%\libQnnHtpV73Skel.so .
+)
 copy /y ..\..\mobilenetv2-12_shape.onnx .
 copy /y ..\..\mobilenetv2-12_quant_shape.onnx .
 copy /y ..\..\mobilenetv2-12_net_qnn_ctx.onnx .
@@ -116,7 +120,7 @@ qnn_ep_sample.exe --cpu mobilenetv2-12_shape.onnx kitten_input.raw
 REM run mobilenetv2-12_quant_shape.onnx with QNN HTP backend
 qnn_ep_sample.exe --htp mobilenetv2-12_quant_shape.onnx kitten_input.raw

-REM load mobilenetv2-12_quant_shape.onnx with QNN HTP backend, generate mobilenetv2-12_quant_shape.onnx_qnn_ctx.onnx which hs QNN context binary embedded
+REM load mobilenetv2-12_quant_shape.onnx with QNN HTP backend, generate mobilenetv2-12_quant_shape.onnx_qnn_ctx.onnx which has the QNN context binary embedded
 REM This does not have to run on a real device with HTP; it can also be done on an x64 platform, since offline generation is supported
 qnn_ep_sample.exe --htp mobilenetv2-12_quant_shape.onnx kitten_input.raw --gen_ctx

@@ -126,7 +130,7 @@ IF EXIST mobilenetv2-12_quant_shape.onnx_qnn_ctx.onnx (
 REM run mobilenetv2-12_quant_shape.onnx_qnn_ctx.onnx with QNN HTP backend (generated in the previous step)
 qnn_ep_sample.exe --htp mobilenetv2-12_quant_shape.onnx_qnn_ctx.onnx kitten_input.raw
 ) ELSE (
-ECHO mobilenetv2-12_quant_shape.onnx_qnn_ctx.onnx does not exist. It didn't get generated in previous step. Are you using ONNX 1.17+?
+ECHO mobilenetv2-12_quant_shape.onnx_qnn_ctx.onnx does not exist. It was not generated in the previous step. Are you using ONNX Runtime 1.17+ or a build from the latest main branch?
 )


@@ -140,5 +144,5 @@ exit /b
 :HELP
 popd
 ECHO HELP: run_qnn_ep_sample.bat PATH_TO_ORT_ROOT_WITH_INCLUDE_FOLDER PATH_TO_ORT_BINARIES_WITH_QNN
-ECHO Example (Drop): run_qnn_ep_sample.bat %USERPROFILE%\Downloads\microsoft.ml.onnxruntime.qnn.1.16.0\build\native %USERPROFILE%\Downloads\microsoft.ml.onnxruntime.qnn.1.16.0\runtimes\win-arm64\native
+ECHO Example (Drop): run_qnn_ep_sample.bat %USERPROFILE%\Downloads\microsoft.ml.onnxruntime.qnn.1.17.0\build\native %USERPROFILE%\Downloads\microsoft.ml.onnxruntime.qnn.1.17.0\runtimes\win-arm64\native
 ECHO Example (Src): run_qnn_ep_sample.bat C:\src\onnxruntime C:\src\onnxruntime\build\Windows\RelWithDebInfo\RelWithDebInfo
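For completeness, a hedged C++ sketch of what the `--htp ... _qnn_ctx.onnx` runs above amount to: loading the pre-compiled context-binary model and timing session creation, the metric the context binary is meant to improve. As before, this is illustrative, not the sample's source.

```cpp
// Hedged sketch (not the sample's code): load the context-binary model from
// the --gen_ctx step and time session creation.
#include <onnxruntime_cxx_api.h>
#include <chrono>
#include <iostream>
#include <string>
#include <unordered_map>

int main() {
  Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "qnn_ctx_load");
  Ort::SessionOptions so;
  std::unordered_map<std::string, std::string> qnn_options{
      {"backend_path", "QnnHtp.dll"}};  // HTP backend, as in the --htp runs
  so.AppendExecutionProvider("QNN", qnn_options);

  auto start = std::chrono::steady_clock::now();
  // The embedded QNN context binary lets the EP skip graph compilation here.
  Ort::Session session(env, L"mobilenetv2-12_quant_shape.onnx_qnn_ctx.onnx", so);
  auto elapsed = std::chrono::duration_cast<std::chrono::milliseconds>(
      std::chrono::steady_clock::now() - start);
  std::cout << "Session created in " << elapsed.count() << " ms\n";
  return 0;
}
```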
