Alternatively, you can use `buck2` to run the `.pte` file with XNNPACK delegate instructions in it on your host platform. You can follow the instructions here to install [buck2](getting-started-setup.md#Build-&-Run). You can then run the model on some sample inputs with the prebuilt `xnn_executor_runner` provided in the examples:

```bash
buck2 run examples/xnnpack:xnn_executor_runner -- --model_path ./mv2_xnnpack_fp32.pte
# or to run the quantized variant
buck2 run examples/xnnpack:xnn_executor_runner -- --model_path ./mv2_xnnpack_q8.pte
```
## Building and Linking with the XNNPACK Backend
You can build the XNNPACK backend [BUCK target](https://github.com/pytorch/executorch/blob/main/backends/xnnpack/targets.bzl#L54) or [CMake target](https://github.com/pytorch/executorch/blob/main/backends/xnnpack/CMakeLists.txt#L83) and link it with your application binary, such as an Android or iOS application. For more information, take a look at this [resource](demo-apps-android.md) next.
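A minimal sketch of building the backend libraries from the repository root, assuming the same CMake options used later on this page (the install prefix and build directory are illustrative):

```bash
# Configure with the XNNPACK backend enabled, then build and install the libraries
# that an application build (e.g., Android or iOS) can link against.
cmake -DEXECUTORCH_BUILD_XNNPACK=ON -DCMAKE_INSTALL_PREFIX=cmake-out -Bcmake-out .
cmake --build cmake-out --target install --config Release
```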
Once we have the model binary (`.pte`) file, we can run it with the ExecuTorch runtime using the `xnn_executor_runner`. With CMake, you first configure and build the runtime as shown in [Running the XNNPACK Model with CMake](#running-the-xnnpack-model-with-cmake) below.
## Quantization

Here we will discuss quantizing a model suitable for XNNPACK delegation using the XNNPACKQuantizer.
Though it is typical to run this quantized model via the XNNPACK delegate, we want to highlight that this is just another quantization flavor: the quantized model can also be run without the XNNPACK delegate, using only the standard quantized operators.
A shared library that registers the out variants of the quantized operators (e.g., `quantized_decomposed::add.out`) into EXIR is required. With CMake, follow the instructions in `test_quantize.sh` to build it; the default path is `cmake-out/kernels/quantized/libquantized_ops_lib.so`.
Then you can generate an XNNPACK quantized model with the following command, passing the path to the shared library into the script `quantization/example.py`:
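As a rough sketch, assuming the script lives at `examples/xnnpack/quantization/example.py` and accepts `--model_name` and `--so_library` flags (the module path and flag names are assumptions, not taken from this page), the invocation would look roughly like:

```bash
# Hypothetical flags; check the script's --help output for the exact interface
python -m examples.xnnpack.quantization.example \
    --model_name "mv2" \
    --so_library "cmake-out/kernels/quantized/libquantized_ops_lib.so"
```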
A quantized model can be run via `executor_runner`:

```bash
buck2 run examples/portable/executor_runner:executor_runner -- --model_path ./mv2_quantized.pte
```

Please note that running a quantized model requires the presence of the various quantize/dequantize operators in the [quantized kernel lib](../../kernels/quantized).

## Running the XNNPACK Model with CMake

After exporting the XNNPACK-delegated model, we can now try running it with example inputs using CMake. We can build and use the `xnn_executor_runner`, which is a sample wrapper for the ExecuTorch Runtime and XNNPACK Backend. We first begin by configuring the CMake build as follows:

```bash
# cd to the root of executorch repo
cd executorch

# Get a clean cmake-out directory
rm -rf cmake-out
mkdir cmake-out

# Configure cmake
cmake \
    -DCMAKE_INSTALL_PREFIX=cmake-out \
    -DCMAKE_BUILD_TYPE=Release \
    -DEXECUTORCH_BUILD_EXTENSION_MODULE=ON \
    -DEXECUTORCH_BUILD_XNNPACK=ON \
    -DEXECUTORCH_BUILD_EXTENSION_DATA_LOADER=ON \
    -DEXECUTORCH_ENABLE_LOGGING=ON \
    -DPYTHON_EXECUTABLE=python \
    -Bcmake-out .
```
Then you can build the runtime components with:
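A typical build invocation consistent with the configuration above might look like this (the `-j9` parallelism level is illustrative):

```bash
# Build the runtime and the xnn_executor_runner, installing the libraries into cmake-out
cmake --build cmake-out -j9 --target install --config Release
```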
Now you should be able to find the executable built at `./cmake-out/backends/xnnpack/xnn_executor_runner`, and you can run it with the model you generated:
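For example, assuming the FP32 model exported earlier (`mv2_xnnpack_fp32.pte`) sits in the current directory:

```bash
# Run the delegated model with the runner built above
./cmake-out/backends/xnnpack/xnn_executor_runner --model_path=./mv2_xnnpack_fp32.pte
```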
## Delegating a Quantized Model
The following command will produce an XNNPACK quantized and delegated model `mv2_xnnpack_q8.pte` that can be run using XNNPACK's operators. It will also print out the lowered graph, showing which parts of the model have been lowered to XNNPACK via `executorch_call_delegate`.
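As a sketch, assuming the `examples.xnnpack.aot_compiler` entry point with `--quantize` and `--delegate` flags (the module path and flags are assumptions, not confirmed by this page), the export step would look roughly like:

```bash
# Hypothetical invocation; check the examples directory for the exact entry point and flags
python -m examples.xnnpack.aot_compiler --model_name "mv2" --quantize --delegate
```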
Once we have the model binary (`.pte`) file, let's run it with the ExecuTorch runtime using the `xnn_executor_runner`:

```bash
buck2 run examples/xnnpack:xnn_executor_runner -- --model_path ./mv2_xnnpack_q8.pte
```