To target the XNNPACK backend during the export and lowering process, pass an instance of the `XnnpackPartitioner` to `to_edge_transform_and_lower`. The example below demonstrates this process using the MobileNet V2 model from torchvision.
```python
import torch
import torchvision.models as models
from torchvision.models.mobilenetv2 import MobileNet_V2_Weights
from executorch.backends.xnnpack.partition.xnnpack_partitioner import XnnpackPartitioner
from executorch.exir import to_edge_transform_and_lower

mobilenet_v2 = models.mobilenetv2.mobilenet_v2(weights=MobileNet_V2_Weights.DEFAULT).eval()
sample_inputs = (torch.randn(1, 3, 224, 224),)

# Export the model and lower it to the XNNPACK backend.
et_program = to_edge_transform_and_lower(
    torch.export.export(mobilenet_v2, sample_inputs),
    partitioner=[XnnpackPartitioner()],
).to_executorch()

with open("mv2_xnnpack.pte", "wb") as file:
    et_program.write_to_file(file)
```
### Partitioner API
The XNNPACK partitioner API allows for configuration of the model delegation to XNNPACK. Passing an `XnnpackPartitioner` instance with no additional parameters will run as much of the model as possible on the XNNPACK backend. This is the most common use-case. For advanced use cases, the partitioner exposes the following options via the [constructor](https://github.com/pytorch/executorch/blob/release/0.6/backends/xnnpack/partition/xnnpack_partitioner.py#L31):
- `configs`: Control which operators are delegated to XNNPACK. By default, all available operators are delegated. See [../config/\_\_init\_\_.py](https://github.com/pytorch/executorch/blob/release/0.6/backends/xnnpack/partition/config/__init__.py#L66) for an exhaustive list of available operator configs.
- `config_precisions`: Filter operators by data type. By default, all precisions are delegated. One or more of `ConfigPrecisionType.FP32`, `ConfigPrecisionType.STATIC_QUANT`, or `ConfigPrecisionType.DYNAMIC_QUANT`. See [ConfigPrecisionType](https://github.com/pytorch/executorch/blob/release/0.6/backends/xnnpack/partition/config/xnnpack_config.py#L24).
- `per_op_mode`: If true, emit individual delegate calls for every operator. This is an advanced option intended to reduce memory overhead in some contexts at the cost of a small amount of runtime overhead. Defaults to false.
- `verbose`: If true, print additional information during lowering.
The output of `convert_pt2e` is a PyTorch model which can be exported and lowered using the normal flow. As it is a regular PyTorch model, it can also be used to evaluate the accuracy of the quantized model using standard PyTorch techniques.
```python
import torch
import torchvision.models as models
from torchvision.models.mobilenetv2 import MobileNet_V2_Weights
from executorch.backends.xnnpack.quantizer.xnnpack_quantizer import XNNPACKQuantizer
from executorch.backends.xnnpack.partition.xnnpack_partitioner import XnnpackPartitioner
from executorch.exir import to_edge_transform_and_lower
from torch.ao.quantization.quantize_pt2e import convert_pt2e, prepare_pt2e
from torch.ao.quantization.quantizer.xnnpack_quantizer import get_symmetric_quantization_config

model = models.mobilenetv2.mobilenet_v2(weights=MobileNet_V2_Weights.DEFAULT).eval()
sample_inputs = (torch.randn(1, 3, 224, 224),)

# Annotate the model with the XNNPACK quantizer, using symmetric 8-bit quantization.
quantizer = XNNPACKQuantizer()
quantizer.set_global(get_symmetric_quantization_config())

training_ep = torch.export.export_for_training(model, sample_inputs)
prepared_model = prepare_pt2e(training_ep.module(), quantizer)

# Calibrate with representative inputs, then convert to a quantized model.
prepared_model(*sample_inputs)
quantized_model = convert_pt2e(prepared_model)
```