Commit 13c4802

up
1 parent 531d8ca commit 13c4802

2 files changed: +2 -2 lines changed

examples/models/llama/README.md

Lines changed: 1 addition & 1 deletion
@@ -382,7 +382,7 @@ Please refer to [this tutorial](https://pytorch.org/executorch/main/llm/llama-de
 
 ## Running with low-bit kernels
 
-We now give instructions for quantizating and running your model with low-bit kernels. These are still experimental, and require you do development on an Arm-based Mac. Also note that low-bit quantization often requires QAT (quantization-aware training) to give good quality results.
+We now give instructions for quantizating and running your model with low-bit kernels. These are still experimental, and require you do development on an Arm-based Mac. Also note that low-bit quantization often requires QAT (quantization-aware training) to give good quality results. Currently dynamic shapes must be disabled when exporting a model with these kernels.
 
 First export your model for lowbit quantization (step 2 above):
 

install_requirements.py

Lines changed: 1 addition & 1 deletion
@@ -119,7 +119,7 @@ def install_requirements(use_pytorch_nightly):
     # Install packages directly from local copy instead of pypi.
     # This is usually not recommended.
     new_env = os.environ.copy()
-    new_env["USE_CPP"] = "1"
+    new_env["USE_CPP"] = "1"  # install torchao kernels
     subprocess.run(
         [
             sys.executable,
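
The hunk above sets `USE_CPP` on a copy of the environment and hands that copy to `subprocess.run`. As a rough illustration of that pattern only, here is a minimal sketch; `run_with_flag` is a hypothetical helper that is not part of `install_requirements.py`, and the child command simply echoes the variable instead of installing anything.

```python
import os
import subprocess
import sys


def run_with_flag(name: str, value: str) -> str:
    """Run a child Python process with one extra environment variable.

    Hypothetical helper for illustration; not taken from the commit.
    """
    # Copy the parent environment so os.environ itself is left untouched.
    new_env = os.environ.copy()
    new_env[name] = value

    # The child prints the value it sees for the variable (empty if unset).
    child_code = f"import os; print(os.environ.get({name!r}, ''))"
    result = subprocess.run(
        [sys.executable, "-c", child_code],
        env=new_env,  # only the child process sees the modified copy
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print(run_with_flag("USE_CPP", "1"))
```

Because the variable is set on a copy, the flag reaches only the spawned process and does not persist in the calling interpreter's environment.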
