[tensorexpr][nnc] Support quantization #66676
Conversation
@IvanKobzarev has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Differential Revision: [D31676329](https://our.internmc.facebook.com/intern/diff/D31676329)

1. Introducing quantized dtypes to tensorexpr: QUInt8, QInt8
2. Introducing quantization lowerings for:

```
aten::quantize_per_tensor
aten::dequantize
_quantized::conv2d
quantized::conv2d_relu
quantized::add
aten::upsample_nearest2d
```

`quantize_per_tensor` and `dequantize` can be lowered as tensorexpr or as external calls. All other ops are lowered only as external calls.

Testing:

```
#!/bin/bash
PYTORCH_JIT_LOG_LEVEL=">>kernel:>>eval" \
PYTORCH_TENSOREXPR_DONT_USE_LLVM=1 \
USE_PYTORCH_QNNPACK=1 \
USE_QNNPACK=1 \
USE_FBGEMM=0 \
DEBUG=1 \
CXX_FLAGS="-g" \
USE_LLVM=/home/ivankobzarev/llvm90install \
USE_XNNPACK=1 \
./scripts/build_local.sh \
  -DBUILD_BINARY=ON \
  -DCMAKE_BUILD_TYPE=Debug \
  -DBUILD_TEST=ON \
  -DUSE_LLVM=/home/ivankobzarev/llvm90install \
&& ./build/bin/test_tensorexpr --gtest_filter="Quantization.*"
```
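For reference, the per-tensor affine mapping that `aten::quantize_per_tensor` and `aten::dequantize` compute (and that a tensorexpr lowering would have to reproduce element-wise) can be sketched in plain Python. The scale and zero-point values below are illustrative, not taken from the PR:

```python
def quantize_per_tensor(x, scale, zero_point, qmin=0, qmax=255):
    """Affine quantization to a quint8-style value:
    q = clamp(round(x / scale) + zero_point, qmin, qmax)."""
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, int(q)))

def dequantize(q, scale, zero_point):
    """Inverse mapping back to float: x ~= (q - zero_point) * scale."""
    return (q - zero_point) * scale

# Round-trip with illustrative parameters: values inside the representable
# range come back within one quantization step; values outside it saturate.
scale, zp = 0.1, 128
q = quantize_per_tensor(1.0, scale, zp)      # -> 138
x = dequantize(q, scale, zp)                 # -> 1.0
q_sat = quantize_per_tensor(100.0, scale, zp)  # clamped to qmax = 255
```

This is only the scalar math; the actual lowering operates on buffers and must also match PyTorch's rounding mode.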
LGTM.
@IvanKobzarev merged this pull request in 7fbcf79.