[pt][quant] Parallelize quantize and dequantize (pytorch#33765)
Summary: Pull Request resolved: pytorch#33765

The quantize and dequantize methods now make use of multiple threads. This builds on shz0116's recent parallelization of the quantize/dequantize routines in FBGEMM.

Fixes: pytorch#32006, pytorch/FBGEMM#142

Alternative to pytorch#30153

```
#!/usr/bin/env python
import time
import torch
import torch.nn as nn

torch.set_num_threads(4)
# print(torch.__config__.parallel_info())

W = torch.rand(1, 54, 54, 256)

NITER = 1000

s = time.time()
for i in range(NITER):
    W_q = torch.quantize_per_tensor(W, scale=1.0, zero_point=0, dtype=torch.quint8)
time_per_iter = (time.time() - s) / NITER
print('quantize time per iter ms', time_per_iter * 1000)

s = time.time()
for i in range(NITER):
    W_deq = W_q.dequantize()
time_per_iter = (time.time() - s) / NITER
print('dequantize time per iter ms', time_per_iter * 1000)
```

### With 1 thread
quantize time per iter ms 0.22633790969848633
dequantize time per iter ms 0.6573665142059326

### With 4 threads
quantize time per iter ms 0.0905618667602539
dequantize time per iter ms 0.19511842727661133

ghstack-source-id: 98935895

Test Plan: python test/test_quantized.py

Reviewed By: jspark1105

Differential Revision: D20098521

fbshipit-source-id: bd8c45761b4651fcd5b20b95759e3868a136c048
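A quick sanity check to accompany the benchmark above (a sketch, not part of the commit or its test plan): a parallelized quantize kernel must produce bitwise-identical results regardless of thread count, and dequantize must round-trip to within one quantization step. The APIs used (`torch.set_num_threads`, `torch.quantize_per_tensor`, `Tensor.int_repr`, `Tensor.dequantize`) are standard PyTorch; the scale and shape here are arbitrary illustrative choices.

```python
import torch

x = torch.rand(1, 54, 54, 256)
scale = 1.0 / 255  # arbitrary scale chosen so x's range [0, 1) is representable

# Quantize with a single thread, then with four threads.
torch.set_num_threads(1)
q1 = torch.quantize_per_tensor(x, scale=scale, zero_point=0, dtype=torch.quint8)

torch.set_num_threads(4)
q4 = torch.quantize_per_tensor(x, scale=scale, zero_point=0, dtype=torch.quint8)

# The parallel path must not change the quantized values.
assert torch.equal(q1.int_repr(), q4.int_repr())

# Dequantize round-trips to within one quantization step (rounding error <= scale).
max_err = (q1.dequantize() - x).abs().max().item()
assert max_err <= scale
```

This kind of thread-count invariance check is useful whenever a kernel's work partitioning changes, since a bad partition boundary typically corrupts only a few elements.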