
Commit 36c3192

Authored by EikanWang, ashahba, chunyuan-w, leslie-fang-intel, JianpingChen066

Merge external github code (#249)
* Add AVX512 macro in CMake to enable AVX512
* Cannot use the input dil tensor to check is_public_format because it is out of scope
* Fix build issue of PR #20
* Increase precision tolerance for UT
* Update for new 'oneDNN' GitHub URL (#146)
* Update default IPEX version to 1.2.0
* Fall back to CPU for LSTM training with dropout
* Parse the PyTorch 1.8 RegistrationDeclarations.h to generate dense and sparse operator code
* git commit -m
* 1. Replace TensorList with c10::List; 2. replace tensor size and stride with SizesAndStrides. TODO: work around the RegXXX.h function signatures that conflict with NativeFunctions.h
* Remove autocast from master
* Pass build for PyTorch 1.8. TODO: add comments for gen-dense-cpu-ops.py; there might be potential issues with grad copying
* Enhance embedding bag last-offset memory copy by using the parallelized move_ker
* Add UT for int8 LSTM
* Add asymmetric quantization
* Enable int8 for LSTM
* Port utils for UT from PyTorch 1.8
* Fix the issue that a tensor list wrapped by c10::List cannot be fallen back
* Enable upsample_bilinear2d to support vector scale factors
* Update README to clarify the IPEX and PyTorch versions; update the IPEX version in setup.py to 1.2.0
* Enable bf16 layernorm
* Enable native layer norm signature matching
* Pass all test cases of the committed test file except layer_norm, because IPEX cannot capture layer_norm
* Capture layernorm on the Python side
* Replace ATen/Tensor.h with ATen/ATen.h to avoid undefined symbols (conflicts: torch_ipex/csrc/utils.h)
* Generate sparse operators
* Reorder to public for slice in case an exception is thrown
* 1. Support NHWC; 2. remove recorder tensors to reduce PyTorch profiler overhead
* 1. Dependency installation; 2. torch wheel file query and packaging; 3. git is no longer required when compiling
* Added tutorial Performance Tuning.md in the tutorials directory
* Update README.md (x10)
* Update test_torch.py and align with common_utils.py
* Bug fix in Dockerfile (#164)
* Update Dockerfile to include pybind11-dev (#157): as a fix for issue #155; as suggested by @jingxu10, adding pybind11-dev allows a successful build of the Docker container
* Fix PyTorch 1.8 UT
* Installation for IPEX 1.8: remove the recompilation of PyTorch and add the installation of the dependency package; add the supported customized ops and fusion patterns
* tmp commit
* Pass most UT
* Modified _C.cpython.xxxx.so's rpath
* Fix unexpected keyword argument 'prec' in test_torch.py
* Keep intel_pytorch_extension to ensure backward compatibility
* Fix test_int8.py regression
* Update the version to 1.8.0
* Fix runtime undefined-reference error caused by the libstdc++ dual ABI
* Updated README.md for v1.8.0
* Updated torch-ccl to fix the "libfabric.so not found" issue
* setup.py: 1. fix include_paths and library_paths missing when torch is installed via setup.py; 2. solved the libstdc++ dual ABI issue; 3. removed duplicated package imports. torch-ccl: fixed oneCCL library path patching not taking effect
* Update README.md
* Clean ipex installation folder structure (x3)
* Add a warning message about the deprecation of intel_pytorch_extension
* Fix rpath to libtorch_ccl.so after the hierarchy adjustment
* 1. Removed the execute bit from libtorch_ipex.so permissions; 2. upgraded torch-ccl so that libtorch_ccl.so installs to the torch_ccl folder
* Pass build for PyTorch 1.9.0
* Enable batch_norm operator
* Update ipex Dockerfile to use the no-patch version (#170)
* Explicit PyTorch version
* Exclude the operators that do not run into autograd
* Pass all test cases except test_torch
* Fix issues: 1. LSTM indent errors; 2. check batch_normalization
* Fix the issue that the grad of the nll_loss input is None
* Update build version from 1.8.0.1 to 1.9.0 (along with the PyTorch version)
* Fix dil_cat bug when concatenating empty tensors with customized shapes
* 1. Moved Python code out of libtorch_ipex.so into _C.so; 2. removed pybind11 as a dependency library from the third_party folder; 3. changed "import intel_pytorch_extension" to "import torch_ipex" in the tests folder, README.md, torch_ipex/ops/embeddingbag.py and torch_ipex/launch.py; 4. commented out "core.enable_torch_ccl()" in torch_ipex/__init__.py, to avoid the following error on "import torch_ipex":
      Traceback (most recent call last):
        File "<string>", line 1, in <module>
        File "/home/jingxu1/dl/pytorch/srcs/venv_test_py38/lib/python3.8/site-packages/torch_ipex/__init__.py", line 14, in <module>
          core.enable_torch_ccl()
      RuntimeError: arg(): could not convert default argument into a Python object (type not registered yet?). Compile in debug mode for more information.
* 1. Removed torch-ccl; 2. added debug info to version.py; 3. removed the PyTorch wheel file binding in debug mode
* Updated Dockerfile to 1.9.0
* Removed core.enable_torch_ccl()
* Updated README.md for 1.9.0 (x2)
* Updated .gitignore to delete torch_ipex/version.py when performing clean
* v1.8.0 whl release (#171)
* Added wheel file release info to README.md (x2)
* Exclude flatten.using_ints and cross_entropy_loss because the two operators do not generate backward functions
* Do not capture batch_norm and _batch_norm_impl_index
* Exclude reshape and where
* Exclude nll_loss2d
* Added a denormal-numbers section to performance_tuning.md
* Add installation guide for 1.9.0 (x2)
* Update README.md: the default IPEX and PyTorch versions are v1.9.0
* Added AVX512 note
* Updated launch.py
* Added launcher doc (x2)
* Add Python interface C++ source file
* Update README.md (x5)
* Update LICENSE.txt
* Update README.md
* Remove useless files
* Fix format issue

Co-authored-by: Abolfazl Shahbazi <abolfazl.shahbazi@intel.com>
Co-authored-by: chunyuan-w <chunyuan.wu@intel.com>
Co-authored-by: leslie-fang-intel <leslie.fang@intel.com>
Co-authored-by: Chen, Jian Ping <jian.ping.chen@intel.com>
Co-authored-by: jiayisun <jiayi.sun@intel.com>
Co-authored-by: Jing Xu <jing.xu@intel.com>
Co-authored-by: Zhu, Jewel <jewel.zhu@intel.com>
Co-authored-by: tangleintel <lei1.tang@intel.com>
Co-authored-by: Chaitanya Hazarey <C24IO@users.noreply.github.com>
Co-authored-by: Ashok Emani <ashok.emani@intel.com>
Co-authored-by: Wang, Eikan <root@JF5300-B11A316T.jf.intel.com>
Co-authored-by: jianangu <jianan.gu@intel.com>
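Among the items above, "add asymmetric quantization" for the int8 LSTM path refers to affine (scale plus zero-point) u8 quantization. A minimal pure-Python sketch of that mapping, with illustrative helper names that are not part of the IPEX API:

```python
def quantize_u8(x, scale, zero_point):
    """f32 -> u8: q = clamp(round(v * scale) + zero_point, 0, 255)."""
    return [min(255, max(0, round(v * scale) + zero_point)) for v in x]

def dequantize_u8(q, scale, zero_point):
    """u8 -> f32 inverse: v = (q - zero_point) / scale."""
    return [(v - zero_point) / scale for v in q]

vals = [-1.0, 0.0, 0.5, 1.0]
q = quantize_u8(vals, scale=100.0, zero_point=100)
print(q)  # [0, 100, 150, 200]
print(dequantize_u8(q, 100.0, 100))  # recovers the original values
```

The nonzero zero point is what lets an unsigned 8-bit range represent negative activations, which symmetric s8 quantization handles with a signed range instead.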
1 parent: 7fb53ba · commit 36c3192

27 files changed: +626 −2305 lines

LICENSE.txt

Lines changed: 1 addition & 1 deletion

@@ -187,7 +187,7 @@
       same "printed page" as the copyright notice for easier
       identification within third-party archives.

-   Copyright [yyyy] [name of copyright owner]
+   Copyright 2020-2021 Intel Corporation

    Licensed under the Apache License, Version 2.0 (the "License");
    you may not use this file except in compliance with the License.

README.md

Lines changed: 258 additions & 164 deletions
Large diffs are not rendered by default.

cmake/Modules/FindTorchCCL.cmake

Lines changed: 3 additions & 0 deletions
@@ -17,7 +17,10 @@ SET(TORCHCCL_INCLUDE_DIR)

 SET(TORCHCCL_ROOT "${PROJECT_SOURCE_DIR}/third_party/torch_ccl")

+SET(CMAKE_INSTALL_PREFIX_SAVED "${CMAKE_INSTALL_PREFIX}")
+SET(CMAKE_INSTALL_PREFIX "${CMAKE_INSTALL_PREFIX_SAVED}/../torch_ccl")
 ADD_SUBDIRECTORY(${TORCHCCL_ROOT})
+SET(CMAKE_INSTALL_PREFIX "${CMAKE_INSTALL_PREFIX_SAVED}")
 IF(NOT TARGET torch_ccl)
   MESSAGE(FATAL_ERROR "Failed to include torch_ccl target")
 ENDIF()
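The hunk above temporarily overrides CMAKE_INSTALL_PREFIX so the torch_ccl subproject installs into its own folder, then restores the saved value. The same save/override/restore pattern, sketched as a Python context manager (an analogy only; nothing here is CMake or IPEX API):

```python
from contextlib import contextmanager

@contextmanager
def override(mapping, key, value):
    """Temporarily set mapping[key] = value, restoring the old value on exit."""
    saved = mapping[key]          # like SET(CMAKE_INSTALL_PREFIX_SAVED ...)
    mapping[key] = value          # override for the nested scope
    try:
        yield
    finally:
        mapping[key] = saved      # restore, even if the body raised

cfg = {"install_prefix": "/opt/ipex"}
with override(cfg, "install_prefix", "/opt/ipex/../torch_ccl"):
    assert cfg["install_prefix"].endswith("torch_ccl")
assert cfg["install_prefix"] == "/opt/ipex"
```

Restoring in a `finally` (or, in CMake, immediately after `ADD_SUBDIRECTORY`) keeps the override from leaking into the rest of the build.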

docker/Dockerfile

Lines changed: 20 additions & 18 deletions
@@ -1,12 +1,12 @@
 # syntax = docker/dockerfile:experimental
 # based on https://github.com/pytorch/pytorch/blob/master/Dockerfile
-#
+#
 # NOTE: To build this you will need a docker version > 18.06 with
 # experimental enabled and DOCKER_BUILDKIT=1
 #
 # If you do not use buildkit you are not going to have a good time
 #
-# For reference:
+# For reference:
 # https://docs.docker.com/develop/develop-images/build_enhancements/

 ARG BASE_IMAGE=ubuntu:20.04
@@ -20,11 +20,13 @@ RUN --mount=type=cache,id=apt-dev,target=/var/cache/apt \
     vim \
     build-essential \
     ccache \
-    libjemalloc-dev \
+    libgoogle-perftools-dev \
     numactl \
     cmake \
     libjpeg-dev \
+    pybind11-dev \
     libpng-dev \
+    pybind11-dev \
     && rm -rf /var/lib/apt/lists/*
 RUN /usr/sbin/update-ccache-symlinks
 RUN mkdir /opt/ccache && ccache --set-config=cache_dir=/opt/ccache
@@ -40,30 +42,30 @@ RUN curl -fsSL -v -o ~/miniconda.sh -O https://repo.anaconda.com/miniconda/Mini
     /opt/conda/bin/conda clean -ya

 FROM dev-base AS build
+ARG IPEX_VERSION=v1.9.0
+ARG PYTORCH_VERSION=v1.9.0
+ARG TORCHVISION_VERSION=0.10.0+cpu
+ARG TORCHAUDIO_VERSION=0.9.0
 COPY --from=conda /opt/conda /opt/conda
-ARG TORCHVISION_VERSION=0.6
 RUN --mount=type=cache,target=/opt/ccache \
-    pip install torchvision==${TORCHVISION_VERSION}+cpu --no-deps \
-    -f https://download.pytorch.org/whl/torch_stable.html && \
-    pip install lark-parser hypothesis && \
+    pip install torch==${PYTORCH_VERSION}+cpu torchvision==${TORCHVISION_VERSION} torchaudio==${TORCHAUDIO_VERSION} -f https://download.pytorch.org/whl/torch_stable.html && \
     git clone https://github.com/intel/intel-extension-for-pytorch && \
-    cd intel-extension-for-pytorch && git submodule sync && \
-    git submodule update --init --recursive && \
-    git clone https://github.com/pytorch/pytorch && \
-    cd pytorch && git checkout v1.7.0 && git submodule sync && \
+    cd intel-extension-for-pytorch && \
+    git checkout ${IPEX_VERSION} && \
+    git submodule sync && \
     git submodule update --init --recursive && \
-    git apply ../torch_patches/xpu-1.7.patch && \
-    USE_MKLDNN=1 USE_CUDA=0 USE_NNPACK=0 USE_CUDNN=0 \
-    CMAKE_PREFIX_PATH="$(dirname $(which conda))/../" pip install -v . && \
-    cd .. && pip install -v . && rm -rf *
+    pip3 install -r requirements.txt && \
+    python setup.py bdist_wheel && \
+    pip3 install dist/*.whl && \
+    cd .. && rm -rf intel-extension-for-pytorch

 FROM dev-base as dev
 COPY --from=build /opt/conda /opt/conda
 ARG OMP_NUM_THREADS=1
 ENV OMP_NUM_THREADS ${OMP_NUM_THREADS}
 ARG KMP_BLOCKTIME=1
-ENV KMP_BLOCKTIME ${KMP_BLOCKTIME}
+ENV KMP_BLOCKTIME ${KMP_BLOCKTIME}
 ARG KMP_HW_SUBSET=1T
 ENV KMP_HW_SUBSET ${KMP_HW_SUBSET}
-ENV MALLOC_CONF "oversize_threshold:1,background_thread:true,metadata_thp:auto,dirty_decay_ms:-1,muzzy_decay_ms:-1"
-ENV LD_PRELOAD "/opt/conda/lib/libiomp5.so /usr/lib/x86_64-linux-gnu/libjemalloc.so"
+ENV LD_PRELOAD "/opt/conda/lib/libiomp5.so /usr/lib/x86_64-linux-gnu/libtcmalloc.so"
+ENV LD_LIBRARY_PATH "/opt/conda/lib/python3.8/site-packages/lib/"
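This revision swaps jemalloc for tcmalloc (libgoogle-perftools-dev plus the LD_PRELOAD change) and drops the jemalloc-specific MALLOC_CONF. A hedged way to confirm which allocator actually got preloaded into a running process is to scan /proc/self/maps; this Linux-only helper is ours, not part of the image:

```python
def preloaded_allocators(maps_path="/proc/self/maps"):
    """Return the set of known allocator/runtime libraries mapped into this
    process, by grepping its memory map (Linux-only)."""
    found = set()
    try:
        with open(maps_path) as f:
            for line in f:
                for name in ("tcmalloc", "jemalloc", "libiomp5"):
                    if name in line:
                        found.add(name)
    except FileNotFoundError:
        # /proc is unavailable (e.g. non-Linux); report nothing rather than fail
        pass
    return found

print(preloaded_allocators())
```

Run inside the container with the Dockerfile's LD_PRELOAD in effect; an empty set means the preload did not take, which usually points at a wrong library path for the distro.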

docker/README.md

Lines changed: 1 addition & 0 deletions
@@ -11,4 +11,5 @@
 ```console
 $ cd $DOCKERFILE_DIR
 $ DOCKER_BUILDKIT=1 docker build -t intel-extension-for-pytorch:test .
+$ docker run intel-extension-for-pytorch:test python -c "import torch;import intel_pytorch_extension as ipex;print('torch:', torch.__version__,' ipex:',ipex.__version__)"
 ```
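The `docker run` smoke test added above raises an ImportError if either package is missing from the image. A slightly more forgiving variant of the same check, where the `report_versions` wrapper is hypothetical and not part of the repo:

```python
def report_versions():
    """Report torch and ipex versions, degrading gracefully if absent."""
    lines = []
    try:
        import torch
        lines.append(f"torch: {torch.__version__}")
    except ImportError:
        lines.append("torch: not installed")
    try:
        import intel_pytorch_extension as ipex
        lines.append(f"ipex: {ipex.__version__}")
    except ImportError:
        lines.append("ipex: not installed")
    return lines

print("\n".join(report_versions()))
```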

ideep/ideep/abstract_types.hpp

Lines changed: 1 addition & 0 deletions
@@ -52,6 +52,7 @@ using key_t = std::string;
 #endif

 const scale_t IDEEP_DEF_SCALE {1.0f};
+const std::vector<int32_t> DIL_DEF_ZERO_POINT{0};

 enum lowp_kind {
   u8s8 = 0,
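DIL_DEF_ZERO_POINT above defaults the zero point to 0, i.e. plain symmetric scaling, under which real 0.0 always maps exactly to quantized 0. A one-line sketch of why (illustrative, not the ideep API):

```python
def affine_quant(x, scale, zero_point=0):
    """Affine quantization q = round(x * scale) + zero_point; with the default
    zero_point of 0 this degenerates to symmetric quantization."""
    return round(x * scale) + zero_point

assert affine_quant(0.0, 127.0) == 0         # zero is exactly representable
assert affine_quant(0.0, 127.0, 128) == 128  # a nonzero zero point shifts it
```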

ideep/ideep/attributes.hpp

Lines changed: 29 additions & 0 deletions
@@ -14,6 +14,35 @@ struct attr_t : public dnnl::primitive_attr {

   attr_t(int mask, const scale_t& scales) { set_output_scales(mask, scales); }

+  /* TODO: for rnn input quantization with scale + shift from f32 to u8.
+     Failed to use it in IPEX since:
+       x_aten is in ntc and is an aten tensor
+       x_dil = x_aten.transpose(0, 1)
+       x_dil will become a dil tensor
+       x_dil_storage = try_gen_dil_storage(x_dil)
+       x_dil_storage will have the stride that corresponds to an ntc format.
+     When we use set_rnn_data_qparams on x_dil_storage, it cannot pass the
+     format check. */
+  attr_t(float scale, float shift) {
+    set_rnn_data_qparams(scale, shift);
+  }
+
+  attr_t(
+      const scale_t& scales,
+      const std::vector<int32_t>& shift,
+      bool rnn_data_quantize) {
+    set_output_scales(0, scales);
+    if (rnn_data_quantize) {
+      // Workaround: for rnn input quantization with scale + shift from f32
+      // to u8
+      set_zero_points(DNNL_ARG_DST, 0, shift);
+    } else {
+      // for rnn input dequantization with scale + shift from u8 to f32
+      set_zero_points(DNNL_ARG_SRC, 0, shift);
+    }
+  }
+
   std::pair<scale_t, int> get_output_scales() const {
     dnnl_dim_t count;
     int c_mask;
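The second attr_t constructor above attaches the zero point to DNNL_ARG_DST when quantizing (the u8 output carries the shift) and to DNNL_ARG_SRC when dequantizing (the u8 input carries it). A pure-Python round trip through the two branches, as a sketch of the intent rather than oneDNN semantics verbatim:

```python
def rnn_reorder(values, scale, shift, quantize):
    """Mimic the two branches of the attr_t constructor above."""
    if quantize:
        # f32 -> u8: shift applied on the destination side (DNNL_ARG_DST)
        return [min(255, max(0, round(v * scale) + shift)) for v in values]
    # u8 -> f32: shift removed from the source side (DNNL_ARG_SRC)
    return [(v - shift) / scale for v in values]

q = rnn_reorder([0.5, -0.25], scale=64.0, shift=128, quantize=True)
print(q)  # [160, 112]
x = rnn_reorder(q, scale=64.0, shift=128, quantize=False)
print(x)  # [0.5, -0.25]
```

Keeping the shift on the u8 side in both directions is what makes the pair an exact inverse (up to rounding and clamping).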

scripts/cpu/common/__init__.py

Whitespace-only changes.

scripts/cpu/common/aten_sig_parser.py

Lines changed: 0 additions & 187 deletions
This file was deleted.

scripts/cpu/common/codegen.py

Lines changed: 0 additions & 15 deletions
This file was deleted.
