Adding more reference docs for intel openmp related tunings
Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
zhouyuan committed Jul 4, 2024
1 parent bcd0c57 commit 458db88
Showing 1 changed file with 8 additions and 3 deletions.
11 changes: 8 additions & 3 deletions Dockerfile.cpu
@@ -6,13 +6,18 @@ RUN apt-get update -y \
&& apt-get install -y git wget vim numactl gcc-12 g++-12 python3 python3-pip libtcmalloc-minimal4 \
&& update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-12 10 --slave /usr/bin/g++ g++ /usr/bin/g++-12

RUN pip install accelerate mkl
# https://intel.github.io/intel-extension-for-pytorch/cpu/latest/tutorials/performance_tuning/tuning_guide.html
# intel-openmp provides additional performance improvement vs. openmp
# tcmalloc provides better memory allocation efficiency, e.g., by holding memory in caches to speed up access to commonly used objects.
RUN pip install intel-openmp

ENV LD_PRELOAD="/usr/lib/x86_64-linux-gnu/libtcmalloc_minimal.so.4:/usr/local/lib/libiomp5.so:$LD_PRELOAD"

ENV KMP_BLOCKTIME=1
# The time (in milliseconds) that a thread waits after finishing a parallel region before going to sleep.
ENV KMP_BLOCKTIME=1
# Prevents the CPU from entering a low-performance state
ENV KMP_TPAUSE=0
ENV KMP_SETTINGS=1
# Provides fine-grained parallelism
ENV KMP_FORKJOIN_BARRIER_PATTERN=dist,dist
ENV KMP_PLAIN_BARRIER_PATTERN=dist,dist
ENV KMP_REDUCTION_BARRIER_PATTERN=dist,dist
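
The settings in this diff take effect only if the preloaded libraries exist at the listed paths and the KMP_* variables reach the process. The following is a minimal sketch, not part of the commit, of how that could be checked from inside the container. The helper names (`preload_entries`, `missing_preloads`, `mismatched_kmp`) are hypothetical, and the expected values are copied from the ENV lines above.

```python
import os
import re

# Values copied from the ENV lines in the diff; adjust if the Dockerfile changes.
EXPECTED_KMP = {
    "KMP_BLOCKTIME": "1",
    "KMP_TPAUSE": "0",
    "KMP_SETTINGS": "1",
    "KMP_FORKJOIN_BARRIER_PATTERN": "dist,dist",
    "KMP_PLAIN_BARRIER_PATTERN": "dist,dist",
    "KMP_REDUCTION_BARRIER_PATTERN": "dist,dist",
}


def preload_entries(ld_preload):
    """Split an LD_PRELOAD value on colons or whitespace, dropping empty entries.

    An empty entry appears when the trailing $LD_PRELOAD expands to nothing.
    """
    return [p for p in re.split(r"[:\s]+", ld_preload) if p]


def missing_preloads(ld_preload, exists=os.path.exists):
    """Return the preload paths that do not exist on disk."""
    return [p for p in preload_entries(ld_preload) if not exists(p)]


def mismatched_kmp(env, expected=EXPECTED_KMP):
    """Return {name: actual_value} for every variable that differs from expected."""
    return {k: env.get(k) for k, v in expected.items() if env.get(k) != v}


if __name__ == "__main__":
    print("missing preloads:", missing_preloads(os.environ.get("LD_PRELOAD", "")))
    print("mismatched KMP settings:", mismatched_kmp(os.environ))
```

Running the script inside the built image should print two empty collections if the image matches the diff. Note that the `libiomp5.so` location depends on where pip installs `intel-openmp`, so a non-empty "missing preloads" list most likely points to a different Python prefix.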