Commit

Apply suggestions from code review
Co-authored-by: Tatiana Savina <tatiana.savina@intel.com>
peterchen-intel and tsavina authored Oct 20, 2024
1 parent fe6b3e7 commit 6fea405
Showing 1 changed file with 8 additions and 8 deletions.
@@ -304,16 +304,16 @@ mentioned above.
Execution on CPU device
##########################

As mentioned on :ref:`Composability of different threading runtimes <Composability_of_different_threading_runtimes>`, OpenVINO default threading runtime
oneTBB keeps CPU cores actively for a while after inference done. When using Optimum Intel Python API,
it will call Torch (via HF transformers) for postprocessing (for example beam search or gready search).
Torch uses OpenMP for threading, OpenMP will need to wait for CPU cores which are being kept actively by
oneTBB. OpenMP by default has the `busy-wait <https://gcc.gnu.org/onlinedocs/libgomp/GOMP_005fSPINCOUNT.html>`__ which can delay the next OpenVINO inference as well.
As mentioned in the :ref:`Composability of different threading runtimes <Composability_of_different_threading_runtimes>` section, OpenVINO's default threading runtime,
oneTBB, keeps CPU cores active for a while after inference is done. When you use the Optimum Intel Python API,
it calls Torch (via HF transformers) for postprocessing, such as beam search or greedy search.
Torch uses OpenMP for threading, and OpenMP needs to wait for the CPU cores that oneTBB keeps active.
By default, OpenMP uses `busy-wait <https://gcc.gnu.org/onlinedocs/libgomp/GOMP_005fSPINCOUNT.html>`__, which can delay the next OpenVINO inference as well.

The recommendation
It is recommended to:

* Limit the CPU thread number of Torch. `torch.set_num_threads <https://pytorch.org/docs/stable/generated/torch.set_num_threads.html>`__
* Set environment variable `OMP_WAIT_POLICY <https://gcc.gnu.org/onlinedocs/libgomp/OMP_005fWAIT_005fPOLICY.html>`__ to PASSIVE which will disable OpenMP `busy-wait <https://gcc.gnu.org/onlinedocs/libgomp/GOMP_005fSPINCOUNT.html>`__
* Limit the number of CPU threads used by Torch with `torch.set_num_threads <https://pytorch.org/docs/stable/generated/torch.set_num_threads.html>`__.
* Set the environment variable `OMP_WAIT_POLICY <https://gcc.gnu.org/onlinedocs/libgomp/OMP_005fWAIT_005fPOLICY.html>`__ to ``PASSIVE``, which disables OpenMP `busy-wait <https://gcc.gnu.org/onlinedocs/libgomp/GOMP_005fSPINCOUNT.html>`__.
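
The two recommendations above can be applied from Python roughly as follows. This is a minimal sketch, not part of the original commit: the thread count of 4 and the helper name ``configure_torch_threads`` are hypothetical and should be adapted to the target machine. It assumes the OpenMP runtime reads ``OMP_WAIT_POLICY`` when it initializes, so the variable is set before Torch is imported.

```python
import os

# Set the wait policy before Torch (and therefore its OpenMP runtime) is loaded.
# Assumption: the runtime reads this variable at initialization time.
os.environ["OMP_WAIT_POLICY"] = "PASSIVE"

# Hypothetical value; choose it based on the cores left for postprocessing.
TORCH_THREADS = 4


def configure_torch_threads(num_threads: int) -> bool:
    """Limit Torch intra-op CPU threads; return False if Torch is not installed."""
    try:
        import torch
    except ImportError:
        return False
    torch.set_num_threads(num_threads)
    return True


configured = configure_torch_threads(TORCH_THREADS)
```

Setting the variable in the shell that launches the process (for example, ``OMP_WAIT_POLICY=PASSIVE python app.py``, where ``app.py`` is a placeholder) avoids any dependence on import order.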

Additional Resources
#####################