Apply suggestions from code review

Co-authored-by: Tatiana Savina <tatiana.savina@intel.com>
openvinotoolkit · Oct 20, 2024 · 6fea405
1 parent fe6b3e7
Showing 1 changed file with 8 additions and 8 deletions.
diff --git a/docs/articles_en/learn-openvino/llm_inference_guide/llm-inference-hf.rst b/docs/articles_en/learn-openvino/llm_inference_guide/llm-inference-hf.rst
@@ -304,16 +304,16 @@ mentioned above.
 Execution on CPU device
 ##########################
 
-As mentioned on :ref:`Composability of different threading runtimes <Composability_of_different_threading_runtimes>`, OpenVINO default threading runtime
-oneTBB keeps CPU cores actively for a while after inference done. When using Optimum Intel Python API,
-it will call Torch (via HF transformers) for postprocessing (for example beam search or gready search).
-Torch uses OpenMP for threading, OpenMP will need to wait for CPU cores which are being kept actively by
-oneTBB. OpenMP by default has the `busy-wait <https://gcc.gnu.org/onlinedocs/libgomp/GOMP_005fSPINCOUNT.html>`__ which can delay the next OpenVINO inference as well.
+As mentioned in the :ref:`Composability of different threading runtimes <Composability_of_different_threading_runtimes>` section, OpenVINO's default threading runtime,
+oneTBB, keeps CPU cores active for a while after inference is done. The Optimum Intel Python API
+calls Torch (via HF transformers) for postprocessing, such as beam search or greedy search.
+Torch uses OpenMP for threading, and OpenMP needs to wait for CPU cores that are kept active by
+oneTBB. By default, OpenMP uses `busy-wait <https://gcc.gnu.org/onlinedocs/libgomp/GOMP_005fSPINCOUNT.html>`__, which can delay the next OpenVINO inference as well.
 
-The recommendation
+It is recommended to:
 
-* Limit the CPU thread number of Torch. `torch.set_num_threads <https://pytorch.org/docs/stable/generated/torch.set_num_threads.html>`__
-* Set environment variable `OMP_WAIT_POLICY <https://gcc.gnu.org/onlinedocs/libgomp/OMP_005fWAIT_005fPOLICY.html>`__ to PASSIVE which will disable OpenMP `busy-wait <https://gcc.gnu.org/onlinedocs/libgomp/GOMP_005fSPINCOUNT.html>`__
+* Limit the number of CPU threads used by Torch with `torch.set_num_threads <https://pytorch.org/docs/stable/generated/torch.set_num_threads.html>`__.
+* Set the environment variable `OMP_WAIT_POLICY <https://gcc.gnu.org/onlinedocs/libgomp/OMP_005fWAIT_005fPOLICY.html>`__ to ``PASSIVE``, which disables OpenMP `busy-wait <https://gcc.gnu.org/onlinedocs/libgomp/GOMP_005fSPINCOUNT.html>`__.
 
 Additional Resources
 #####################
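
Reviewer note: the two recommendations in the "Execution on CPU device" hunk could be illustrated with a short snippet. The following is a minimal sketch, not part of this commit. It assumes that the OpenMP runtime reads `OMP_WAIT_POLICY` when it initializes, so the variable is set before Torch is imported, and the thread count of 4 is a hypothetical value to be tuned per machine.

```python
import os

# Assumption: OpenMP reads OMP_WAIT_POLICY at runtime initialization,
# so set it before importing torch. PASSIVE asks waiting threads to
# yield the CPU instead of busy-waiting.
os.environ["OMP_WAIT_POLICY"] = "PASSIVE"

# Hypothetical value for illustration; tune it for your workload.
TORCH_THREADS = 4

try:
    import torch
except ImportError:
    torch = None  # Torch is not installed; nothing else to configure.

if torch is not None:
    # Limit the number of CPU threads Torch uses for intra-op parallelism.
    torch.set_num_threads(TORCH_THREADS)
    print("Torch threads:", torch.get_num_threads())
```

Setting the variable in the shell before launching Python (for example, `OMP_WAIT_POLICY=PASSIVE python script.py`) should have the same effect and avoids depending on import order.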