Hotfix : Add a note about thread oversubscription in AOCL #1472

Merged: 2 commits, Aug 29, 2024

@rkamd commented Aug 29, 2024

resolves # SWDEV-481179

Add a note in rocBLAS documentation about thread oversubscription in AOCL

References:
flame/blis#588
flame/blis#604
flame/blis#630

* Add a note about thread oversubscription in AOCL

* Apply suggestions from code review

Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>

@TorreZuk left a comment

Minor clarification requested

@@ -900,6 +900,12 @@ There are three client executables that can be used with rocBLAS. They are:

These three clients can be built by following the instructions in the Building and Installing section of the User Guide. After building the rocBLAS clients, they can be found in the directory ``rocBLAS/build/release/clients/staging``.

.. note::

   The ``rocblas-bench`` and ``rocblas-test`` executables use AMD's ILP64 version of AOCL-BLAS 4.2 as the host reference BLAS to verify correctness. However, there is a known issue with AOCL-BLAS that can cause these executables to hang. This problem can arise because the AOCL-BLAS library launches multiple threads to perform computations. If the number of threads matches the total number of CPU threads, it can lead to thread oversubscription, causing the program to hang.

Sorry, missed this earlier: "total number of CPU threads" -> "total number of CPU logical cores".
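
For readers who hit the hang described in the note, one possible mitigation is to cap the host BLAS thread count below the number of logical cores before launching a client. The sketch below is illustrative only and is not part of this PR. It assumes AOCL-BLAS honors the ``OMP_NUM_THREADS`` and ``BLIS_NUM_THREADS`` environment variables the way upstream BLIS does, and that leaving one logical core free is enough to avoid oversubscription. The helper names ``capped_thread_count`` and ``reference_blas_env`` are hypothetical.

```python
import os


def capped_thread_count(logical_cores, reserve=1):
    """Return a thread count below logical_cores where possible, never less than 1."""
    return max(1, logical_cores - reserve)


def reference_blas_env(base_env, threads):
    """Copy base_env and set the host BLAS thread count in the copy.

    Assumption: AOCL-BLAS reads these variables as upstream BLIS does.
    """
    env = dict(base_env)
    env["OMP_NUM_THREADS"] = str(threads)
    env["BLIS_NUM_THREADS"] = str(threads)
    return env


if __name__ == "__main__":
    cores = os.cpu_count() or 1  # os.cpu_count() may return None
    env = reference_blas_env(os.environ, capped_thread_count(cores))
    print("Host BLAS threads:", env["OMP_NUM_THREADS"])
```

The resulting ``env`` mapping could then be passed as the ``env`` argument of ``subprocess.run`` when starting ``rocblas-bench`` or ``rocblas-test`` from the staging directory. Whether this fully avoids the hang on a given system is untested here.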

@rkamd merged commit d27f7bd into release-staging/rocm-rel-6.3 Aug 29, 2024
5 of 7 checks passed