Low performance of rocHPL compiled with Clang #12

Open
Felloty opened this issue Aug 21, 2024 · 0 comments
Felloty commented Aug 21, 2024

Hello there!

I'm seeing a performance issue when running the rocHPL benchmark built with the ROCm hipcc C++ compiler.

I have compiled two versions of rocHPL with two different compilers:
one with the default GNU 7.5.0 C++ compiler,
and the other with hipcc (Clang 15.0.0) taken from the rocm/5.3.3/hip/bin directory.

For the first version (compiled with GNU), I used the stock install.sh script with the --with-rocm option pointing to the rocm/5.3.3 directory, which also provides rocBLAS, and the --with-mpi option pointing to my previously installed OpenMPI.

The second version (compiled with Clang) was compiled using a modified version of the install.sh script, with
-DCMAKE_CXX_COMPILER=hipcc
in the cmake_common_options variable.

All other install.sh options were the same as for the first version.

When I run both builds on an AMD Radeon Instinct MI50 32G with the same HPL.dat configuration, I get different results.

The hipcc build of rocHPL always achieves lower GFLOPS than the GNU build.

For example, here are the results for different values of the N parameter, with the rest of the HPL.dat file unchanged:

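| N | P | Q | VRAM | GFLOPS (gnu) | GFLOPS (hipcc) |
|---|---|---|------|--------------|----------------|
| 45312 | 1 | 1 | 51% | 3.687e+03 | 2.596e+03 |
| 54912 | 1 | 1 | 74% | 4.143e+03 | 3.181e+03 |
| 62512 | 1 | 1 | 95% | 4.345e+03 | 3.573e+03 |
| 63512 | 1 | 1 | 98% | 4.387e+03 | 3.588e+03 |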
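
To put the gap in numbers, here is a minimal Python sketch that computes how far each hipcc result falls below the matching GNU result. It is not part of rocHPL; the `results` list and the `shortfall_percent` helper are names made up for this illustration, and the figures are copied from the table above.

```python
# GFLOPS figures copied from the results table: (N, GNU build, hipcc build).
results = [
    (45312, 3.687e+03, 2.596e+03),
    (54912, 4.143e+03, 3.181e+03),
    (62512, 4.345e+03, 3.573e+03),
    (63512, 4.387e+03, 3.588e+03),
]


def shortfall_percent(gnu, hipcc):
    """Return how far the hipcc result is below the GNU result, in percent."""
    return (gnu - hipcc) / gnu * 100.0


# Hypothetical helper output: one line per problem size N.
for n, gnu, hipcc in results:
    print(f"N={n}: hipcc is {shortfall_percent(gnu, hipcc):.1f}% below GNU")
```

On these four runs the shortfall is roughly 30% at the smallest N and roughly 18% at the two largest, so the gap narrows as N grows but does not disappear.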

What causes this to happen?
How can I correctly compile rocHPL using hipcc and avoid performance issues during benchmarking?
