@@ -138,8 +138,6 @@ the approximate factorizations.

> cmake ../ -DSTRUMPACK_USE_CUDA=ON

- Enabled by default, but turned off if CUDA cannot be found.
-
CMake will look for the CUDA compiler and libraries (cuBLAS and
cuSOLVER) in the default location or at CUDAToolkit_ROOT, which can be
set as
@@ -151,6 +149,26 @@ For full GPU support in the distributed memory sparse direct solver,
one should also compile with support for SLATE, see below.


+
+ HIP (optional)
+ ==============
+ To enable support for HIP in the sparse solver:
+
+ > export HIP_DIR=...
+ > cmake ../ \
+       -DSTRUMPACK_USE_HIP=ON \
+       -DCMAKE_HIP_ARCHITECTURES=gfx90a \
+       -DCMAKE_CXX_COMPILER=hipcc
+
+ In the above, adjust HIP_DIR, the GPU architecture
+ (CMAKE_HIP_ARCHITECTURES), and the HIP compiler (CMAKE_CXX_COMPILER)
+ to match your system.
+
+ For full GPU support in the distributed memory sparse direct solver,
+ one should also compile with support for SLATE with the HIP backend,
+ see below.
+
+
+
SLATE (optional) for GPU accelerated ScaLAPACK
==============================================
SLATE is a modern ScaLAPACK alternative, bringing support for GPU
@@ -162,6 +180,16 @@ Support for SLATE in STRUMPACK can be enabled with for instance:
> -DTPL_SLATE_INCLUDE_DIRS="$SLATEHOME/include/;$SLATEHOME/blaspp/include;$SLATEHOME/lapackpp/include" \
> -DTPL_SLATE_LIBRARIES="$SLATEHOME/lib/libslate_scalapack_api.so;$SLATEHOME/lib/libslate.so;$SLATEHOME/blaspp/lib/libblaspp.so;$SLATEHOME/lapackpp/lib/liblapackpp.so"

+ Alternatively, if you define SLATE_DIR to point to the SLATE, blas++,
+ and lapack++ installation directories, STRUMPACK's CMake should find
+ them, and you do not need to specify TPL_SLATE_INCLUDE_DIRS and
+ TPL_SLATE_LIBRARIES.
+
+ Note that SLATE requires MPI_THREAD_MULTIPLE, so you need to
+ initialize MPI with MPI_Init_thread, with the required argument set
+ to MPI_THREAD_MULTIPLE.
+
+


ParMETIS and (PT)Scotch (optional)
==================================