
Decouple number of OpenBLAS threads and max number of OpenMP threads #2985

Closed
@Flamefire

The number of threads used by OpenBLAS can be set via openblas_set_num_threads. For the "custom thread" solution this works quite well: regardless of what the application does, OpenBLAS uses the configured number of threads.

However, for OpenMP this is not the case. An application might want OpenBLAS to use 4 of 16 threads while it uses OpenMP itself to schedule other work, or it might want to run 4 OpenBLAS operations in parallel, each using 4 threads (whether the latter is possible is up to the runtime, but the first use case should work). Another use case: OpenBLAS should use only 4 threads (e.g. for performance reasons, the usual matrix size, ...), but the application wants to use OpenMP with all 16 threads at other times, i.e. not in parallel with OpenBLAS.

Now OpenBLAS does something nasty: it takes the maximum number of OpenMP threads and sets its own maximum number of threads to that value. So it is impossible to have OpenBLAS use fewer threads than the OpenMP maximum.
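To make the effect concrete, here is a minimal model of the behaviour described above. It is not OpenBLAS source: `blas_state_model` and `model_threads_overwriting` are hypothetical names used only for illustration, and the OpenMP maximum is passed in as a plain integer so the sketch compiles without OpenMP or OpenBLAS.

```c
#include <assert.h>

/* Hypothetical stand-in for the library's internal thread limit. */
typedef struct {
    int blas_threads; /* value last requested by the application */
} blas_state_model;

/* Models the behaviour described in the issue: if the stored limit differs
   from the OpenMP maximum, it is overwritten with the OpenMP maximum. */
static int model_threads_overwriting(blas_state_model *s, int omp_max_threads) {
    assert(s != 0 && omp_max_threads > 0);
    if (s->blas_threads != omp_max_threads) {
        s->blas_threads = omp_max_threads; /* the caller's setting is lost */
    }
    return s->blas_threads;
}
```

In this model, an application that asked for 4 threads while OpenMP allows 16 ends up with 16, and the requested value of 4 is gone afterwards.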

In code the problem is 2-fold:

So as a first fix I'd suggest making num_cpu_avail return the lesser of blas_cpu_number and openmp_nthreads instead of setting anything.
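A rough sketch of that suggestion, written as a standalone function rather than as a patch: `num_cpu_avail_sketch` is a hypothetical name, and both limits are passed in as parameters here, which is an assumption made so the example is self-contained. The point is only that both values are read and neither is modified.

```c
#include <assert.h>

/* Sketch of the proposed rule: return the smaller of the two limits
   and leave both of them untouched. */
static int num_cpu_avail_sketch(int blas_cpu_number, int openmp_nthreads) {
    assert(blas_cpu_number > 0 && openmp_nthreads > 0);
    return blas_cpu_number < openmp_nthreads ? blas_cpu_number
                                             : openmp_nthreads;
}
```

With a requested limit of 4 and an OpenMP maximum of 16 this yields 4, so the application's setting is respected. If OpenMP allows fewer threads than were requested, the OpenMP maximum still caps the result.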
