[Feature Request] Add GPU optimizations #291

Open
acxz opened this issue May 6, 2020 · 10 comments
Labels
feature New proposed feature

Comments

@acxz
Contributor

acxz commented May 6, 2020

Feature

It would be great if a GPU-accelerated sparse matrix library could be incorporated into GTSAM. Candidates include AMD ROCm's rocSPARSE and NVIDIA CUDA's cuSPARSE.

hipSPARSE is a wrapper around both rocSPARSE and cuSPARSE that lets either one be used as the backend, depending on whether ROCm or CUDA is installed.

Motivation

Gotta go fast! NVIDIA mentions that cuSPARSE provides up to a 5x speedup over CPU alternatives; see https://developer.nvidia.com/cusparse

Pitch

Having the option to build GTSAM with hipSPARSE would allow GPU-accelerated sparse matrix calculations on both AMD and NVIDIA devices. This could provide a substantial speedup.

Alternatives

N/A

Additional context

It is interesting to note that @dongjing3309's minisam has the option to use CUDA for its sparse linear solver.

@ProfFan
Collaborator

ProfFan commented May 6, 2020

I took a look at minisam; I think they are using cuSPARSE for the Cholesky decomposition step. I think this would be very easy to add to the current codebase, as the only piece of code that needs to be modified is the Cholesky solver (plus some CMake-fu). It would be interesting to benchmark the speedup on some really large datasets.
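
To make the "only the Cholesky solver changes" idea concrete, here is a minimal sketch of a solver hidden behind an interface, so that a GPU backend could be dropped in without touching callers. The names `LinearSolver` and `DenseCholeskySolver` are hypothetical and are not GTSAM's or minisam's API; the CPU backend uses a small dense Cholesky factorization purely for illustration, whereas a real backend would work on sparse matrices.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical interface (not GTSAM's API): anything that solves A x = b
// for a symmetric positive-definite matrix A.
struct LinearSolver {
  virtual ~LinearSolver() = default;
  // A is row-major and n x n; b has n entries.
  virtual std::vector<double> solve(const std::vector<double>& A,
                                    const std::vector<double>& b) const = 0;
};

// CPU reference backend: dense Cholesky A = L L^T, then two triangular solves.
// A GPU backend would implement the same interface.
struct DenseCholeskySolver : LinearSolver {
  std::vector<double> solve(const std::vector<double>& A,
                            const std::vector<double>& b) const override {
    const std::size_t n = b.size();
    assert(A.size() == n * n);

    // Factorize column by column into the lower-triangular factor L.
    std::vector<double> L(n * n, 0.0);
    for (std::size_t j = 0; j < n; ++j) {
      double d = A[j * n + j];
      for (std::size_t k = 0; k < j; ++k) d -= L[j * n + k] * L[j * n + k];
      if (d <= 0.0) throw std::runtime_error("matrix is not positive definite");
      L[j * n + j] = std::sqrt(d);
      for (std::size_t i = j + 1; i < n; ++i) {
        double s = A[i * n + j];
        for (std::size_t k = 0; k < j; ++k) s -= L[i * n + k] * L[j * n + k];
        L[i * n + j] = s / L[j * n + j];
      }
    }

    // Forward substitution: L y = b.
    std::vector<double> y(n);
    for (std::size_t i = 0; i < n; ++i) {
      double s = b[i];
      for (std::size_t k = 0; k < i; ++k) s -= L[i * n + k] * y[k];
      y[i] = s / L[i * n + i];
    }

    // Back substitution: L^T x = y.
    std::vector<double> x(n);
    for (std::size_t i = n; i-- > 0;) {
      double s = y[i];
      for (std::size_t k = i + 1; k < n; ++k) s -= L[k * n + i] * x[k];
      x[i] = s / L[i * n + i];
    }
    return x;
  }
};
```

Calling `DenseCholeskySolver().solve({4, 2, 2, 3}, {2, 1})` solves the 2x2 system 4x + 2y = 2, 2x + 3y = 1. Code written against `LinearSolver` would not need to change if a cuSPARSE- or hipSPARSE-backed implementation were added later.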

@dellaert
Member

dellaert commented May 7, 2020 via email

@varunagrawal
Collaborator

Perhaps Nvidia have improved their cuSparse library significantly by now.

@acxz
Contributor Author

acxz commented Jun 7, 2020

Came across this: https://developer.nvidia.com/cholmod

Seems like CHOLMOD already has GPU capability built in; we just need a flag to turn it on.

@ProfFan
Collaborator

ProfFan commented Jun 7, 2020

Yes that is on the roadmap :) Should be easy once #111 is merged.

@ProfFan
Collaborator

ProfFan commented Jun 9, 2020

@acxz I have added some other solvers like CHOLMOD and cuSparse in #111. However, their performance is not great compared to GTSAM's MULTIFRONTAL_CHOLESKY. There should be a lot of room for optimization, though I don't have the bandwidth for this currently.
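
Solver comparisons like this are easier to reproduce with a small timing harness. The following is a minimal sketch, not the benchmark actually used in #111; `medianSeconds` is a hypothetical helper that times any callable with `std::chrono::steady_clock` and reports the median over several runs.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper: runs `work` `repeats` times and returns the median
// wall-clock time in seconds (the upper median when `repeats` is even).
template <typename Work>
double medianSeconds(Work&& work, std::size_t repeats) {
  assert(repeats > 0);
  std::vector<double> samples;
  samples.reserve(repeats);
  for (std::size_t i = 0; i < repeats; ++i) {
    const auto start = std::chrono::steady_clock::now();
    work();  // e.g. one full linear solve with the backend under test
    const auto stop = std::chrono::steady_clock::now();
    samples.push_back(std::chrono::duration<double>(stop - start).count());
  }
  std::sort(samples.begin(), samples.end());
  return samples[samples.size() / 2];
}
```

Each backend would be wrapped in a lambda and timed on the same problem. For a GPU backend, the callable should also wait for the device to finish before returning, since kernel launches are asynchronous and the timer would otherwise stop too early.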

@ProfFan
Collaborator

ProfFan commented Jun 9, 2020

BTW, CHOLMOD's GPU support is somewhat exaggerated: it only supports GPU cholesky with integers, not doubles.

@acxz
Contributor Author

acxz commented Jun 9, 2020

> added some other solvers like CHOLMOD and cuSparse

Nice! I am sure the optimizations can come later.

> CHOLMOD's GPU support is somewhat exaggerated: it only supports GPU cholesky with integers, not doubles.

rip, makes me sad

@ProfFan
Collaborator

ProfFan commented Jun 9, 2020

> rip, makes me sad

I burst into tears after wasting one day building a custom SuiteSparse with GPU support and figuring out how to make it use the GPU...

varunagrawal added the "feature" (New proposed feature) label Jul 10, 2020

@varunagrawal
Collaborator

I spent some time looking into ROCm this past weekend and it looks to be ideal for tackling this issue in a transparent way.


4 participants