OpenCL seems to almost work #48

Open
Happenedtostumblein opened this issue Sep 5, 2023 · 12 comments

Comments

@Happenedtostumblein

Happenedtostumblein commented Sep 5, 2023

@leejet @Green-Sky @ggerganov

I do not know C++ and do not have a solid grasp of how ggml works, but building the repo with cmake -DGGML_CLBLAST=ON seems to work: GPU utilization goes up and it's very fast (10 s vs. 80 s per step on a higher-end CPU). It completes all the steps, and sampling too, but then crashes at line 1505 of ggml-opencl.

If it is just a matter of spending time to make this work, is it simple enough for one of you to explain what needs to be done? If so, I'd be happy to give it a shot, but I don't know where to start.

My limited understanding is that sampling is what takes all the effort, so is there maybe a way to switch from the GPU to the CPU just to save the file? Or am I missing some context/knowledge?

Edit: Fixed typo. Flag used is clblast, not openblas.

@ggerganov
Contributor

Try this patch: ggerganov/llama.cpp@6460f75

@Happenedtostumblein
Author

Happenedtostumblein commented Sep 5, 2023

@ggerganov That worked, thank you!

Is it proper protocol to submit a pull request for a one-liner?

Edit: FYI: It allows the entire process to complete, but does not actually make use of the GPU.

@Happenedtostumblein
Author

FYI: It does work, but GPU utilization is very low. Got any more simple speedups in your pocket? @ggerganov

@daniandtheweb
Contributor

daniandtheweb commented Sep 5, 2023

I'm sorry to disappoint you, but OpenBLAS doesn't use the GPU to accelerate the processing; it runs on the CPU itself. If anything, you should try -DGGML_CLBLAST=ON to use OpenCL, but it still wouldn't work, as the developer hasn't integrated any GPU acceleration into the program yet.

@Happenedtostumblein
Author

@daniandtheweb Thanks for pointing that out…it was a typo, and the CLBLAST flag is what I was referring to.

How difficult/time-consuming a task is it going to be to incorporate OpenCL? With that flag, the GPU does get some kind of signal, because utilization increases.

Just wondering if it’s a very involved process, or if we just need to copy/paste something from llama and/or ggml?

@daniandtheweb
Contributor

I'm no expert in OpenCL, but it will require some time; it's not just a copy/paste. The good news is that, given the current RAM usage, the GPU acceleration will probably be one of the more memory-efficient ones.

@Happenedtostumblein
Author

@daniandtheweb Can you tell me, broadly speaking, what tasks need to be completed, like I'm five?

Maybe CodeLlama can help me contribute a pull request to get it done, but I need a thread to grab onto.
(Not sure if tagging is necessary; I'm new to GitHub.)

@daniandtheweb
Contributor

As I said, I don't know a lot about how the OpenCL implementation works, but you would probably have to implement each compute kernel of the stock CPU code in OpenCL. You can take a look at llama.cpp's implementation, but you will need to make lots of tweaks to the code to make it work with this project.

@Happenedtostumblein
Author

Happenedtostumblein commented Sep 6, 2023

No problem, hold my beer.

<<only really knows python>>

@FNsi

FNsi commented Sep 24, 2023

> Try this patch: ggerganov/llama.cpp@6460f75

I can confirm that it really works!

@rayrayraykk

rayrayraykk commented Nov 10, 2023

> [quotes the original post in full]

Using OpenCL on Android, it gets slower. What device are you using?
[image attachment]

@superkuh

superkuh commented Dec 26, 2023

I applied the patch and then added some #ifdef SD_USE_CLBLAST / #include "ggml-opencl.h" guards, edited the CMakeLists file with the CLBlast bits from llama.cpp ported over and renamed/re-pointed, then configured with cmake .. -DGGML_OPENBLAS=ON -DGGML_CLBLAST=ON. The compiled ./sd now recognizes my AMD RX 580 GPU and I get about a 30% speed-up. Not a huge increase, since that's the same number of CPU threads plus the GPU, but my GPU is pretty old too. And it does seem to take some load off the CPU, which is nice. Thanks!
