clip : offload to GPU

With the recent support for running convolutions on the GPU (#4060), we should be able to offload CLIP so that it runs fully on the GPU.

https://github.com/ggerganov/llama.cpp/blob/3d68f364f15778dc326f5024f2e5af1ad6dfddef/examples/llava/clip.cpp#L231-L236

- Implement `ggml_acc` CUDA / Metal kernels
- Avoid `ggml_repeat` where possible by relying on broadcasting instead
- Use the new `ggml-backend` API (see https://github.com/ggerganov/ggml/blob/master/examples/gpt-2/main-backend.cpp)
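
To illustrate the `ggml_repeat` point, here is a minimal sketch in plain C++. It does not use the ggml API: `add_with_repeat` and `add_with_broadcast` are hypothetical helpers, and the row-major layout with a per-row bias is an assumption made for the example. The first version materializes the repeated tensor before adding, the second indexes the small operand directly and so avoids the extra buffer and copy.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of ggml: copies `bias` once per row into a
// temporary buffer (what an explicit repeat would do), then adds elementwise.
std::vector<float> add_with_repeat(const std::vector<float>& x,
                                   const std::vector<float>& bias,
                                   std::size_t rows) {
    const std::size_t cols = bias.size();
    assert(x.size() == rows * cols);

    // Materialized copy of the bias, one per row: rows * cols extra floats.
    std::vector<float> repeated(rows * cols);
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            repeated[r * cols + c] = bias[c];
        }
    }

    std::vector<float> out(rows * cols);
    for (std::size_t i = 0; i < out.size(); ++i) {
        out[i] = x[i] + repeated[i];
    }
    return out;
}

// Hypothetical helper, not part of ggml: same result, but the bias is
// broadcast by indexing it with the column only, so no temporary is needed.
std::vector<float> add_with_broadcast(const std::vector<float>& x,
                                      const std::vector<float>& bias,
                                      std::size_t rows) {
    const std::size_t cols = bias.size();
    assert(x.size() == rows * cols);

    std::vector<float> out(rows * cols);
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            out[r * cols + c] = x[r * cols + c] + bias[c];
        }
    }
    return out;
}
```

Both helpers return the same values for the same inputs. On a GPU backend the broadcast form also saves a kernel launch and the memory traffic for the temporary, which is the motivation for dropping the repeat where the add can broadcast.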

clip : offload to GPU #4061
