cuda : add f32 to bf16 copy op #1182

Closed
wants to merge 1 commit into from

Contributor

@CISC CISC commented Apr 7, 2025

This allows BF16 KV-cache on CUDA.

Full disclosure: I noticed this in the ik_llama.cpp repo, but that repo is not an upstream, and it was a simple feature to add.

@CISC
Contributor Author

CISC commented Apr 7, 2025

Actually, I see this will conflict with the llama.cpp changes that were just made, so I will move this PR there instead.

@CISC CISC closed this Apr 7, 2025
@CISC CISC deleted the cuda-bf16-kv-cache branch April 7, 2025 20:22