[REFRACTOR] Remove Backend.CUDA and Backend.CUDA_OLD #165

Merged

ZX-ModelCloud (Contributor)

No description provided.

@Qubitium Qubitium changed the title remove Backend.CUDA and Backend.CUDA_OLD [REFRACTOR] Remove Backend.CUDA and Backend.CUDA_OLD Jul 4, 2024
@Qubitium Qubitium marked this pull request as ready for review July 4, 2024 08:08
Qubitium (Contributor) commented Jul 4, 2024

As of July 2024, neither cuda nor cuda-old has a good use case. Both perform much worse than the exllama and exllama v2 kernels. Their only saving grace is support for more bit widths, but 99.9% of cases use 4-bit. We will not spend time maintaining compatibility for 4 kernels when we can just pick the fastest 2.
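For readers following the rationale, here is a minimal sketch of what narrowing backend selection to two kernels could look like. It is not GPTQModel's actual code: the names `KernelBackend`, `REMOVED_BACKENDS`, and `resolve_backend` are hypothetical, and the 4-bit-only restriction on the remaining kernels is an assumption inferred from the comment above.

```python
from enum import Enum


class KernelBackend(Enum):
    """Hypothetical enum of the kernels kept after this PR (not the real API)."""
    EXLLAMA = "exllama"
    EXLLAMA_V2 = "exllama_v2"


# Backend names dropped by this PR, taken from the PR title.
REMOVED_BACKENDS = {"cuda", "cuda_old"}


def resolve_backend(name: str, bits: int = 4) -> KernelBackend:
    """Map a backend name to a kept kernel, rejecting removed ones."""
    key = name.lower()
    if key in REMOVED_BACKENDS:
        raise ValueError(
            f"backend '{key}' was removed; use 'exllama' or 'exllama_v2'"
        )
    # Assumption: the remaining kernels handle only 4-bit quantization.
    if bits != 4:
        raise ValueError(f"{bits}-bit is not supported by the remaining kernels")
    # Enum lookup by value raises ValueError for unknown names.
    return KernelBackend(key)
```

Under these assumptions, `resolve_backend("exllama_v2")` returns `KernelBackend.EXLLAMA_V2`, while `resolve_backend("cuda")` fails early with a message that points to the supported kernels.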

@Qubitium Qubitium merged commit 6f1eb58 into ModelCloud:main Jul 4, 2024
DeJoker pushed a commit to DeJoker/GPTQModel that referenced this pull request Jul 19, 2024
* remove Backend.CUDA and Backend.CUDA_OLD

* fix unit test

* remove cuda_64/ and cuda_256/

2 participants