
feature: support for exllama and AutoGPTQ #796

Closed
@mudler

Description


Discussed in #763

Originally posted by yarray July 17, 2023
Although llama.cpp now supports GPU acceleration via cuBLAS, exllama appears to run several times faster given a capable enough GPU (a 3090, for example). Is there any plan to support exllama or, more generally, other loaders for LLMs?
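For readers unfamiliar with these loaders, below is a minimal sketch of what loading a GPTQ-quantized checkpoint through AutoGPTQ's Python API looks like; the checkpoint name and generation settings are illustrative placeholders, not something specified in this issue.

```python
# Minimal sketch: running a GPTQ-quantized model with AutoGPTQ.
# The checkpoint name is a placeholder; substitute any GPTQ model.
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

model_id = "TheBloke/Llama-2-7B-GPTQ"  # illustrative checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoGPTQForCausalLM.from_quantized(model_id, device="cuda:0")

prompt = "What is exllama?"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda:0")
output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Supporting a loader like this would mean exposing an equivalent code path as a backend, alongside the existing llama.cpp one.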
