Support KV cache quantization, especially for vLLM inference #920

@xin3he
