Replies: 1 comment
-
No, this is not supported unless you modify the vLLM code.
-
I have a setup with a single large document that many users may have questions about. Currently, I'm concatenating `document || question` as the prompt for each request. This does benefit from the prefix cache, but still incurs some overhead. I wonder if it would be possible to support `cache_reference_to_document || question`, so the reuse is more explicit and the overhead is reduced.
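The layout described above can be sketched in plain Python. This is only an illustration of the prompt structure, not a vLLM API: `build_prompt` and `DOCUMENT` are hypothetical names. The key point is that every request starts with the identical document text, which is exactly the condition vLLM's automatic prefix caching needs to reuse the KV cache across requests.

```python
import os

# Illustrative placeholder for the large shared document (hypothetical name).
DOCUMENT = "Background: " + "some long reference material. " * 50

def build_prompt(document: str, question: str) -> str:
    # The shared document comes first, so all prompts share a common prefix
    # and only the question suffix differs between requests.
    return f"{document}\n\nQuestion: {question}\nAnswer:"

prompts = [
    build_prompt(DOCUMENT, q)
    for q in ["What is the main topic?", "Who is the intended audience?"]
]

# Every prompt starts with the identical document text, so the KV cache
# computed for the document portion can be reused across requests.
shared_prefix = os.path.commonprefix(prompts)
assert shared_prefix.startswith(DOCUMENT)
```

With this layout, enabling vLLM's prefix caching (e.g. `enable_prefix_caching=True` in the engine arguments) should let the document portion be computed once and reused; there is no explicit cache-handle API, so the reuse remains implicit, keyed on the matching token prefix.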