Save/Load Just One Sequence #5843

Closed
@martindevans

Feature Description

Would it be possible to add functions that look something like this:

  • llama_kv_save_seq(struct llama_context * ctx, llama_seq_id seq_id, uint8_t * dst);
  • llama_kv_load_seq(struct llama_context * ctx, llama_seq_id seq_id, uint8_t * src);
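To make the requested semantics concrete, here is a minimal, self-contained sketch using a toy cache rather than llama.cpp itself. Every `toy_*` name, the cell layout, and the byte format are hypothetical and exist only for illustration; nothing here reflects how llama.cpp actually stores its KV cache. The idea shown is simply: serialize only the cells owned by one sequence, then restore them under a (possibly different) sequence id without touching the others.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-in for a KV cache. NOT llama.cpp's real data structures. */
#define TOY_N_CELLS 8
#define TOY_NO_SEQ  (-1)

typedef int32_t toy_seq_id;

typedef struct {
    toy_seq_id seq_id; /* owning sequence, or TOY_NO_SEQ if the cell is free */
    int32_t    pos;    /* token position within the sequence */
    float      kv;     /* placeholder for the cell's key/value data */
} toy_cell;

typedef struct {
    toy_cell cells[TOY_N_CELLS];
} toy_ctx;

void toy_init(toy_ctx *ctx) {
    for (size_t i = 0; i < TOY_N_CELLS; i++) {
        ctx->cells[i].seq_id = TOY_NO_SEQ;
        ctx->cells[i].pos = 0;
        ctx->cells[i].kv = 0.0f;
    }
}

/* Store one cell for seq_id in the first free slot; -1 if the cache is full. */
int toy_add(toy_ctx *ctx, toy_seq_id seq_id, int32_t pos, float kv) {
    for (size_t i = 0; i < TOY_N_CELLS; i++) {
        if (ctx->cells[i].seq_id == TOY_NO_SEQ) {
            ctx->cells[i].seq_id = seq_id;
            ctx->cells[i].pos = pos;
            ctx->cells[i].kv = kv;
            return 0;
        }
    }
    return -1;
}

size_t toy_seq_count(const toy_ctx *ctx, toy_seq_id seq_id) {
    size_t n = 0;
    for (size_t i = 0; i < TOY_N_CELLS; i++) {
        if (ctx->cells[i].seq_id == seq_id) n++;
    }
    return n;
}

/* Bytes needed to save one sequence: a count header plus its cells. */
size_t toy_kv_seq_size(const toy_ctx *ctx, toy_seq_id seq_id) {
    return sizeof(uint32_t) + toy_seq_count(ctx, seq_id) * sizeof(toy_cell);
}

/* Copy only the cells of seq_id into dst; returns bytes written. */
size_t toy_kv_save_seq(const toy_ctx *ctx, toy_seq_id seq_id, uint8_t *dst) {
    uint32_t n = (uint32_t) toy_seq_count(ctx, seq_id);
    uint8_t *p = dst;
    memcpy(p, &n, sizeof n);
    p += sizeof n;
    for (size_t i = 0; i < TOY_N_CELLS; i++) {
        if (ctx->cells[i].seq_id == seq_id) {
            memcpy(p, &ctx->cells[i], sizeof(toy_cell));
            p += sizeof(toy_cell);
        }
    }
    return (size_t)(p - dst);
}

/* Replace the contents of seq_id with the saved cells, leaving all other
 * sequences untouched. Returns bytes read, or 0 if there is not enough room. */
size_t toy_kv_load_seq(toy_ctx *ctx, toy_seq_id seq_id, const uint8_t *src) {
    uint32_t n;
    memcpy(&n, src, sizeof n);

    /* Usable slots: free cells plus cells seq_id already owns. */
    size_t avail = toy_seq_count(ctx, TOY_NO_SEQ) + toy_seq_count(ctx, seq_id);
    if (n > avail) return 0;

    for (size_t i = 0; i < TOY_N_CELLS; i++) {
        if (ctx->cells[i].seq_id == seq_id) ctx->cells[i].seq_id = TOY_NO_SEQ;
    }

    const uint8_t *p = src + sizeof n;
    size_t slot = 0;
    for (uint32_t k = 0; k < n; k++) {
        while (slot < TOY_N_CELLS && ctx->cells[slot].seq_id != TOY_NO_SEQ) slot++;
        assert(slot < TOY_N_CELLS); /* guaranteed by the avail check above */
        memcpy(&ctx->cells[slot], p, sizeof(toy_cell));
        ctx->cells[slot].seq_id = seq_id; /* re-tag under the target sequence */
        p += sizeof(toy_cell);
    }
    return (size_t)(p - src);
}
```

In this toy, a caller would size a buffer with `toy_kv_seq_size`, fill it with `toy_kv_save_seq`, and later call `toy_kv_load_seq` on any context with enough free cells. A size query like this is not part of the two prototypes above; it is added here only because the caller needs some way to know how large `dst` must be.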

Motivation

In llama.cpp it is possible to save and load the entire context state in one operation with llama_copy_state_data and llama_set_state_data. For example, this could be used to evaluate a large system prompt once, save it to disk, and then load the state every time a new conversation is started.

However, with batch decoding this isn't really possible. If many sequences are being evaluated at once, you can only save and load all of them together; there is no way to save or restore a single sequence on its own.

Labels: enhancement (New feature or request)
