Save/Load Just One Sequence

# Feature Description

Would it be possible to add functions that look something like this:
 - `llama_kv_save_seq(struct llama_context * ctx, llama_seq_id seq_id, uint8_t * dst);`
 - `llama_kv_load_seq(struct llama_context * ctx, llama_seq_id seq_id, uint8_t * src);`
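
To make the request concrete, here is a minimal, self-contained sketch of the intended semantics on a toy cache. Everything prefixed `toy_` is hypothetical and is not llama.cpp code: the real KV cache layout is different, and the extra `n_bytes` parameter on the load function is an assumption added so the toy knows how much to read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-ins for illustration only; NOT llama.cpp types. */
typedef int32_t toy_seq_id;

#define TOY_KV_CELLS 8

typedef struct {
    toy_seq_id seq_id; /* -1 marks an empty cell */
    int32_t    pos;    /* token position within its sequence */
    float      value;  /* placeholder for the cell's K/V data */
} toy_kv_cell;

typedef struct {
    toy_kv_cell cells[TOY_KV_CELLS];
} toy_context;

/* Mark every cell as empty. */
static void toy_kv_init(toy_context * ctx) {
    for (size_t i = 0; i < TOY_KV_CELLS; i++) {
        ctx->cells[i] = (toy_kv_cell){ -1, 0, 0.0f };
    }
}

/* Store one cell in the first empty slot; returns 0 on success, -1 if full. */
static int toy_kv_add(toy_context * ctx, toy_seq_id seq_id, int32_t pos, float value) {
    for (size_t i = 0; i < TOY_KV_CELLS; i++) {
        if (ctx->cells[i].seq_id == -1) {
            ctx->cells[i] = (toy_kv_cell){ seq_id, pos, value };
            return 0;
        }
    }
    return -1;
}

/* Copy only the cells owned by seq_id into dst and return the byte count.
 * Passing dst == NULL just reports the required size. */
static size_t toy_kv_save_seq(const toy_context * ctx, toy_seq_id seq_id, uint8_t * dst) {
    size_t written = 0;
    for (size_t i = 0; i < TOY_KV_CELLS; i++) {
        if (ctx->cells[i].seq_id != seq_id) continue;
        if (dst != NULL) memcpy(dst + written, &ctx->cells[i], sizeof(toy_kv_cell));
        written += sizeof(toy_kv_cell);
    }
    return written;
}

/* Replace seq_id's cells with the cells stored in src, leaving every other
 * sequence untouched. Returns the bytes read, or 0 if there is not enough
 * room (in which case the cache is left unmodified). */
static size_t toy_kv_load_seq(toy_context * ctx, toy_seq_id seq_id,
                              const uint8_t * src, size_t n_bytes) {
    assert(seq_id >= 0);
    assert(n_bytes % sizeof(toy_kv_cell) == 0);
    size_t needed = n_bytes / sizeof(toy_kv_cell);

    /* Cells that are empty or already owned by seq_id can be reused. */
    size_t usable = 0;
    for (size_t i = 0; i < TOY_KV_CELLS; i++) {
        if (ctx->cells[i].seq_id == -1 || ctx->cells[i].seq_id == seq_id) usable++;
    }
    if (needed > usable) return 0;

    /* Drop the sequence's current cells. */
    for (size_t i = 0; i < TOY_KV_CELLS; i++) {
        if (ctx->cells[i].seq_id == seq_id) ctx->cells[i].seq_id = -1;
    }

    /* Copy each saved cell into an empty slot, retagging it with seq_id. */
    size_t read = 0;
    for (size_t i = 0; i < TOY_KV_CELLS && read < n_bytes; i++) {
        if (ctx->cells[i].seq_id != -1) continue;
        toy_kv_cell cell;
        memcpy(&cell, src + read, sizeof(cell));
        cell.seq_id = seq_id;
        ctx->cells[i] = cell;
        read += sizeof(cell);
    }
    return read;
}
```

In this sketch a saved sequence can be loaded under a different sequence ID, which is what would let one cached system prompt seed many conversations inside the same batch.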

# Motivation

In llama.cpp it is possible to save and load the _entire_ context state in one operation with `llama_copy_state_data` and `llama_set_state_data`. For example, this could be used to evaluate a large system prompt once, save the resulting state to disk, and then load that state every time a new conversation starts.

However, with batch decoding this isn't really possible. If many sequences are being evaluated at once, you can only save and load them _all_ together.
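
The problem can be illustrated with a small toy model. The `demo_` names below are hypothetical stand-ins, not llama.cpp functions: the "state" is just one token counter per sequence, and the two helpers mimic a whole-state copy and restore in spirit only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_N_SEQ 3

/* Hypothetical stand-in for a context: one progress counter per sequence. */
typedef struct {
    int32_t n_tokens[DEMO_N_SEQ];
} demo_state;

/* Whole-state snapshot: every sequence is copied, wanted or not. */
static size_t demo_copy_state(const demo_state * st, uint8_t * dst) {
    memcpy(dst, st, sizeof(*st));
    return sizeof(*st);
}

/* Whole-state restore: every sequence is overwritten, wanted or not. */
static size_t demo_set_state(demo_state * st, const uint8_t * src) {
    memcpy(st, src, sizeof(*st));
    return sizeof(*st);
}

/* Try to roll back only sequence 0 using the whole-state helpers, and
 * return what sequence 1's counter looks like afterwards. */
static int32_t demo_rollback_side_effect(void) {
    demo_state st = { { 10, 20, 30 } };
    uint8_t snapshot[sizeof(demo_state)];

    demo_copy_state(&st, snapshot);

    st.n_tokens[0] += 5; /* sequence 0 advances; we want to undo this */
    st.n_tokens[1] += 7; /* sequence 1 advances; we want to keep this */

    demo_set_state(&st, snapshot);

    assert(st.n_tokens[0] == 10); /* sequence 0 was rolled back as intended */
    return st.n_tokens[1];        /* but sequence 1 lost its 7 tokens too */
}
```

Restoring the snapshot resets sequence 1 as well, so its work since the snapshot is lost. A per-sequence save/load would avoid that side effect.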


Save/Load Just One Sequence #5843