llama : KV cache view API + better KV cache management #4170

Merged 4 commits on Nov 23, 2023

Commits on Nov 22, 2023

  1. 79cb8f0
  2. llama : zero KV cache used upon clear

    ggml-ci
    ggerganov committed Nov 22, 2023
    671f639

Commits on Nov 23, 2023

  1. llama : allow exporting a view of the KV cache (#4180)

    * Allow exporting a view of the KV cache
    
    * Allow dumping the sequences per cell in common
    
    * Track max contiguous cells value and position as well
    
    * Fix max contiguous empty cells index calculation
    
    Make dump functions deal with lengths or sequences counts > 10 better
    
    * Fix off by one error in dump_kv_cache_view
    
    * Add doc comments for KV cache view functions
    
    Eliminate cell sequence struct; use llama_seq_id directly
    
    Minor cleanups
    KerfuffleV2 committed Nov 23, 2023
    5df7d06
  2. f8e9f11
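
Commit 5df7d06 mentions tracking the "max contiguous cells value and position" and fixing the index calculation for the longest run of empty cells. The following is a minimal, standalone sketch of that kind of scan, not the code from the PR: the `MaxRun` struct, the `max_contiguous_empty` function, and the per-cell sequence-count input are all hypothetical names chosen for illustration. It assumes a cell is empty when no sequence occupies it.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical result type: longest run of empty cells and where it starts.
struct MaxRun {
    int32_t length; // number of cells in the longest empty run (0 if none)
    int32_t start;  // index of the first cell of that run (-1 if none)
};

// seqs_per_cell[i] is the number of sequences occupying cell i (assumption:
// 0 means the cell is empty). A single pass tracks the current empty run and
// updates the best run inside the loop, so a run that reaches the last cell
// is counted without a separate post-loop check.
MaxRun max_contiguous_empty(const std::vector<int32_t> & seqs_per_cell) {
    MaxRun best = {0, -1};
    int32_t run_len   = 0;
    int32_t run_start = -1;

    for (int32_t i = 0; i < (int32_t) seqs_per_cell.size(); ++i) {
        if (seqs_per_cell[i] == 0) {
            if (run_len == 0) {
                run_start = i; // a new empty run begins here
            }
            ++run_len;
            if (run_len > best.length) {
                best = {run_len, run_start};
            }
        } else {
            run_len = 0; // an occupied cell ends the current run
        }
    }
    return best;
}
```

Updating the best run inside the loop rather than only when a run ends is one way to avoid the kind of off-by-one and index mistakes the commit messages describe; ties are resolved in favor of the earliest run.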