Add Self-Extend support? #1242

Open

Description

@theaerotoad

I've been really enjoying using both llama-cpp-python and the original llama.cpp. These are amazing developments, especially for folks without massively powerful GPUs.

There's a really nice feature that was implemented in llama.cpp in January to allow self-extend (à la LongLM's approach). It works well in both llama.cpp's main and server examples, and plenty of folks have noted self-extend is especially useful with Mistral/Mixtral, Gemma, and Phi-2. A rough transliteration of the mechanism is sketched below.
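For reference, the self-extend logic in llama.cpp's main example boils down to a small group-attention update applied to the KV cache during decoding. Here is a rough, untested sketch of that loop carried over to llama-cpp-python's low-level bindings. It assumes your build exposes `llama_kv_cache_seq_add` and `llama_kv_cache_seq_div` (older llama.cpp versions name the first one `llama_kv_cache_seq_shift`), and the helper name `self_extend_shift` is mine:

```python
# Sketch of the Self-Extend group-attention update from llama.cpp's
# examples/main/main.cpp, transliterated to the low-level bindings.
# Untested; function availability depends on your llama.cpp/llama-cpp-python
# version.
import llama_cpp

def self_extend_shift(ctx, n_past, ga_i, ga_n, ga_w, seq_id=0):
    """Compress KV-cache positions by grouping them, as in Self-Extend.

    ga_n -- group-attention factor (e.g. 4)
    ga_w -- group-attention width, a multiple of ga_n (e.g. 2048)
    Returns the updated (n_past, ga_i).
    """
    while n_past >= ga_i + ga_w:
        ib = (ga_n * ga_i) // ga_w
        bd = (ga_w // ga_n) * (ga_n - 1)
        dd = (ga_w // ga_n) - ib * bd - ga_w

        # shift, divide (group positions), shift back -- mirrors main.cpp
        llama_cpp.llama_kv_cache_seq_add(ctx, seq_id, ga_i, n_past, ib * bd)
        llama_cpp.llama_kv_cache_seq_div(ctx, seq_id, ga_i + ib * bd,
                                         ga_i + ib * bd + ga_w, ga_n)
        llama_cpp.llama_kv_cache_seq_add(ctx, seq_id, ga_i + ib * bd + ga_w,
                                         n_past + ib * bd, dd)
        n_past -= bd
        ga_i += ga_w // ga_n
    return n_past, ga_i
```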

It appears someone else may have asked about this earlier here. Right now, I'm having to move in and out of Python when I want to run summarization on a 'just-slightly-too-long' article with self-extend. Would you consider implementing self-extend as an option in llama-cpp-python? Something along the lines of the sketch below is what I have in mind.
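To be concrete, here is a purely hypothetical illustration of what such an option could look like. The `grp_attn_n` / `grp_attn_w` constructor parameters do not exist in llama-cpp-python today; the names are only borrowed from the corresponding llama.cpp CLI flags (`--grp-attn-n`, `--grp-attn-w`):

```python
# Hypothetical API sketch -- these constructor parameters do NOT exist in
# llama-cpp-python today; names mirror llama.cpp's --grp-attn-n/--grp-attn-w.
from llama_cpp import Llama

llm = Llama(
    model_path="mistral-7b-instruct.Q4_K_M.gguf",
    n_ctx=16384,       # window extended beyond the model's trained length
    grp_attn_n=4,      # hypothetical: group-attention factor
    grp_attn_w=2048,   # hypothetical: group-attention width
)

with open("article.txt") as f:
    article = f.read()

out = llm("Summarize the following article:\n\n" + article, max_tokens=512)
print(out["choices"][0]["text"])
```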

Metadata

Labels

enhancement (New feature or request)
