
Support for Condensed RotaryEmbeddings #333

Closed
@winglian

Description


It looks like there are various methods of extending context length, such as SuperHOT, NTK-aware scaling, and condensed RoPE. This request is to track support for condensed rotary embeddings, since it seems to have the best performance at long contexts at the moment. A rough sketch of the condensing idea is included below the links.

https://lmsys.org/blog/2023-06-29-longchat/
https://github.com/lm-sys/FastChat/blob/3f0c6e54498e179098ead9a596929e23327ad75c/fastchat/model/llama_condense_monkey_patch.py#L68
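For reference, the core of the LongChat patch is just a rescaling of the position indices before the sin/cos tables are built. Below is a minimal, standalone sketch of that idea in PyTorch; the class name, default ratio, and caching details are illustrative assumptions, not taken verbatim from the linked patch or tied to any particular transformers version:

```python
import torch


class CondensedRotaryEmbedding(torch.nn.Module):
    """Illustrative rotary embedding with linear position interpolation ("condensing")."""

    def __init__(self, dim, max_position_embeddings=2048, base=10000, ratio=4.0):
        super().__init__()
        # ratio is hypothetical here; e.g. 4.0 squeezes 8192 positions into the
        # 0..2048 range the base model was pretrained on.
        self.ratio = ratio
        inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
        self.register_buffer("inv_freq", inv_freq)
        self._build_cache(int(max_position_embeddings * ratio))

    def _build_cache(self, seq_len):
        # Condensing step: divide the position indices by the ratio before
        # computing the sin/cos tables, which interpolates between the integer
        # positions seen during pretraining instead of extrapolating past them.
        t = torch.arange(seq_len, dtype=self.inv_freq.dtype) / self.ratio
        freqs = torch.einsum("i,j->ij", t, self.inv_freq)
        emb = torch.cat((freqs, freqs), dim=-1)
        self.register_buffer("cos_cached", emb.cos(), persistent=False)
        self.register_buffer("sin_cached", emb.sin(), persistent=False)
        self.max_seq_len_cached = seq_len

    def forward(self, x, seq_len):
        # Rebuild the cache if a longer sequence shows up than we cached for.
        if seq_len > self.max_seq_len_cached:
            self._build_cache(seq_len)
        return (
            self.cos_cached[:seq_len].to(dtype=x.dtype),
            self.sin_cached[:seq_len].to(dtype=x.dtype),
        )
```

Per the LongChat blog, their 16K model is fine-tuned with a condensing ratio of 8 (2048 -> 16384). For axolotl this would presumably need the ratio to be configurable and the patch applied to the model's existing rotary embedding module, as the FastChat monkey patch does.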
