Closed
Description
Looks like there are various methods of extending context length, such as SuperHOT, NTK-aware scaling, and condensed RoPE. This request is to track support for condensed rotary embeddings, as it seems to have the best performance at long contexts at the moment.
https://lmsys.org/blog/2023-06-29-longchat/
https://github.com/lm-sys/FastChat/blob/3f0c6e54498e179098ead9a596929e23327ad75c/fastchat/model/llama_condense_monkey_patch.py#L68
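For reference, the core idea in the linked monkey patch is linear position interpolation: position indices are divided by a condensing ratio so a longer sequence maps into the model's original trained position range. A minimal sketch of that idea (the `ratio` parameter and `rope_angles` helper here are illustrative, not the patch's actual API):

```python
import math

def rope_angles(position, dim, base=10000.0, ratio=1.0):
    """Rotary embedding angles for a single position index.

    Condensed ("linearly interpolated") RoPE divides the position by
    `ratio`, so with ratio=4 a model trained on 2048 positions can
    address 8192 positions without the angles leaving the trained range.
    """
    pos = position / ratio  # ratio > 1 condenses positions
    return [pos / (base ** (2 * i / dim)) for i in range(dim // 2)]

# With ratio=4, position 8192 yields the same angles as
# position 2048 does without condensing.
assert rope_angles(8192, 128, ratio=4.0) == rope_angles(2048, 128)
```

In the actual patch this scaling is applied inside the model's rotary embedding forward pass rather than as a standalone function.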