It would be great if you could add support for the LongChat models (from FastChat), which have a 16k context length: https://github.com/lm-sys/FastChat https://github.com/lm-sys/FastChat/blob/6d06351542bc0c3701d54619e6df4c26aa91a260/fastchat/model/llama_condense_monkey_patch.py#L10C18-L10C18
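
For context, the linked monkey patch extends the context window by interpolating RoPE positions: position indices are divided by a condensation ratio so that, e.g., 16k positions are squeezed into the 2k range the model was pretrained on. Below is a minimal sketch of that idea, assuming a LLaMA-style rotary embedding; the class and parameter names are illustrative, not FastChat's exact API.

```python
import torch


class CondensedRotaryEmbedding(torch.nn.Module):
    """Rotary embedding whose position indices are divided by `ratio`,
    fitting `ratio` times more tokens into the pretrained position range."""

    def __init__(self, dim, ratio=8, max_position_embeddings=2048, base=10000):
        super().__init__()
        # Standard RoPE inverse frequencies.
        inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
        self.register_buffer("inv_freq", inv_freq)
        # The cache covers the extended window (e.g. 2048 * 8 = 16384 positions).
        self.max_seq_len_cached = max_position_embeddings * ratio
        # The key trick: dividing positions by the ratio maps the extended
        # range back into [0, max_position_embeddings), so position 16383
        # becomes 2047.875 -- still inside the pretrained range.
        t = torch.arange(self.max_seq_len_cached, dtype=inv_freq.dtype) / ratio
        freqs = torch.einsum("i,j->ij", t, inv_freq)
        emb = torch.cat((freqs, freqs), dim=-1)
        self.register_buffer("cos_cached", emb.cos()[None, None, :, :], persistent=False)
        self.register_buffer("sin_cached", emb.sin()[None, None, :, :], persistent=False)

    def forward(self, x, seq_len=None):
        # Return the cached cos/sin tables truncated to the current sequence length.
        return (
            self.cos_cached[:, :, :seq_len].to(x.device),
            self.sin_cached[:, :, :seq_len].to(x.device),
        )
```

Supporting LongChat would presumably amount to swapping this condensed embedding in for the stock rotary embedding when loading those checkpoints, as the FastChat patch does for `LlamaRotaryEmbedding`.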