Closed
Description
I am running mpt-7b-instruct, which should support "arbitrary long contexts". However, when I provide the model with long inputs, I get this warning (and the model does not produce any output):
Token indices sequence length is longer than the specified maximum sequence length for this model (3622 > 2048). Running this sequence through the model will result in indexing errors
WARNING 07-12 15:03:35 scheduler.py:194] Input prompt (3622 tokens) is too long and exceeds limit of 2560
What is this scheduler limit? My understanding is that it is not related to the model itself. How can I change it?
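For context on what I have tried: the 2560 limit appears to come from the engine/scheduler configuration rather than the model config, so I would expect it to be adjustable at startup. A sketch of what I assume the fix looks like, using vLLM's `--max-num-batched-tokens` engine argument (the exact flag and its interaction with the model's context window may differ across vLLM versions, so treat this as an assumption to verify against your installed version's docs):

```shell
# Hypothetical invocation: raise the scheduler's per-batch token budget
# so prompts longer than the default limit are accepted. The value 8192
# is an arbitrary example, not a recommendation.
python -m vllm.entrypoints.api_server \
    --model mosaicml/mpt-7b-instruct \
    --max-num-batched-tokens 8192
```

The first warning (3622 > 2048) seems to come from the Hugging Face tokenizer's `model_max_length`, which is a separate check from the scheduler limit, so raising the scheduler budget alone may not silence it.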