-
Context shifting is handled automatically by the library. For example, if you have a context size of 100K tokens and all of it gets filled with chat history, the oldest parts of the history are shifted out to make room for new tokens. I have tests for this implementation that seem to pass, but just to make sure, I've tested it again manually and it appears to work as expected.
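For example, something like this should just keep working without any manual history management (a minimal sketch against the node-llama-cpp v3 API; the model path is a placeholder):

```typescript
import {getLlama, LlamaChatSession} from "node-llama-cpp";

const llama = await getLlama();
// "model.gguf" is a placeholder; point this at any local GGUF model.
const model = await llama.loadModel({modelPath: "model.gguf"});

// Deliberately small context so the history fills up quickly.
const context = await model.createContext({contextSize: 1024});
const session = new LlamaChatSession({contextSequence: context.getSequence()});

// Prompt well past 1024 tokens of accumulated history; context shifting
// should transparently evict the oldest messages instead of failing.
for (let i = 0; i < 50; i++) {
    const answer = await session.prompt(`Tell me a short story, part ${i + 1}.`);
    console.log(answer);
}
```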
-
Maybe there's an issue with small context sizes, then. If you use a context of 1024 and try to pass in around 1200 tokens worth of content, it hangs for, as I mentioned, an indefinite amount of time; I've waited as long as 30 minutes to rule out it just being slow. I tested with different batch sizes to see if that was the issue, but didn't have any luck there. Is there some reason for there to be a minimum context size for this to work?
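For reference, here's roughly the setup that hangs for me (a sketch only; the model path and filler prompt are placeholders, and the batch size is just one of the values I tried):

```typescript
import {getLlama, LlamaChatSession} from "node-llama-cpp";

const llama = await getLlama();
const model = await llama.loadModel({modelPath: "model.gguf"}); // placeholder path

// Small context on purpose, to trigger the hang.
const context = await model.createContext({contextSize: 1024, batchSize: 512});
const session = new LlamaChatSession({contextSequence: context.getSequence()});

// Build a prompt that's larger than the whole context (~1200 tokens).
const longPrompt = "some filler sentence to pad things out. ".repeat(200);
console.log("prompt tokens:", model.tokenize(longPrompt).length);

// With contextSize 1024, this call never resolves on my machine.
const answer = await session.prompt(longPrompt);
console.log(answer);
```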
-
From this page in the documentation I get the impression that chat history is automatically truncated to fit the contextSize when using LlamaChat or LlamaChatSession. However, when I try to add more messages than fit in the current contextSize, it seems to hang indefinitely. I'm not sure if that's a bug, or if the user is meant to truncate the chat messages themselves so they don't exceed the contextSize.
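If manual truncation is the intended approach, I imagine it would look something like this sketch (assuming getChatHistory()/setChatHistory() work the way I think they do; the keep-the-last-N heuristic is made up for illustration):

```typescript
import {LlamaChatSession, ChatHistoryItem} from "node-llama-cpp";

// Hypothetical helper: keep the system prompt plus only the most
// recent messages, dropping older ones so the prompt stays small.
function truncateHistory(
    history: ChatHistoryItem[],
    maxItems: number
): ChatHistoryItem[] {
    const system = history.filter((item) => item.type === "system");
    const rest = history.filter((item) => item.type !== "system");
    return [...system, ...rest.slice(-maxItems)];
}

async function promptWithTruncation(session: LlamaChatSession, text: string) {
    session.setChatHistory(truncateHistory(session.getChatHistory(), 8));
    return await session.prompt(text);
}
```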