Add max_context_length to TextEncode node for LLM max tokens - experimental use #289
This PR adds `max_context_length` as a parameter (default of 256, as before) to allow additional tokens to be used for the LLM input prompt + prompt template + special tokens.

- `max_context_length` is now a node input with a default of 256: you can set it in the `DownloadAndLoadHyVideoTextEncoder` node (see the sketch below for how such an input is declared).
- Setting `max_context_length` less than or equal to the crop_start value no longer errors out; it is handled with a warning (see the testing notes below).
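For reference, here is a minimal sketch of how an `INT` node input with these bounds is typically declared in ComfyUI. This is illustrative only, not the PR's actual diff; the class body is a placeholder and the real `DownloadAndLoadHyVideoTextEncoder` node defines many other inputs.

```python
class DownloadAndLoadHyVideoTextEncoder:
    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                # Extra token budget for the input prompt + prompt template + special tokens.
                # Default matches the previous hard-coded 256; capped at 8192, the model
                # config's maximum context window, to avoid accidental way-too-high values.
                "max_context_length": ("INT", {"default": 256, "min": 1, "max": 8192}),
                # ... the node's other inputs are omitted here ...
            }
        }
```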
Testing Performed:

- `max_context_length` = 256, long prompt: works as it did before, prompt gets truncated, no errors.
- `max_context_length` = 5, long prompt: used to error out, now runs with a warning and truncates.
- `max_context_length` = 512, long prompt > 512 tokens: works, prompt gets truncated (with a log message; a rough sketch of this truncation behavior follows below).
- `max_context_length` = 2048, long prompt of ~1700 tokens: works, prompt goes through fully.

In short: this makes the text encoder more flexible and easier to experiment with for longer input prompts, and it opens the door for few-shot examples. It is highly likely that the model was trained with shorter inputs, probably right around the previous default of 256. That said, autoregressive LLMs are always surprising, and a longer prompt with examples may well land in the same embedding space as a shorter one would. Reminder: changing this from the 256 default is experimental. The value is capped at the model config's maximum context window (8192) to avoid accidentally setting it way too high, and it may be unstable at larger values, so test with caution. VRAM usage will also increase if you raise it - only for a short burst, but if you are right on the edge of maxing out your VRAM, leave a little headroom and increase only incrementally. Hopefully this opens up some cool possibilities. Thanks to Kijai, as always, for making a fantastic project for us all to experiment with.
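As a rough illustration of the truncate-and-warn behavior noted in the testing above (not the PR's actual code; the helper name and log wording are assumptions), the core idea with a standard Hugging Face tokenizer looks roughly like this:

```python
import logging

log = logging.getLogger(__name__)

def encode_prompt(tokenizer, text, max_context_length=256):
    # `tokenizer` is any Hugging Face tokenizer (e.g. the LLM text encoder's tokenizer).
    # Count the untruncated tokens first so we can warn about how much will be cut off.
    n_tokens = len(tokenizer(text).input_ids)
    if n_tokens > max_context_length:
        log.warning(
            "Prompt is %d tokens but max_context_length is %d; truncating the excess.",
            n_tokens, max_context_length,
        )
    # Truncate to the configured budget instead of raising an error.
    return tokenizer(text, truncation=True, max_length=max_context_length, return_tensors="pt")
```

The real node additionally has to account for the prompt template's crop_start tokens, which is where the crop_start edge case mentioned above comes in.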
edit: removed token counting due to time constraints / complexity + a bug with custom prompt templates.