"sd_textencoder_cache_size": OptionInfo(4, "Text encoder cache size", gr.Slider, {"minimum": 0, "maximum": 16, "step": 1}),
"sd_textencder_linebreak": OptionInfo(True, "Use line break as prompt segment marker", gr.Checkbox),
"diffusers_zeros_prompt_pad": OptionInfo(False, "Use zeros for prompt padding", gr.Checkbox),
"prompt_detailed_tokens": OptionInfo(False, "Show detailed token counts", gr.Checkbox),
i don't think we need this as a tunable. if it works, it should be on and that's it.
| try: | ||
| try: | ||
| ids = getattr(tokenizer(prompt_list), 'input_ids', []) |
this assumes that the tokenizer accepts a list as input. that's the only thing i'm not sure holds for all the different tokenizers, but it's ok for now and can be changed later if needed.
| for p in prompt_list: | ||
| ids.append(getattr(tokenizer(p), 'input_ids', [])) | ||
| except Exception as e: | ||
| shared.log.warning("Token counter:", e) |
this should be guarded with warn_once so it's not flooding the log?
(similar to how preview reports the first error but then stays silent)
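A minimal sketch of the warn-once guard suggested above, assuming a plain `logging` logger rather than the project's `shared.log`. The `warn_once` helper, the `_warned` registry, and `count_tokens` are hypothetical names for illustration and are not taken from the PR.

```python
import logging

log = logging.getLogger("token_counter")
_warned = set()  # hypothetical module-level registry of keys already reported


def warn_once(key: str, message: str) -> bool:
    """Log `message` the first time `key` is seen; return False and stay silent afterwards."""
    if key in _warned:
        return False
    _warned.add(key)
    log.warning(message)
    return True


def count_tokens(tokenizer, prompt_list):
    """Return one token count per prompt, or an empty list if tokenization fails."""
    try:
        return [len(getattr(tokenizer(p), "input_ids", [])) for p in prompt_list]
    except Exception as e:
        # Only the first failure reaches the log; repeats are suppressed
        warn_once("token_counter", f"Token counter: {e}")
        return []
```

Keying the registry on a short string rather than on the full message means that repeated failures with slightly different error text are still reported only once.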
minor nitpicks in comments. see the check in sdnext/modules/processing_prompt.py, lines 85 to 90 in c9e21a5
Yeah, it's a little confusing to untangle which instances are valid for |
in any case where the prompt parser is not native (so either fixed, a1111 or compel), do not trigger your logic. let's not go into total edge cases. sdnext has native, and native is what matters. what i highlighted is that
- I'm assuming there won't be much of a difference in performance. If it ends up being too slow, this can always be reverted.
This reverts commit e5fdb00.
Reverted to using only manual looping because I found the definition of valid inputs according to
def _is_valid_text_input(t):
if isinstance(t, str):
# Strings are fine
return True
elif isinstance(t, (list, tuple)):
# List are fine as long as they are...
if len(t) == 0:
# ... empty
return True
elif isinstance(t[0], str):
# ... list of strings
return True
elif isinstance(t[0], (list, tuple)):
# ... list with an empty list or with a list of strings
return len(t[0]) == 0 or isinstance(t[0][0], str)
else:
return False
else:
        return False
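The manual looping mentioned above can be sketched as follows. This is an illustration only: `tokenize_each` and `toy_tokenizer` are hypothetical names, and the toy tokenizer simply splits on whitespace in place of a real model tokenizer. Passing one string at a time means the tokenizer never has to accept a list.

```python
from types import SimpleNamespace


def tokenize_each(tokenizer, prompt_list):
    """Tokenize prompts one at a time so the tokenizer only ever receives a plain str."""
    ids = []
    for p in prompt_list:
        ids.append(getattr(tokenizer(p), "input_ids", []))
    return ids


def toy_tokenizer(text):
    # Stand-in for a real tokenizer (assumption): accepts a single str only
    if not isinstance(text, str):
        raise TypeError("toy_tokenizer expects a single string")
    return SimpleNamespace(input_ids=text.split())


segments = ["a photo of a cat", "high detail"]
counts = [len(i) for i in tokenize_each(toy_tokenizer, segments)]
print(counts)  # [5, 2]
```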
BREAK and, if enabled, line break).
PR created as draft to await feedback regarding other tokenizers (I don't have any models set up that use different tokenizers, or available free space at the moment).