Add optional detailed token counts by awsr · Pull Request #4657 · vladmandic/sdnext

awsr · 2026-02-20T11:32:26Z

Add option to also show token counts split up by segment markers (BREAK and, if enabled, line break).
- Detailed display format is `[#, #, #] {sum}/{max_length}`
Minor changes:
- Change "tokenizer busy" state display to "--/--"
- Add error state display "??/{max_length}"
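
A minimal sketch of how the display states listed above could be rendered. The function name `format_token_counts`, its parameters, and the choice to drop the bracketed list when there is only one segment are assumptions for illustration, not the PR's actual code.

```python
from typing import Optional, Sequence


def format_token_counts(
    counts: Optional[Sequence[int]],
    max_length: int,
    busy: bool = False,
    detailed: bool = True,
) -> str:
    """Render the token counter text (hypothetical helper)."""
    if busy:
        # Tokenizer is busy: nothing meaningful to show yet.
        return "--/--"
    if counts is None:
        # Tokenization failed: the total is unknown, the limit is still known.
        return f"??/{max_length}"
    total = sum(counts)
    if detailed and len(counts) > 1:
        # Per-segment counts, then the overall sum against the limit.
        per_segment = ", ".join(str(c) for c in counts)
        return f"[{per_segment}] {total}/{max_length}"
    return f"{total}/{max_length}"


print(format_token_counts([12, 30, 5], 77))  # [12, 30, 5] 47/77
print(format_token_counts(None, 77))         # ??/77
print(format_token_counts([], 77, busy=True))  # --/--
```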

PR created as draft to await feedback regarding other tokenizers (I don't have any models set up that use different tokenizers, or available free space at the moment).

vladmandic · 2026-02-20T19:13:27Z

modules/shared.py

    "sd_textencoder_cache_size": OptionInfo(4, "Text encoder cache size", gr.Slider, {"minimum": 0, "maximum": 16, "step": 1}),
    "sd_textencder_linebreak": OptionInfo(True, "Use line break as prompt segment marker", gr.Checkbox),
    "diffusers_zeros_prompt_pad": OptionInfo(False, "Use zeros for prompt padding", gr.Checkbox),
+    "prompt_detailed_tokens": OptionInfo(False, "Show detailed token counts", gr.Checkbox),


i don't think we need this as a tunable. if it works, it should be on and that's it.

vladmandic · 2026-02-20T19:16:10Z

modules/ui_common.py

+
+        try:
+            try:
+                ids = getattr(tokenizer(prompt_list), 'input_ids', [])


this assumes that the tokenizer accepts a list as input. that's the only thing i'm not sure about across all the different tokenizers, but it's ok for now and can be changed later if needed.
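
One way to hedge against tokenizers that reject list input is to try the batched call first and fall back to one call per segment. The sketch below is an assumption-laden illustration: `count_tokens_per_segment` and both stand-in tokenizers are hypothetical, the stand-ins split on whitespace, and special tokens are ignored.

```python
from types import SimpleNamespace
from typing import Callable, List


def count_tokens_per_segment(tokenizer: Callable, segments: List[str]) -> List[int]:
    """Return one token count per segment, batched when the tokenizer allows it."""
    try:
        # Fast path: hand the whole list to the tokenizer in one call.
        ids = getattr(tokenizer(segments), "input_ids", [])
        if len(ids) != len(segments):
            raise ValueError("tokenizer did not return one entry per segment")
    except Exception:
        # Fallback: tokenize each segment on its own.
        ids = [getattr(tokenizer(s), "input_ids", []) for s in segments]
    return [len(entry) for entry in ids]


# Stand-in tokenizers for the demo (hypothetical, whitespace-based).
def batch_tokenizer(text):
    if isinstance(text, str):
        return SimpleNamespace(input_ids=text.split())
    return SimpleNamespace(input_ids=[t.split() for t in text])


def string_only_tokenizer(text):
    if not isinstance(text, str):
        raise TypeError("this tokenizer only accepts a single string")
    return SimpleNamespace(input_ids=text.split())


segments = ["a photo of a cat", "sitting on a sofa"]
print(count_tokens_per_segment(batch_tokenizer, segments))        # [5, 4]
print(count_tokens_per_segment(string_only_tokenizer, segments))  # [5, 4]
```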

vladmandic · 2026-02-20T19:17:11Z

modules/ui_common.py

+                for p in prompt_list:
+                    ids.append(getattr(tokenizer(p), 'input_ids', []))
+        except Exception as e:
+            shared.log.warning("Token counter:", e)


this should be guarded with warn_once so it's not flooding the log?
(similar to how preview reports first error but then stays silent)
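
A standalone sketch of the "warn once" idea, using only the standard `logging` module. The `warn_once` helper defined here, its `key` parameter, and the logger name are assumptions for illustration; this is not the sdnext helper and its signature may differ. The exception is passed through a `%s` placeholder, which is what standard `logging` expects for extra arguments.

```python
import logging

log = logging.getLogger("token_counter")  # hypothetical logger name
_warned = set()  # keys that have already produced a warning


def warn_once(key: str, message: str, *args) -> bool:
    """Log a warning the first time `key` is seen; stay silent afterwards.

    Returns True if the warning was emitted, False if it was suppressed.
    """
    if key in _warned:
        return False
    _warned.add(key)
    log.warning(message, *args)
    return True


# Simulate a tokenizer that fails on every UI update.
for _ in range(3):
    try:
        raise RuntimeError("tokenizer failed")
    except Exception as e:
        # Only the first iteration reaches the log.
        warn_once("token-counter", "Token counter: %s", e)
```

Keying the guard by a string rather than by message text keeps the log quiet even when the exception message changes between calls.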

vladmandic · 2026-02-20T19:18:15Z

minor nitpicks in comments.
the big one is that the entire concept of BREAK only works on some models, and only if enhanced prompt parsing is enabled, so it's a mistake to present a broken-down token count if that is not the case.

see check in

sdnext/modules/processing_prompt.py, lines 85 to 90 in c9e21a5

    if (prompt_attention != 'fixed') and ('Onnx' not in cls) and ('prompt' not in p.task_args) and (
        ('StableDiffusion' in cls) or
        ('StableCascade' in cls) or
        ('Flux' in cls and 'Flux2' not in cls) or
        ('Chroma' in cls) or
        ('HiDreamImagePipeline' in cls)

awsr · 2026-02-22T22:35:31Z

Yeah, it's a little confusing to untangle which instances are valid for BREAK because processing_prompt.py makes calls to prompt_parser_diffusers.py which also makes calls to prompt_parser.py. For example, parse_prompt_attention in prompt_parser.py includes a check to replace instances of \n with BREAK based on the setting. I've got no idea if that's because other parsers automatically do that or if BREAK doesn't work with the other parsers. I'm still looking into it.

vladmandic · 2026-02-23T05:19:13Z

any case where the prompt parser is not native (so either fixed, a1111 or compel), do not trigger your logic. let's not go into total edge cases. sdnext has native and native is what matters.

what i highlighted is that BREAK only has meaning for some MODEL TYPES, not just if parser is set to native or not.
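
The two conditions discussed in this thread (native parser, and a model type for which BREAK has meaning) could be combined into a single gate for the detailed display. This is a partial sketch under stated assumptions: `show_detailed_counts` and `parser_is_native` are hypothetical names, the model markers are only those visible in the quoted excerpt from `processing_prompt.py` (which is cut off, so the real list may be longer), and the `'prompt' not in p.task_args` condition is left out.

```python
# Class-name fragments visible in the quoted check; the real list may be longer.
BREAK_AWARE_MARKERS = ("StableDiffusion", "StableCascade", "Chroma", "HiDreamImagePipeline")


def show_detailed_counts(pipeline_cls: str, parser_is_native: bool) -> bool:
    """Decide whether per-segment token counts should be shown (hypothetical)."""
    if not parser_is_native:
        # fixed, a1111 and compel parsers never get the detailed display.
        return False
    if "Onnx" in pipeline_cls:
        return False
    if "Flux" in pipeline_cls and "Flux2" not in pipeline_cls:
        return True
    return any(marker in pipeline_cls for marker in BREAK_AWARE_MARKERS)


print(show_detailed_counts("StableDiffusionXLPipeline", True))  # True
print(show_detailed_counts("Flux2Pipeline", True))              # False
print(show_detailed_counts("StableDiffusionPipeline", False))   # False
```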

awsr · 2026-02-28T04:21:19Z

Reverted the change to use only manual looping, because I found the definition of valid inputs according to tokenizers (and, by extension, transformers), and a list of strings is accepted:

        def _is_valid_text_input(t):
            if isinstance(t, str):
                # Strings are fine
                return True
            elif isinstance(t, (list, tuple)):
                # List are fine as long as they are...
                if len(t) == 0:
                    # ... empty
                    return True
                elif isinstance(t[0], str):
                    # ... list of strings
                    return True
                elif isinstance(t[0], (list, tuple)):
                    # ... list with an empty list or with a list of strings
                    return len(t[0]) == 0 or isinstance(t[0][0], str)
                else:
                    return False
            else:
                return False

awsr added 2 commits February 20, 2026 02:27

Add optional detailed token counts

7028018

Minor cleanup

2ad95ed

vladmandic reviewed Feb 20, 2026

awsr added 2 commits February 27, 2026 18:06

Use manual looping

e5fdb00

- I'm assuming there won't be much of a difference in performance. If it ends up being too slow, this can always be reverted.

Revert "Use manual looping"

6acae60

This reverts commit e5fdb00.

Only enable detailed with native prompt parsing

8655913

awsr wants to merge 5 commits into vladmandic:dev from awsr:detailed-token-counts
