
Add Option to Evaluate Dynamic Prompts #1694

Open
wants to merge 1 commit into main from add_dynamic_prompts

Conversation

nolan778

Feature: Dynamic Prompts

  • Add support for dynamic prompt evaluation, enabled/disabled through a new option on the Interface UI settings page (disabled by default).
  • This feature may not fit exactly with the main use case of this plugin (live editing with AI), so I have disabled it by default. However, I find myself using this plugin primarily for image generation, due to its ease of use over ComfyUI, and having a robust way to easily generate varied prompts within Krita was important to me.
  • Currently only supports random sampling on variants (in-line), wildcards (.txt files), and variables, with support for various nesting combinations of all three.
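The in-line variant part of that syntax can be sketched roughly like this (a minimal illustration, not the PR's actual implementation): `{a|b|c}` is replaced by one randomly chosen option, resolving innermost variants first so nesting works.

```python
import random
import re

# Minimal sketch of in-line variant evaluation: "{a|b|c}" becomes one
# randomly chosen option. The regex only matches brace pairs with no
# inner braces, so innermost variants resolve first and nesting like
# "{a|{b|c}}" works.
_variant = re.compile(r"\{([^{}]*)\}")

def evaluate_variants(prompt: str, rng: random.Random) -> str:
    while True:
        match = _variant.search(prompt)
        if match is None:
            return prompt
        choice = rng.choice(match.group(1).split("|"))
        prompt = prompt[:match.start()] + choice + prompt[match.end():]
```

Wildcards and variables layer on top of this same replace-until-stable loop.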

Additions

  • New UI Interface setting to enable/disable (off by default)
  • When enabled, both positive and negative prompts (after being merged with style prompt text) are individually parsed for dynamic prompt template syntax (variants, variables, wildcards).
  • If any template syntax is found, it is evaluated by new functions in the text module and the template syntax is replaced by the intended or randomly chosen values.
  • See sd-dynamic-prompts Basic Usage and sd-dynamic-prompts Template Syntax for information on proper syntax and capabilities. I followed this example as closely as possible, including the glob file pattern matching notation.
  • Wildcard directories are:
    • util.plugin_dir / "wildcards" (lowest priority, missing by default, useful if you want to ship any wildcards with the plugin)
    • util.user_data_dir / "wildcards" (higher priority, missing by default, user would create and add files or directory structures of files)
    • custom directories (highest priority, not currently used but could be populated via a future UI element, currently just used for unit testing)
  • The dynamic prompt text evaluation functions are thoroughly unit-tested. Wildcard files used for testing are auto-generated at unit test runtime into tests/data/wildcards and that directory is cleaned up automatically at test completion.
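The directory priority above could be sketched as follows (function and parameter names are stand-ins, not the PR's actual API; `plugin_dir`/`user_data_dir` stand in for `util.plugin_dir`/`util.user_data_dir`):

```python
import random
from pathlib import Path
from typing import Optional

# Hypothetical sketch of wildcard lookup in the priority order described
# above: custom directories first, then the user data directory, then
# the plugin directory.
def resolve_wildcard(name: str, plugin_dir: Path, user_data_dir: Path,
                     custom_dirs: list) -> Optional[Path]:
    search_order = list(custom_dirs) + [user_data_dir / "wildcards",
                                        plugin_dir / "wildcards"]
    for directory in search_order:
        # glob allows patterns like "colors/*" as in sd-dynamic-prompts
        matches = sorted(directory.glob(f"{name}.txt"))
        if matches:
            return matches[0]
    return None

def pick_wildcard_line(path: Path, rng: random.Random) -> str:
    # each non-empty line of a .txt wildcard file is one candidate value
    lines = [l.strip() for l in path.read_text().splitlines() if l.strip()]
    return rng.choice(lines)
```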

Limitations

  • The Job Info for images in the history does not contain the actual final evaluated prompt that was sent to ComfyUI in the CLIPTextEncode node of the workflow, and I was unsure how to capture it. It also isn't straightforward: the evaluation takes place after the style and user prompts are merged, while the job info currently displays only the user prompt, so any template syntax in the style prompt wouldn't be visible anyway. I'm not sure what to do about this, but it would be nice to know the actual prompt used to generate an image you want to refine or recreate.
  • Similarly, even with a fixed seed, the generated image will be different on each generation due to the random replacement of template syntax. This could actually be considered a feature.
  • No JSON or YAML wildcard file support. I could not find any documentation on how the other dynamic prompt plugins for ComfyUI, A1111, or Forge handle these files with their nested structure, and the majority of use cases I have seen use txt files anyway.
  • Cyclic or Combinatorial sampling would require state to persist between job queues, more UI elements, and even require automatic execution of batch queues. This seemed out of scope for a simple feature and was not of particular interest to me.

@nolan778
Author

nolan778 commented Apr 1, 2025

In my rush to create this feature, I missed the existence of the actual dynamicprompts project (MIT licensed) that drives the parsing of the other various plugins and instead recreated my own version from scratch. Here is the handling of structured wildcard files (json and yaml). I'm sure my own implementation is not as robust as this purpose-built library, so I will need to investigate when I get more time.

@Acly
Owner

Acly commented Apr 1, 2025

It's generally not possible to have dependencies for the plugin - it runs inside Krita's embedded python. I also feel like it's perhaps a bit overkill (or a lot) - I can see the usefulness of {a|b|c} and wildcard files, but not sure how many people actually need a template language with variable substitution etc.

Do you use all of it? Or can there be a "90% of the value with 10% of the code" kind of deal?

@Acly
Owner

Acly commented Apr 1, 2025

Regarding where substitution happens, these are the stages:

  1. model.py - collect all the data from the workspace
  2. workflow.py::prepare - client-side processing of the data, results in a WorkflowInput
  3. workflow.py::create - creates a comfyui prompt from WorkflowInput <- currently dynamic prompt eval happens here

Note that WorkflowInput must contain all data which influences the workflow, and step 3 should not access globals (like settings) or the file system. That part will run on a different machine in a cloud scenario.

Dynamic prompt evaluation could be applied in step 2 - this would also allow adding the result to job metadata.

  • That's before merging the style prompt, but I don't think having a dynamic style prompt is useful/important?
  • Needs to evaluate dynamic prompt for all regions
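The stage split above could be sketched like this (all names besides `WorkflowInput`, `prepare`, and `create` are assumptions, not the plugin's real API): evaluation happens client-side in `prepare`, so the resulting `WorkflowInput` already carries the final text, and `create` stays free of randomness, settings, and file-system access.

```python
import random
from dataclasses import dataclass

# Illustrative sketch of the pipeline stages described above.
@dataclass
class WorkflowInput:
    positive: str
    negative: str

def evaluate_dynamic_prompt(prompt: str, rng: random.Random) -> str:
    # stand-in for the real variant/wildcard/variable evaluation
    return prompt.replace("{day|night}", rng.choice(["day", "night"]))

def prepare(user_prompt: str, negative: str, seed: int) -> WorkflowInput:
    # step 2: client-side processing; dynamic prompts evaluated here,
    # so the result can also be recorded in job metadata
    rng = random.Random(seed)
    evaluated = evaluate_dynamic_prompt(user_prompt, rng)
    return WorkflowInput(positive=evaluated, negative=negative)

def create(inputs: WorkflowInput) -> dict:
    # step 3: builds the ComfyUI prompt purely from WorkflowInput,
    # with no access to globals or the file system
    return {"CLIPTextEncode": {"text": inputs.positive}}
```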

@nolan778
Author

nolan778 commented Apr 1, 2025

  • Getting rid of variables would certainly cut down the code and headache, but unfortunately they are very useful. Immediate-evaluation variables, which use the "!" symbol to force evaluation at assignment, let the same randomly chosen word or phrase be reused in multiple places in your prompt. Non-immediate variables are also useful for shortening the visible length of the prompt when you don't want to repeat the same variant list or wildcard file multiple times.
  • For the rest, I wanted the available syntax to be mostly the same as what people expect from the existing dynamic prompt plugins, so they wouldn't have to change their saved prompts or modify their wildcard files to remove nested variables, variants, and references to other wildcard files. I didn't see a major performance issue, but this plugin isn't trying to evaluate the prompt dozens of times per second, and it is probably best to have this off for live mode anyway.
  • If it helps, the dynamic prompt code could be moved to its own module to avoid cluttering the smaller text module.
  • I was unaware that this needed to happen in prepare() instead, but that makes sense and I will move the call(s) there. I agree that we can probably go without a dynamic style prompt, or a style prompt that influences the user prompt and vice versa. We probably would not want that anyway, in case the user turned off the dynamic prompts option and forgot to remove the syntax from their chosen style. Four questions though:
    • The positive prompt is modified a few times. Where exactly in that function should I put it?
    • Why is the negative prompt merged immediately with the style negative, but the positive prompt is not?
    • I never learned what regions are or how to use them. Should they share influence with the main prompt? Would someone put a variable in the main prompt and use it in the regional prompt? Or should the prompts be isolated and evaluated separately? Sharing influence would really complicate it and require multiple calls to all prompts to iterate and pass a variable dictionary back and forth until all prompts stabilized. Probably not what we want to do.
    • Should negative prompts even use dynamic prompts? I'm thinking no. I don't see the use case. These days I pretty much only have a fixed list of negative tokens per model type and models don't seem to honor specific negative tokens related to concepts that they are dead set on depicting anyway.
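The immediate vs non-immediate distinction can be sketched like this, loosely following the sd-dynamic-prompts variable syntax (`${name=!...}` assigns with immediate evaluation, `${name=...}` without, `${name}` references); the implementation below is a simplified illustration, not the PR's code:

```python
import random
import re

# Rough sketch: an immediate variable ("!") is evaluated once at
# assignment, so every later reference reuses the same choice; a
# non-immediate variable stores the raw template and re-evaluates it
# on each reference.
def evaluate(prompt: str, rng: random.Random) -> str:
    variables = {}

    def pick(text: str) -> str:
        # resolve in-line variants like {a|b|c}
        return re.sub(r"\{([^{}]*)\}",
                      lambda m: rng.choice(m.group(1).split("|")), text)

    def assign(m: re.Match) -> str:
        name, bang, value = m.group(1), m.group(2), m.group(3)
        variables[name] = pick(value) if bang else value  # "!" = immediate
        return ""

    # assignments: ${name=!value} or ${name=value} (one nesting level)
    prompt = re.sub(r"\$\{(\w+)=(!?)((?:\{[^{}]*\}|[^{}])*)\}", assign, prompt)
    # references: ${name}; non-immediate values get evaluated here, per use
    prompt = re.sub(r"\$\{(\w+)\}",
                    lambda m: pick(variables[m.group(1)]), prompt)
    return pick(prompt)
```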

@Acly
Owner

Acly commented Apr 1, 2025

The positive prompt is modified a few times. Where exactly in that function should I put it?

I think it's just handling of <lora:...> tags inside the prompt. They shouldn't clash with dynamic prompts, so order doesn't matter, but I'd do dynamic prompts afterwards.
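That ordering could look roughly like this (helper names are hypothetical, not the plugin's actual functions): strip the `<lora:...>` tags from the prompt first, then run dynamic prompt evaluation on the remaining text.

```python
import random
import re

# Sketch of the suggested ordering: extract <lora:name:strength> tags,
# then evaluate dynamic prompt syntax on what remains.
_lora = re.compile(r"<lora:([^:>]+)(?::([\d.]+))?>")

def extract_loras(prompt: str):
    loras = [(m.group(1), float(m.group(2) or 1.0))
             for m in _lora.finditer(prompt)]
    return _lora.sub("", prompt).strip(), loras

def process_prompt(prompt: str, rng: random.Random):
    text, loras = extract_loras(prompt)           # lora tags first
    text = re.sub(r"\{([^{}]*)\}",                # then dynamic prompts
                  lambda m: rng.choice(m.group(1).split("|")), text)
    return text, loras
```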

Why is the negative prompt merged immediately with the style negative, but the positive prompt is not?

In some cases positive prompt can affect only parts of the image while style prompt affects the whole image.

I never learned what regions are or how to use them. Should they share influence with the main prompt?

Regions are just prompts that affect a certain part of the image. No, I don't think state needs to be shared between region prompts.

Should negative prompts even use dynamic prompts?

Personally I don't see the point, negative prompts have limited use. Doesn't stop some people from going ham on them though...

@nolan778
Author

nolan778 commented Apr 1, 2025

Okay, moving it to prepare() works as you said, but the job info is still showing the full prompt with template syntax instead of the final evaluated text. What would I need to modify to capture it properly in the job history?

EDIT: Found this issue. model.py stores the positive prompt prior to prepare() and uses that stored value for the job info. Should I do this in model.py instead?

EDIT2:

  • There are a lot of complex code interactions between regions and the root prompt that I don't quite understand, plus the added complexity of Generate/Refine vs Live, Upscale, Animation, etc., so I'm not sure model.py is even the right place. It looks like the top of process_regions in regions.py is actually the right place, modifying the root and region prompts before they ever interact, but that has its own problem: it seems to replace the dynamic prompt syntax in the user prompt window with the evaluated text after I click Generate.
  • I'm just stumbling in the dark here, but maybe this kind of feature needs a bit more infrastructure to handle it intuitively. For instance, a way to switch back and forth between the original dynamic prompt and the evaluated prompt, so that Generate could use either the non-evaluated full prompt or the evaluated prompt of the last applied top layer (for creating variations with the same prompt, or for refinement). Similarly, Live mode should probably never use the original prompt and should use the evaluated prompt of the layer you are refining. I'm sure it's not that simple, as I don't use all the features of the plugin and can't see the edge cases.

@nolan778 nolan778 force-pushed the add_dynamic_prompts branch from 2f3d56e to 18678be on April 2, 2025 11:00
@nolan778
Author

nolan778 commented Apr 2, 2025

Okay, I think I finally got it working well in process_regions. I added another string member to the Region and RootRegion classes to avoid modifying the original positive property directly and forcing a UI update with the evaluated text. Then in process_regions, I evaluate the root and all regions and I use the evaluated positive property instead, making sure to clear the evaluation before returning from the function, so that the next generation starts from scratch.
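A sketch of that approach (the class and field names below are stand-ins for the plugin's actual Region/RootRegion classes, not its real API): the evaluated text lives in a separate member so the UI-bound positive property is never overwritten, and it is cleared before process_regions returns so the next generation starts from scratch.

```python
import random
import re
from dataclasses import dataclass

@dataclass
class Region:
    positive: str
    positive_evaluated: str = ""  # extra member, never shown in the UI

    @property
    def effective_positive(self) -> str:
        # evaluated text takes precedence when present
        return self.positive_evaluated or self.positive

def evaluate(prompt: str, rng: random.Random) -> str:
    # stand-in for the full variant/wildcard/variable evaluation
    return re.sub(r"\{([^{}]*)\}",
                  lambda m: rng.choice(m.group(1).split("|")), prompt)

def process_regions(root: Region, regions: list, rng: random.Random) -> list:
    all_regions = [root, *regions]
    for region in all_regions:
        region.positive_evaluated = evaluate(region.positive, rng)
    try:
        # downstream workflow construction would read effective_positive here
        return [r.effective_positive for r in all_regions]
    finally:
        for region in all_regions:
            region.positive_evaluated = ""  # reset for the next generation
```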
