additional options for image generation #1765
Conversation
For 1, I'm fine with this, but I think that Euler actually makes a good default in all cases - I have noted instability with Euler A sampling even in SDXL models. Is there ANY model that does not work well with Euler? If not, I think we can just use that as a default. In fact, I think that's the default for the webUI and the API if unspecified too. I'm ok with either.

For 2, not really a fan of this approach. The prompt field should be considered "sandboxed" and not be overloadable with anything that affects generation behavior (indeed, that's partially why I didn't want the lora switching there either). If you need additional gen-time parameters, I think it's best to add them to the generation API (which will be well known).
I don't think it's a bad default. I only know about an admittedly very obscure corner case: SDXL DMD2 models (which are intended to be used with LCM) usually work OK with Euler A, but tend to make noisy outputs with Euler.

Aside: along the same lines, I miss a way to reset the interface to "known to be good" default parameters: say, choosing "LCM, 10 steps, CFG 1" for LCM models. But that's a bit too much front-end development for me...

The backend's default itself is Euler A right now (forcibly changed to Euler for SD3/Flux); I don't remember if koboldcpp.py sets its own default. My main argument would be from a least-surprises point of view: if the user knows the "default" is good for mainline sd.cpp, that same string (or not specifying it) will give them similar results on Koboldcpp. The same goes for development on our side: "default" would simply mean "sd.cpp's responsibility" (like it is right now with the default scheduler), even if a new model comes up with completely different parameters for "good" results. So a "library default" would be a bit more future-proof (e.g. for Qwen Image).

The same argument suggests we shouldn't forcibly change Euler A to Euler, because users see "Euler A" on the interface, and could then learn that "Euler A is OK for Flux". So, with either Euler or "sd.cpp default" as the default sampler, I think it's best to disable that forceful change; I can add a commit to this PR to do that.
Fair enough. I don't disagree that the prompt isn't really the best place to do it. It's just very convenient 🙂

Main motivation right now: NitroFusion models are very fast (under 2s on my card), and give much better results than other 1-step models like SDXL Turbo, so they'd be ideal for scene generation in an adventure chat. But they need a different scheduler (like sgm_uniform) and an additional parameter (timestep-shift), which right now can't be specified either at model loading time or on the chat's media settings tab. And the prompt is already there...

The scheduler arguably goes hand-in-hand with the sampler, and could even be specified in the same field if we managed to keep the number of combinations under control ("Euler A sgm_uniform", "Heun SmoothStep", "Euler Karras", etc.). But timestep-shift is both obligatory for NitroFusion models (and the same value doesn't work for different models), and not really needed in most other cases (although it can be specified for non-NitroFusion models, with interesting results). So... how do we specify it on the web interface without making it cluttered for all other uses?

Would you consider OK a specialized "parameter prompt", which only receives white-listed key-value pairs like the code here? I don't care much about the syntax (could be like this, JSON, "KEY:VALUE", even command-line-like), but the whole add-a-separate-control-for-each-field approach simply doesn't scale.
Yeah, if needed I suppose adding an extra "meta" field would work for that. Though, did you need this in the UI as well, or only in the API? Also, would it be possible to simply hard-code good values for each architecture type? (If possible, that would be the easiest option.)
It's almost only useful at the UI level. If we had another way of specifying arbitrary key+value fields on the UI, we could simply make it set different API fields. The only reason for API-level support would be to help simplify the UI for that implementation.

Sometimes. In the NitroFusion case, for instance, the scheduler doesn't matter that much, as long as you choose from a specific subset; in principle, we could just hardcode "sgm_uniform". Problem is, they look like normal SDXL models, so we'd have to find a way to identify them. The timestep-shift value is even worse, since that seemingly depends only on model training. Setting defaults at loading time would work... but then we get the same "UI fields" problem, on Tk 🙂
Just digging a little bit deeper into another point:

The LoRA support is a bit of a worst offender, since it can load arbitrary files (leejet/stable-diffusion.cpp#830) on the server. But for some other cases, I'm not entirely sure. For instance, you can specify … at the prompt level. And if the model understood that, something like … would be even better. And this is not that different from …, as long as we ensure there are no "malicious" values (say, a negative or INFINITY causing a server crash - which is something API fields and token weight parsing already need to handle), and it can't be used for circumventing parsers or filters (a convincing example is a bit hard here, but someone could make use of the parameter removal to smuggle a forbidden word past a filter, like "NS(cfg_scale:10)FW"). And since the result will just fill up some specific image generation fields, the worst that could happen would be a bad generation.

Of course, we may not want the user having access to those fields; this should be processed before that restriction is applied. A separate 'meta' prompt field could help here.
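For concreteness, here is a minimal sketch (not this PR's actual code) of how such whitelisted key-value pairs could be pulled out of a prompt string and sanity-checked. The Perchance-style (PARAMETER:::VALUE) syntax matches the one described in the PR, but the parameter names, bounds, and clamping policy below are illustrative assumptions:

```python
import math
import re

# Hypothetical whitelist: parameter name -> (converter, min, max).
# Names and bounds are illustrative, not the PR's actual values.
PARAM_WHITELIST = {
    "scheduler": (str, None, None),
    "timestep-shift": (float, 0.0, 1000.0),
    "cfg_scale": (float, 1.0, 30.0),
}

PARAM_RE = re.compile(r"\(([\w-]+):::([^)]*)\)")


def extract_gen_params(prompt):
    """Pull whitelisted (key:::value) pairs out of a prompt string.

    Returns the cleaned prompt plus a dict of accepted parameters;
    unknown keys, malformed values and non-finite numbers are dropped.
    """
    params = {}

    def consume(match):
        key, raw = match.group(1), match.group(2).strip()
        spec = PARAM_WHITELIST.get(key)
        if spec is not None:
            conv, low, high = spec
            try:
                value = conv(raw)
            except ValueError:
                value = None
            if isinstance(value, float) and not math.isfinite(value):
                value = None  # reject NaN / INFINITY
            if isinstance(value, float) and low is not None:
                value = min(max(value, low), high)  # clamp to sane bounds
            if value is not None:
                params[key] = value
        return ""  # strip the tag from the prompt either way

    cleaned = PARAM_RE.sub(consume, prompt).strip()
    return cleaned, params


# Example, using the NitroFusion values mentioned in this thread:
# extract_gen_params("a foggy harbor (scheduler:::sgm_uniform) (timestep-shift:::250)")
# -> ("a foggy harbor", {"scheduler": "sgm_uniform", "timestep-shift": 250.0})
```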
Another reason for avoiding generation config in the prompt is also that the input might come from unknown or unsanitized sources. For example, the img prompt could be based on the output of an LLM, or fed forward from some search results. The user might not want such sources affecting gen params (imagine if your LLM could affect step count from the prompt by saying "render 5000 more steps").
(force-pushed 68c4a8f to d3b7138)
I kept the prompt parsing because it's the only way right now to test the new fields, but moved it to the Python level on the last revision, so it can be easily discarded. Note it's not ready for inclusion even if you'd agree to it, since I didn't whitelist the fields; I'll turn it into a proper implementation if/once we add a meta prompt. I still need to figure out how to properly add them to the API.

Another thing: do we have a ready-to-use script or curl command to test the API directly?

(By the way, I rebased this onto #1769 just to make it easier to test with Flux and Chroma; that PR should be uncontroversial enough to be included first.)
I agree parameters that directly affect resource usage (like step count and resolution) should be kept out of it anyway. For the normal prompt, I'd simply whitelist parameters known to only affect image results (cfg_scale and sampler would be borderline according to this, since they could change the step duration).
If you build koboldcpp, you can test it via the swagger API page docs at http://localhost:5001/api. Adding a new field is as simple as reading it inside the …

I'm a bit confused though, because you mentioned you wanted (arbitrary?) extra field support through a "meta" parameter, but now you're adding them as individual named items again? What is …? I thought you were going for …, whereafter the input.meta field gets parsed C++-side like you did originally.
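For a ready-to-use starting point, a hedged test script (not an official one) that POSTs to the local instance shown on that swagger page, assuming the usual A1111-compatible /sdapi/v1/txt2img endpoint and standard field names; any new options from this PR would be added to the same payload under whatever names the docs end up listing:

```python
import base64
import json
import urllib.request

# Assumed A1111-style payload; check http://localhost:5001/api for the
# exact field names exposed by the running build.
payload = {
    "prompt": "a lighthouse at dusk",
    "negative_prompt": "",
    "steps": 4,
    "cfg_scale": 1.0,
    "width": 512,
    "height": 512,
    "sampler_name": "Euler a",
}

req = urllib.request.Request(
    "http://localhost:5001/sdapi/v1/txt2img",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    result = json.load(resp)

# A1111-compatible responses return base64-encoded images.
with open("test.png", "wb") as f:
    f.write(base64.b64decode(result["images"][0]))
```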
Thanks, I'll look into it.
That syntax (with timestep-shift renamed to shifted_timesteps, for Python dictionary reasons) is still supported, just applied at the Python level. I've kept it at the main prompt just because a new meta_prompt can't be tested right now. The move is mainly for keeping the functional part (generation parameters) separate from the user- and API-facing part (separate fields / prompt / meta prompt), because rebasing and changing the C++ code to adapt to that approach is costly (I have to fully rebuild Koboldcpp every time there is a Vulkan shader change). Separate C++ fields may also make sense if we decide to change the approach: for instance, setting a fixed timestep-shift value at model loading time. In a nutshell, my targets are: …

The prompt parsing thing is just an approach to reach that: I need some way to change the timestep-shift and scheduler through the Koboldcpp UI, to see the resulting images on the Koboldcpp interface. We have several separate questions at this point: …

(An aside: from a technical POV (say, if I were designing it from the ground up), it doesn't make any sense to cram parameters into a new syntax at the API level: if there's software at the other end, it's simpler and less error-prone to use separate fields. That prompt syntax is a user thing; it should be done at the UI level. But I'm not sure my non-existent Vue/Javascript skills would be enough for that approach.)
(force-pushed d3b7138 to 2044088)
Yeah, the idea with the single "meta" field is that it can, like you said, use a simple text box for its UI. It's like a parameter garbage bin: you just throw all the stuffing inside and the backend sorts out what it can interpret, ignoring what it cannot, while keeping the main prompt clean.
I included an additional config field to set default image generation options, as a simpler way to be able to customize the parameters for the adventure+nitro case. If you think it's a good feature, I can work on polishing it first, and leave the meta_prompt field (UI and all) for later, possibly a follow-up PR.

Edit: to be clearer, my idea is it'd work as a "default meta prompt", with the same syntax. The meta prompt values would override these (field by field), and any dedicated UI control or API field would override both. I don't mind it that much, but the syntax is... kinda weird, if we don't need to separate the parameters from prompt tokens. We could use instead: …

What do you think?
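A small sketch of that precedence, purely for illustration (the function and field names are hypothetical): config-file defaults are overridden field by field by the meta prompt, and any dedicated UI control or API field overrides both:

```python
def resolve_gen_params(config_defaults, meta_params, request_fields):
    """Merge the three layers; later layers win, missing keys fall through."""
    resolved = dict(config_defaults)  # e.g. {"scheduler": "sgm_uniform", "timestep-shift": 250}
    resolved.update(meta_params)      # values parsed from the meta prompt
    # Dedicated UI controls / explicit API fields override everything else.
    resolved.update({k: v for k, v in request_fields.items() if v is not None})
    return resolved
```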
See how …
(force-pushed 92df280 to f282362)

(force-pushed 14e0b80 to 1e1c23e)
I did the change to support JSON syntax on the …

The displayed image parameters still need some polishing, since those values currently come from the requested fields, not the values actually sent to the backend (that issue was already present for e.g. the seed value). I can fix that in a separate PR, if you wish to speed things up.

EDIT: moved the prompt hack to https://github.com/wbruna/llama.cpp/tree/kcpp_sd_backend_options_hack , if needed for testing.
Okay, looks mostly fine. Can we ditch the LambdaPrinter and just add it to the metadata like normal?
(force-pushed 1e1c23e to f41a0ec)
Done. I took a look at the displayed image metadata, and it seems to me it'd need to be done both at the API level (to return the true image parameters) and the UI level (to consider the replaced parameters)?

By the way, I did a bunch of tests with Qwen-Image on Vulkan; working fine so far.

The clip_l -> clip_1 change got me thinking: what if we received all, or most, parameters on the C++ API inside a single JSON string? It'd be a bit of work at first to parse it on the C++ side, but we already have access to a JSON parser, and multiple-field issues like this would be far easier to support. The Python side would get much simpler too. I'm even considering proposing something like this upstream, to help out other web interfaces.
Passing JSON to C++ is not ideal. The Python JSON parser is much better than the C++ one, and handling errors in Python is much easier too. Especially when debugging; doing it in C++ is a pain (like you mentioned before).

btw I might have caused a conflict when shifting a variable around
(force-pushed 724b262 to e802a79)
Not raw JSON from the web parameters, just as a replacement for the inputs struct. We'd still have to validate everything at the Python level, even if just to check defaults against the …

Another option that could work for the models is a pair of arrays (model type, model filename), with the type specified either as an enum or a fixed string. I'll experiment a bit with that (for another PR, though 🙂).
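A rough sketch of that (type, filename) idea, with hypothetical names and an example set of component types; the real type strings would depend on what the backend loader accepts:

```python
# Hypothetical set of recognized component types, for illustration only.
KNOWN_MODEL_TYPES = {"vae", "clip_l", "clip_g", "t5xxl", "lora"}


def build_model_args(models):
    """Validate (type, filename) pairs before handing them to the backend."""
    validated = []
    for model_type, path in models:
        if model_type not in KNOWN_MODEL_TYPES:
            raise ValueError(f"unknown model type: {model_type}")
        validated.append((model_type, path))
    return validated


# build_model_args([("clip_l", "clip_l.safetensors"), ("t5xxl", "t5xxl_fp8.safetensors")])
```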
I noticed you removed this … We can't use Euler A on Flux, it breaks completely. Since the frontend (e.g. SDUI) does not know this, I think we should use Euler even if the user intentionally picks Euler A in their UI, in order to avoid broken generations.
I thought you had it switched to Euler? If the default is still Euler A, I agree it's best to keep it. I'll leave it to another PR. |
(force-pushed e802a79 to b583890)
I did, but my point is, anyone can have their own frontend. There might be third-party UIs out there we don't even know exist. Anyway, looking mostly good now, so I will merge shortly.
@LostRuins , when loading a config file through the admin interface: if the file doesn't specify a parameter (for instance, because it was created with a previous version), is the parameter supposed to retain its current value? I have noticed that values set by the …

I don't think this behavior is necessarily wrong, at least not for all config fields (the admin password, for instance, could then be set on a single config file or command line, and kept out of most configs), but I'm not sure what would be the best way to handle this case.
Hmm no, the config file should be independent and not carry over any settings from a previous config. Do you notice this issue with any other parameter? As far as I've seen, they are all correctly reset on load/unload of a new file.
Where is it supposed to happen? Looking at the code, I see …
Now that I think about it, several weeks ago I had a similar issue while alternating between Vulkan and ROCm. I assumed I had hit a corner case related to my setup (I was used to resetting everything anyway because I needed to change env vars), but it could have been exactly this: ROCm (or Vulkan, I don't remember which) getting "stuck" if not explicitly disabled by the new config, because of a parameter being renamed.
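To illustrate the suspected failure mode (this is not koboldcpp's actual loading code, and the option names are made up): if a new config is applied by updating the existing arguments in place, any key the file doesn't mention - because it was renamed, or the file predates the option - silently keeps its previous value instead of reverting to its default:

```python
import json

# Made-up option names, purely for illustration.
DEFAULTS = {"backend": "none", "admin_password": "", "sd_model": ""}


def apply_config_in_place(current_args, config_path):
    """Problematic pattern: keys absent from the file keep their old values."""
    with open(config_path) as f:
        current_args.update(json.load(f))
    return current_args


def apply_config_from_defaults(config_path):
    """Safer pattern: start from the defaults, so unspecified keys are reset."""
    args = dict(DEFAULTS)
    with open(config_path) as f:
        args.update(json.load(f))
    return args
```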
You are correct, let me think of a better solution. |
Back to the "Euler A isn't really Euler A" subject: could we at least switch the default sampler to the string "default", both on the interface and at the API level, even if we still keep Euler as the default sampler in koboldcpp.py? This would prevent the same Euler A issue with Euler, if any future model happens to not work well with it. It would also allow a default sampler to be set by the sgdefaults field - LCM, for instance. |
Hey @wbruna , I'm gonna add the "Scheduler" dropdown to stableui since it seems quite a few people want to adjust the scheduler. For the sampler_name we can use "default" in the list now, but I'm still unsure if putting that in first is ideal, since it does break existing cases and isn't a1111-recognized.
Actually, maybe I will keep your original name.
What if, for the "default" sampler, the sampler_name field was simply not defined (even if it appears as "default" to the user)? I'd expect most (all?) APIs would use a default value for any omitted field. |
Actually I don't know lol, that's why I am trying to err on the safe side. I know for sure kobold would handle it correctly, but I've already ditched a1111 so I don't know about those (whether it's a mandatory field). I think it should be safe though, so we can try that.
alright adjusted |


The first commit adds support for the "default" sampler, letting sd.cpp choose according to the model (Euler for SD3 / Flux / Chroma, Euler A otherwise).
The other two are more of an experiment: they add support for specifying flags through the prompt string, using the Perchance syntax (PARAMETER:::VALUE). This way, we don't need specific support from the web interface to test them. I added support for the scheduler and timestep-shift flags, enough to make NitroFusion models work (with (timestep-shift:::250) for NitroSD Realism, (timestep-shift:::400) for Vibrant, and (scheduler:::sgm_uniform) for both).

That syntax works well because it can't really happen in a normal prompt without resulting in an incorrect prompt weighting. Another option could be following the LoRA general syntax, with <param:PARAMETER:VALUE> or something like that.