
@wbruna

@wbruna wbruna commented Sep 30, 2025

The first commit adds support for the "default" sampler, letting sd.cpp choose according to the model (Euler for SD3 / Flux / Chroma, Euler A otherwise).

The other two are more of an experiment: they add support for specifying flags through the prompt string, using the Perchance syntax (PARAMETER:::VALUE). This way, we don't need specific support from the web interface to test them. I added support for the scheduler and timestep-shift flags, enough to make NitroFusion models work (with (timestep-shift:::250) for NitroSD Realism, (timestep-shift:::400) for Vibrant, and (scheduler:::sgm_uniform) for both).

That syntax works well because it can't really happen in a normal prompt without resulting in an incorrect prompt weighting. Another option could be following the general LoRA syntax, with <param:PARAMETER:VALUE> or something like that.
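For reference, the parsing boils down to something like this (a simplified Python sketch, not the actual patch; the whitelist and helper name are made up):

    import re

    # Hypothetical sketch: extract "(key:::value)" options from a prompt and
    # return the cleaned prompt plus the recognized options.
    OPTION_RE = re.compile(r"\(([A-Za-z0-9_-]+):::([^)]*)\)")
    ALLOWED_KEYS = {"scheduler", "timestep-shift"}  # whitelist; names assumed

    def split_prompt_options(prompt: str):
        options = {}
        def grab(match):
            key, value = match.group(1), match.group(2)
            if key in ALLOWED_KEYS:
                options[key] = value
                return ""          # strip recognized options from the prompt
            return match.group(0)  # leave anything unrecognized untouched
        cleaned = OPTION_RE.sub(grab, prompt).strip()
        return cleaned, options

    # split_prompt_options("a castle (timestep-shift:::250) (scheduler:::sgm_uniform)")
    # -> ("a castle", {"timestep-shift": "250", "scheduler": "sgm_uniform"})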

@LostRuins
Owner

For 1, I'm fine with this, but I think that Euler actually makes a good default in all cases - I have noted instability with Euler A sampling even in SDXL models. Is there ANY model that does not work well with Euler? If not, I think we can just use that as a default. In fact, I think that's the default for the webUI and the API if unspecified too. I'm ok with either.

For 2, not really a fan of this approach. The prompt field should be considered "sandboxed" and not be overloadable with anything that affects generation behavior (indeed, that's partially why I didn't want the lora switching there either). If you need additional gen-time parameters, I think it's best to add them to the generation API (which will be well known)

@wbruna
Author

wbruna commented Sep 30, 2025

For 1, I'm fine with this, but I think that Euler actually makes a good default in all cases - I have noted instability with Euler A sampling even in SDXL models. Is there ANY model that does not work well with Euler?

I don't think it's a bad default. I only know about an admittedly very obscure corner case: SDXL DMD2 models (which are intended to be used with LCM) usually work OK with Euler A, but tend to make noisy outputs with Euler.

Aside: along the same lines, I miss a way to reset the interface to "known to be good" default parameters: say, choosing "LCM, 10 steps, CFG 1" for LCM models. But that's a bit too much front-end development for me...

If not, I think we can just use that as a default. In fact, I think that's the default for the webUI and the API if unspecified too. I'm ok with either.

The backend's default itself is Euler A right now (forcibly changed to Euler for SD3/Flux); I don't remember if koboldcpp.py sets its own default.

My main argument would be from a least-surprises point of view: if the user knows the "default" is good for mainline sd.cpp, that same string (or not specifying it) will give them similar results on Koboldcpp. The same goes for development on our side: "default" would simply mean "sd.cpp's responsibility" (like it is right now with the default scheduler), even if a new model comes up with completely different parameters for "good" results. So a "library default" would be a bit more future-proof (e.g. for Qwen Image).

The same argument suggests we shouldn't forcibly change Euler A to Euler, because users see "Euler A" on the interface, and could then learn that "Euler A is OK for Flux". So, with either Euler or "sd.cpp default" as the default sampler, I think it's best to disable that forced change; I can add a commit to this PR to do that.

For 2, not really a fan of this approach. The prompt field should be considered "sandboxed" and not be overloadable with anything that affects generation behavior (indeed, that's partially why I didn't want the lora switching there either). If you need additional gen-time parameters, I think it's best to add them to the generation API (which will be well known)

Fair enough. I agree the prompt isn't really the best place to do it. It's just very convenient 🙂

Main motivation right now: NitroFusion models are very fast (under 2s on my card), and give much better results than other 1-step models like SDXL Turbo, so they'd be ideal for scene generation in an adventure chat. But they need a different scheduler (like sgm_uniform) and an additional parameter (timestep-shift), which right now can't be specified either at model loading time or on the chat's media settings tab. And the prompt is already there...

The scheduler arguably goes hand-in-hand with the sampler, and could even be specified in the same field if we managed to keep the number of combinations under control ("Euler A sgm_uniform", "Heun SmoothStep", "Euler Karras", etc). But timestep-shift is both obligatory for NitroFusion models (and the same value doesn't work for different models), and not really needed in most cases (although it can be specified for non-NitroFusion models, with interesting results). So... how do we specify it on the web interface without making it cluttered for all other uses?

Would you consider a specialized "parameter prompt" OK, one that only receives white-listed key-value pairs like the code here? I don't care much about the syntax (could be like this, or JSON, "KEY:VALUE", even command-line like), but the whole add-a-separate-control-for-each-field approach simply doesn't scale.

@LostRuins
Owner

Would you consider a specialized "parameter prompt" OK, one that only receives white-listed key-value pairs like the code here?

Yeah, if needed I suppose adding an extra "meta" field would work for that. Though, did you need this in the UI as well? Or only in the API?

Also, would it be possible to simply hard-code good values for each architecture type? (if possible, that would be the easiest option)

@wbruna
Author

wbruna commented Oct 1, 2025

Would you consider a specialized "parameter prompt" OK, one that only receives white-listed key-value pairs like the code here?

Yeah, if needed I suppose adding an extra "meta" field would work for that. Though, did you need this in the UI as well? Or only in the API?

It's almost only useful at the UI level. If we had another way of specifying arbitrary key+value fields on the UI, we could simply make it set different API fields.

The only reason for API-level support would be to help simplify the UI for that implementation.

Also, would it be possible to simply hard-code good values for each architecture type? (if possible, that would be the easiest option)

Sometimes. In the NitroFusion case, for instance, the scheduler doesn't matter that much, as long as you choose from a specific subset; in principle, we could just hardcode "sgm_uniform". Problem is, they look like normal SDXL models, so we'd have to find a way to identify them. The timestep-shift value is even worse, since that seemingly depends only on model training. Setting defaults at loading time would work... but then we get the same "UI fields" problem, on Tk 🙂

@wbruna
Author

wbruna commented Oct 1, 2025

Just digging a little bit deeper into another point:

The prompt field should be considered "sandboxed" and not be overloadable with anything that affects generation behavior (indeed that's partially why I didn't want the lora switching there either).

The LoRA support is arguably the worst offender here, since it can load arbitrary files (leejet/stable-diffusion.cpp#830) on the server. But for some other cases, I'm not entirely sure.

For instance, you can specify

high contrast,

at the prompt level. And if the model understood that, something like

(contrast_level:10),

would be even better. And this is not that different from

(cfg_scale:::10),

as long as we ensure there are no "malicious" values (say, a negative value or INFINITY causing a server crash - which is something API fields and token-weight parsing already need to guard against), and it can't be used for circumventing parsers or filters (a convincing example is a bit hard here, but someone could make use of the parameter removal to smuggle a forbidden word past a filter, like "NS(cfg_scale:10)FW"). And since the result will just fill up some specific image generation fields, the worst that could happen would be a bad generation.
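For illustration, the kind of guard I mean would look something like this (limits and names made up, just a sketch):

    import math

    # Coerce a user-supplied numeric parameter to a safe float, falling back to
    # a default on NaN/INF/junk and clamping it into an allowed range.
    def safe_float(value, default, lo, hi):
        try:
            x = float(value)
        except (TypeError, ValueError):
            return default
        if math.isnan(x) or math.isinf(x):
            return default
        return min(max(x, lo), hi)

    safe_float("10", default=7.0, lo=1.0, hi=30.0)    # -> 10.0
    safe_float("inf", default=7.0, lo=1.0, hi=30.0)   # -> 7.0 (rejected)
    safe_float("-5", default=7.0, lo=1.0, hi=30.0)    # -> 1.0 (clamped)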

Of course, we may not want the user having access to those fields; this should be processed before that restriction is applied. A separate 'meta' prompt field could help here.

@LostRuins
Owner

Another reason for avoiding generation config in the prompt is that the input might come from unknown or unsanitized sources. For example, the img prompt could be based off the output of an LLM, or fed forward from some search results. The user might not want such sources affecting gen params (imagine if your LLM could affect the step count from the prompt by saying "render 5000 more steps")

@wbruna wbruna force-pushed the kcpp_sd_backend_options branch from 68c4a8f to d3b7138 on October 1, 2025 15:22
@wbruna
Author

wbruna commented Oct 1, 2025

I kept the prompt parsing because it's the only way right now to test the new fields, but moved it to the Python level on the last revision, so it can be easily discarded. Note it's not ready for inclusion even if you'd agree to it, since I didn't whitelist the fields; I'll turn it into a proper implementation if/once we add a meta prompt.

I still need to figure out how to properly add them to the API. Another thing: do we have a ready-to-use script or curl command to test the API directly?

(by the way, I rebased this onto #1769 just to make it easier to test with Flux and Chroma; that PR should be uncontroversial enough to be included first )

Another reason for avoiding generation config in the prompt is that the input might come from unknown or unsanitized sources. For example, the img prompt could be based off the output of an LLM, or fed forward from some search results. The user might not want such sources affecting gen params (imagine if your LLM could affect the step count from the prompt by saying "render 5000 more steps")

I agree parameters that directly affect resource usage (like step count and resolution) should be kept out of it anyway. For the normal prompt, I'd simply whitelist parameters known to only affect image results (cfg_scale and sampler would be borderline according to this, since they could change the step duration).

@LostRuins
Owner

If you build koboldcpp, you can test it via the swagger API page docs at http://localhost:5001/api

adding a new field is as simple as reading it inside the sd_generate function, take a look at the most recently added field avi_video for an example. It can be undocumented for now.
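Something roughly like this (a sketch; the field names are assumptions, and the avi_video handling shows the real pattern):

    # Read optional generation fields with defaults, so clients that omit them
    # keep working unchanged.
    def read_extra_sd_fields(genparams: dict):
        scheduler = genparams.get("scheduler", "default")     # omitted -> backend default
        timestep_shift = genparams.get("timestep_shift", 0)   # 0 = leave unchanged
        return scheduler, timestep_shift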

I'm a bit confused though, cause you mentioned you wanted (arbitrary?) extra field support through a "meta" parameter, but now you're adding them as individual named items again? What is hack_prompt_options for?

I thought you were going for:

input.prompt = "a happy white cat"
input.negative_prompt = "ugly, deformed"
input.meta = "(timestep-shift:::250)(scheduler:::sgm_uniform)"

whereafter the input.meta field gets parsed C++ side like you did originally.

@wbruna
Author

wbruna commented Oct 2, 2025

If you build koboldcpp, you can test it via the swagger API page docs at http://localhost:5001/api

adding a new field is as simple as reading it inside the sd_generate function, take a look at the most recently added field avi_video for an example. It can be undocumented for now.

Thanks, I'll look into it.

I'm a bit confused though, cause you mentioned you wanted (arbitrary?) extra field support through a "meta" parameter, but now you're adding them as individual named items again? What is hack_prompt_options for?

I thought you were going for:

input.prompt = "a happy white cat"
input.negative_prompt = "ugly, deformed"
input.meta = "(timestep-shift:::250)(scheduler:::sgm_uniform)"

whereafter the input.meta field gets parsed C++ side like you did originally.

That syntax (with timestep-shift renamed to shifted_timesteps for Python dictionary reasons) is still supported, just applied at the Python level. I've kept it in the main prompt just because a new meta_prompt can't be tested right now.

The move is mainly for keeping the functional part (generation parameters) separate from the user- and API-facing part (separate fields / prompt / meta prompt), because rebasing and changing the C++ code to adapt to that approach is costly (I have to fully rebuild Koboldcpp every time there is a Vulkan shader change).

Separate C++ fields may also make sense if we decide to change the approach: for instance, setting a fixed timestep-shift value at model loading time.

In a nutshell, my targets are:

  • a good and fast way to generate images in Chroma
  • a way to use Nitro models on a Koboldcpp adventure chat

The prompt parsing thing is just an approach to reach that: I need some way to change the timestep-shift and scheduler through the Koboldcpp UI, to see the resulting images on the Koboldcpp interface.

We have several separate questions at this point:

  • are these extra parameters useful enough to be supported?
  • is it useful to be able to add new parameters without a UI change / explicit UI support?
  • are some of these parameters useful enough to include as full separate controls at the UI level? (this could be the case for the scheduler, I don't know)

The hack_prompt_options field is an attempt to answer those, by allowing us to test them right now, before I learn enough about UI/API changes to make this properly available through the UI.

(an aside: from a technical POV (say, if I were designing it from the ground up), it doesn't make any sense to cram parameters into a new syntax at the API level: if there's software at the other end, it's simpler and less error-prone to use separate fields. That prompt syntax is a user thing; it should be done at the UI level. But I'm not sure my non-existent Vue/Javascript skills would be enough for that approach)

@wbruna wbruna force-pushed the kcpp_sd_backend_options branch from d3b7138 to 2044088 on October 2, 2025 11:25
@LostRuins
Owner

Yeah, the idea with the single "meta" field is that it can, like you said, use a simple text box for its UI. It's like a parameter garbage bin: you just throw all the stuffing inside and the backend sorts out what it can interpret, ignoring what it cannot, while keeping the main prompt clean

@wbruna wbruna marked this pull request as draft October 2, 2025 16:52
@wbruna
Author

wbruna commented Oct 2, 2025

I included an additional config field to set default image generation options, as a simpler way to be able to customize the parameters for the adventure+nitro case. If you think it's a good feature, I can work on polishing it first, and leave the meta_prompt field (UI and all) for later, possibly a follow-up PR.

Edit: to be clearer, my idea is it'd work as a "default meta prompt", with the same syntax. The meta prompt values would override these (field by field), and any dedicated UI control or API field would override both.
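In code terms, the precedence would look roughly like this (a sketch with made-up names):

    # Config-level defaults are the base, the meta prompt overrides them field
    # by field, and explicit API/UI fields win over both.
    def resolve_gen_params(sdgendefaults, meta_prompt_opts, api_fields):
        merged = dict(sdgendefaults)
        merged.update(meta_prompt_opts)
        merged.update({k: v for k, v in api_fields.items() if v is not None})
        return merged

    resolve_gen_params(
        {"scheduler": "sgm_uniform", "timestep_shift": 250},   # config defaults
        {"timestep_shift": 400},                               # meta prompt
        {"cfg_scale": 1.0, "sampler_name": None},              # explicit fields
    )
    # -> {"scheduler": "sgm_uniform", "timestep_shift": 400, "cfg_scale": 1.0}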

I don't mind it that much, but the syntax is.. kinda weird, if we don't need to separate the parameters from prompt tokens. We could use instead:

  • command-line like: --timestep-shift 250 --scheduler sgm_uniform
  • variables: timestep-shift=250 scheduler=sgm_uniform or timestep-shift=250, scheduler=sgm_uniform

What do you think?

@LostRuins
Owner

--sdgendefaults is fine; that way it can be set at load time and you don't need to worry about the UI. It's good to group it into one single field, as 99% of users will not modify this, so having multiple fields would add clutter. To simplify, then, you can just accept JSON and parse it at load.

See how --chatcompletionsadapter does it (no need to support JSON file, just inline json)
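Conceptually, something like this (just a sketch; only the --sdgendefaults flag name comes from this thread, the keys are examples and the real koboldcpp.py wiring differs):

    import argparse
    import json

    # Accept inline JSON with default image-generation options at load time.
    parser = argparse.ArgumentParser()
    parser.add_argument("--sdgendefaults", type=str, default="",
                        help="inline JSON with default image generation options")
    # Simulated CLI input for the example; real use would parse sys.argv.
    args = parser.parse_args(
        ["--sdgendefaults", '{"scheduler": "sgm_uniform", "timestep_shift": 250}'])

    sd_gen_defaults = {}
    if args.sdgendefaults:
        try:
            sd_gen_defaults = json.loads(args.sdgendefaults)
        except json.JSONDecodeError:
            print("Ignoring invalid --sdgendefaults JSON")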

@LostRuins LostRuins force-pushed the concedo_experimental branch from 92df280 to f282362 on October 3, 2025 10:58
@wbruna wbruna force-pushed the kcpp_sd_backend_options branch 2 times, most recently from 14e0b80 to 1e1c23e on October 3, 2025 11:35
@wbruna
Author

wbruna commented Oct 3, 2025

I made the change to support JSON syntax on the sdgendefaults field, and cleaned up the code a bit. The genparams -> sd inputs conversion still needs a bit of refactoring, to allow undefined fields and avoid special numerical values (so things like changing the default CFG or number of steps would become possible); but it's a bit pointless right now, since the UI will always set valid values anyway.

The displayed image parameters still need some polishing, since those values currently come from the requested fields, not the values actually sent to the backend (that issue was already present for e.g. the seed value). I can fix that in a separate PR, if you wish to speed things up.

EDIT: moved the prompt hack to https://github.com/wbruna/llama.cpp/tree/kcpp_sd_backend_options_hack , if needed for testing.

@wbruna wbruna marked this pull request as ready for review October 3, 2025 11:39
@LostRuins
Owner

Okay, looks mostly fine. Can we ditch the LambdaPrinter and just add it to the metadata like normal?

@wbruna wbruna force-pushed the kcpp_sd_backend_options branch from 1e1c23e to f41a0ec on October 4, 2025 03:02
@wbruna
Author

wbruna commented Oct 4, 2025

Done.

I took a look at the displayed image metadata, and it seems to me it'd need to be done both at the API level (to return the true image parameters) and at the UI level (to account for the replaced parameters)?

By the way, I did a bunch of tests with Qwen-Image on Vulkan; working fine so far.

The clip_l -> clip_1 change got me thinking: what if we received all, or most, parameters on the C++ API inside a single JSON string? It'd be a bit of work at first to parse it on the C++ side, but we already have access to a JSON parser, and multiple-field issues like this would be far easier to support. The Python side would get much simpler too. I'm even considering proposing something like this upstream, to help out other web interfaces.

@LostRuins
Owner

Passing JSON to C++ is not ideal. The Python JSON parser is much better than the C++ one, and handling errors in Python is much easier too. Especially when debugging, doing it in C++ is a pain (like you mentioned before)

Btw, I might have caused a conflict when shifting a variable around

@wbruna wbruna force-pushed the kcpp_sd_backend_options branch from 724b262 to e802a79 on October 4, 2025 13:02
@wbruna
Author

wbruna commented Oct 4, 2025

Passing JSON to C++ is not ideal. the python json parser is much better than the C++ one and handling errors in python is much easier too. Especially when debugging, doing it in C++ is a pain (like you mentioned before)

Not raw JSON from the web parameters, just as a replacement for the inputs struct. We'd still have to validate everything at the Python level, even if just to check defaults against the sdgendefaults field. And it would be converted into the C++ structs on arrival anyway.

Another option that could work for the models is a pair of arrays (model type, model filename), with the type specified either as an enum or a fixed string. I'll experiment a bit with that (for another PR, though 🙂).

@LostRuins
Owner

I noticed you removed this

        if (sd_params->sample_method == sample_method_t::EULER_A) {
            //euler a broken on flux
            if (!sd_is_quiet && sddebugmode) {
                printf("%s: switching Euler A to Euler\n", loaded_model_is_chroma(sd_ctx) ? "Chroma" : "Flux");
            }
            sd_params->sample_method = sample_method_t::EULER;
        }

We can't use Euler A on Flux; it breaks completely. Since the frontend (e.g. SDUI) does not know this, I think we should use Euler even if the user intentionally picks Euler A in their UI, in order to avoid broken generations.

@wbruna
Author

wbruna commented Oct 5, 2025

Since the frontend (e.g. SDUI) does not know this,

I thought you had it switched to Euler?

If the default is still Euler A, I agree it's best to keep it. I'll leave it to another PR.

@wbruna wbruna force-pushed the kcpp_sd_backend_options branch from e802a79 to b583890 on October 5, 2025 10:28
@LostRuins
Owner

I thought you had it switched to Euler?

I did, but my point is, anyone can have their own frontend. There might be third-party UIs out there we don't even know exist.

Anyway, looking mostly good now, so I will merge shortly

@LostRuins LostRuins merged commit c48999f into LostRuins:concedo_experimental Oct 5, 2025
@wbruna
Author

wbruna commented Oct 9, 2025

@LostRuins, when loading a config file through the admin interface: if the file doesn't specify a parameter (for instance, because it was created with a previous version), is the parameter supposed to retain its current value?

I have noticed that values set by the sdgendefaults parameter persist across config file changes, unless the new config also defines the parameter. The problem is, the timestep-shift value easily breaks image generation for normal models, so at first I was puzzled about why image gen was suddenly broken.

I don't think this behavior is necessarily wrong, at least not for all config fields (the admin password, for instance, could then be set on a single config file or command line, and kept out of most configs), but I'm not sure what would be the best way to handle this case.

@LostRuins
Owner

Hmm no, the config file should be independent and not carry over any settings from a previous config. Do you notice this issue with any other parameter? As far as I've seen they are all correctly reset on load/unload of a new file.

@wbruna
Author

wbruna commented Oct 9, 2025

Hmm no, the config file should be independent and not carry over any settings from a previous config.

Where is it supposed to happen?

Looking at the code, I see reload_from_new_args iterating over the newly loaded parameters in newargs, after reload_new_config. But I don't see where the parameters present in the global args, but not in newargs, get overridden with the defaults, before the global is passed on to multiprocessing.Process.
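A toy version of what I mean (made-up function names, not the actual koboldcpp.py code): keys missing from the new config keep their old values unless everything is reset to the defaults first.

    def reload_buggy(current_args: dict, new_config: dict) -> dict:
        current_args.update(new_config)      # old values for omitted keys survive
        return current_args

    def reload_fixed(current_args: dict, new_config: dict, defaults: dict) -> dict:
        refreshed = dict(defaults)           # reset everything to defaults first...
        refreshed.update(new_config)         # ...then layer the new config on top
        return refreshed

    old = {"sdgendefaults": '{"timestep_shift": 250}', "sdmodel": "nitro.safetensors"}
    new = {"sdmodel": "sdxl.safetensors"}    # new config omits sdgendefaults
    reload_buggy(dict(old), new)             # -> timestep_shift 250 sticks around
    reload_fixed(dict(old), new, {"sdgendefaults": "", "sdmodel": ""})  # -> it's cleared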

Do you notice this issue with any other parameter? As far as I've seen they are all correctly reset on load/unload of a new file.

Now that I think about it, several weeks ago I had a similar issue while alternating between Vulkan and ROCm. I assumed I had hit a corner case related to my setup (I was used to resetting everything anyway, because I needed to change env vars), but it could have been exactly this: ROCm (or Vulkan, I don't remember which) getting "stuck" if not explicitly disabled by the new config, because of a parameter being renamed.

@LostRuins
Owner

You are correct, let me think of a better solution.

@LostRuins
Owner

@wbruna can you test if 5396e62 solves your issue?

@wbruna
Author

wbruna commented Oct 11, 2025

I thought you had it switched to Euler?

I did, but my point is, anyone can have their own frontend. There might be third party UIs out there we don't even know exist.

Back to the "Euler A isn't really Euler A" subject: could we at least switch the default sampler to the string "default", both on the interface and at the API level, even if we still keep Euler as the default sampler in koboldcpp.py? This would prevent the same Euler A issue with Euler, if any future model happens to not work well with it. It would also allow a default sampler to be set by the sgdefaults field - LCM, for instance.

@LostRuins
Owner

LostRuins commented Oct 17, 2025

Hey @wbruna, I'm gonna add the "Scheduler" dropdown to stableui since it seems quite a few people want to adjust the scheduler. For consistency with sampler_name, I will rename the scheduler to scheduler_name, which will be sent from StableUI if not set to "default" (the default).

For sampler_name we can use "default" in the list now, but I'm still unsure if putting that in first is ideal, since it does break existing cases and isn't a1111-recognized.


@LostRuins
Owner

Actually, maybe I will keep your original name, scheduler; probably better

@wbruna
Author

wbruna commented Oct 17, 2025

For sampler_name we can use "default" in the list now, but I'm still unsure if putting that in first is ideal, since it does break existing cases and isn't a1111-recognized.

What if, for the "default" sampler, the sampler_name field was simply not defined (even if it appears as "default" to the user)? I'd expect most (all?) APIs would use a default value for any omitted field.

@wbruna
Author

wbruna commented Oct 17, 2025

@wbruna can you test if 5396e62 solves your issue?

Huh, sorry, I was sure I had replied to this already, but I'm not seeing my answer here. That change did fix the 'stuck' value when loading a config with a missing field; thanks!

@LostRuins
Owner

most (all?) APIs would use a default value for any omitted field.

Actually, I don't know lol. That's why I am trying to err on the safe side. I know for sure Kobold would handle it correctly, but I've already ditched a1111 so I don't know about those (whether it's a mandatory field). I think it should be safe though, so we can try that

@LostRuins
Owner

Alright, adjusted
