Description
People are interested in the following workflow: generate a few images from a prompt, select the one you like, and tweak the prompt to steer it towards your goal. There are many examples on Twitter and in the Discord bot; see the Additional context section below.
I managed to replicate this workflow in a demo Space generating one image at a time, and manually setting the seed before each generation. This is sub-optimal, especially when considering high-load services that need to batch requests from several users in order to run them in parallel.
Given that the pipeline is already designed to receive a Generator, I'd like to explore whether we can (optionally) return per-image seeds as part of the generation process.
Describe the solution you'd like
I'd like the following to work:
ldm = DiffusionPipeline.from_pretrained(MODEL_ID)
rng = torch.manual_seed(1337)
prompts = ["Labrador in the style of Vermeer"] * 6
preds = ldm(prompts, generator=rng, return_seeds=True)
images, seeds = preds["sample"], preds["seed"]
rng.manual_seed(seeds[3])
tweaked_image = ldm("Labrador in the style of Hokusai", generator=rng)
The same thing should work when batching prompts from several users.
Perhaps something like this is already possible and I haven't looked enough.
One idea is to iterate here instead of creating the latents all at once. It would have a performance penalty, but I presume it should be negligible compared to the rest of the process. A drawback of this approach is that we'd need to do it for every pipeline, or try to create a helper.
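To make the idea concrete, here is a minimal sketch of per-image latent creation, assuming plain PyTorch only. The helper name `sample_latents_with_seeds` and the latent shape `(4, 64, 64)` are hypothetical and not part of any pipeline; how the resulting latents would be wired into a given pipeline is left open.

```python
import torch

def sample_latents_with_seeds(batch_size, latent_shape, generator, device="cpu"):
    """Hypothetical helper: draw one seed per image, then build each latent from its own seed."""
    # Draw the per-image seeds from the caller's generator, so the whole
    # batch remains reproducible from the original seed.
    seeds = torch.randint(0, 2**31 - 1, (batch_size,), generator=generator).tolist()
    latents = []
    for seed in seeds:
        # A fresh CPU generator per image makes each latent reproducible on its own.
        g = torch.Generator().manual_seed(seed)
        latents.append(torch.randn(latent_shape, generator=g))
    return torch.stack(latents).to(device), seeds

rng = torch.manual_seed(1337)
latents, seeds = sample_latents_with_seeds(6, (4, 64, 64), rng)

# Recreate only the fourth latent from its returned seed.
g = torch.Generator().manual_seed(seeds[3])
latent_again = torch.randn((4, 64, 64), generator=g)
```

The loop replaces a single `torch.randn` call for the whole batch with one call per image, which is the small performance penalty mentioned above. Sampling on the CPU and moving the result to the device afterwards is a choice made in this sketch to keep results independent of the device.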
Describe alternatives you've considered
- Do nothing, and use the low-level API instead of the pipeline to achieve this goal (I think it should be possible, I plan to test it later). Implementing it in the pipeline would just make it easier for some users to experiment on their own.
- Iterate to obtain just a single image per generation. This wastes GPU memory and is slower in most cases. But it might not be a problem for users that just want to experiment.
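For comparison, the second alternative can be sketched as a loop that seeds a fresh generator before each single-image call. This is only an illustration: `generate_one_at_a_time` and `fake_pipe` are hypothetical names, and `fake_pipe` is a stand-in that merely mimics a pipeline accepting a `generator` argument so the example runs without a model.

```python
import torch

def generate_one_at_a_time(pipe, prompt, seeds):
    """Hypothetical helper: call `pipe` once per seed and return (seed, output) pairs."""
    results = []
    for seed in seeds:
        # One generator per call, so every output is tied to a known seed.
        generator = torch.Generator().manual_seed(seed)
        results.append((seed, pipe(prompt, generator=generator)))
    return results

def fake_pipe(prompt, generator=None):
    # Stand-in for a real pipeline: ignores the prompt and returns random noise.
    return torch.randn((4, 8, 8), generator=generator)

first = generate_one_at_a_time(fake_pipe, "Labrador in the style of Vermeer", [11, 22, 33])
# Reuse the seed of the preferred result with a tweaked prompt.
second = generate_one_at_a_time(fake_pipe, "Labrador in the style of Hokusai", [22])
```

Each call handles a single image, so the seeds are known up front, but none of the work is batched.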
Additional context
- This request
- Discord:

(screenshot credit: @osanseviero)