Pipeline: obtain per-image reproducibility seeds when generating multiple images #208

@pcuenca

People are interested in the following workflow: generate a few images from a prompt, select the one you like, and tweak the prompt to steer it towards your goal. There are many examples on Twitter and in the Discord bot; see the context section below.

I managed to replicate this workflow in a demo Space by generating one image at a time and manually setting the seed before each generation. This is sub-optimal, especially for high-load services that need to batch requests from several users in order to run them in parallel.

Given that the pipeline is already designed to receive a Generator, I'd like to explore whether we can (optionally) return per-image seeds as part of the generation process.

Describe the solution you'd like
I'd like the following to work:

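import torch
from diffusers import DiffusionPipeline

# MODEL_ID is a placeholder for the checkpoint to load.
ldm = DiffusionPipeline.from_pretrained(MODEL_ID)
rng = torch.manual_seed(1337)
prompts = ["Labrador in the style of Vermeer"] * 6
# return_seeds is the option proposed in this issue; it does not exist yet.
preds = ldm(prompts, generator=rng, return_seeds=True)
images, seeds = preds["sample"], preds["seed"]

# Reseed with the seed of the chosen image, then tweak the prompt.
rng.manual_seed(seeds[3])
tweaked_image = ldm("Labrador in the style of Hokusai", generator=rng)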

The same thing should work when batching prompts from several users.

Perhaps something like this is already possible and I haven't looked enough.

One idea is to iterate here, creating the latents one image at a time instead of all at once. This would carry a performance penalty, but I presume it is negligible compared to the rest of the generation process. A drawback of this approach is that we'd need to do it in every pipeline, or create a shared helper.
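
To make the idea concrete, here is a minimal sketch of what such a helper might look like. It uses only `torch`. The name `latents_with_seeds`, its signature, and the latent shape in the usage lines are assumptions for illustration and are not part of the pipeline today. It draws one seed per image from the caller's generator, then creates each image's latents from its own seed, so any single image can be reproduced later.

```python
import torch

def latents_with_seeds(batch_size, latent_shape, generator):
    """Hypothetical helper: return (latents, seeds), one seed per image."""
    seeds, latents = [], []
    for _ in range(batch_size):
        # Draw the per-image seed from the caller's generator so the
        # whole batch is still reproducible from a single top-level seed.
        seed = int(torch.randint(0, 2**31 - 1, (1,), generator=generator).item())
        # Each image gets its own generator, seeded independently.
        per_image_rng = torch.Generator().manual_seed(seed)
        latents.append(torch.randn(latent_shape, generator=per_image_rng))
        seeds.append(seed)
    return torch.stack(latents), seeds

# Usage: build latents for 6 images, then recreate the latents of image 3.
rng = torch.manual_seed(1337)
latents, seeds = latents_with_seeds(6, (4, 64, 64), rng)
again = torch.randn((4, 64, 64), generator=torch.Generator().manual_seed(seeds[3]))
assert torch.equal(again, latents[3])
```

Because each image's noise depends only on its own seed, the same approach should also cover the batching case: requests from several users can be stacked into one batch without affecting each other's results. This sketch covers only the initial latents. A scheduler that samples extra noise during denoising would need the same per-image treatment.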

Describe alternatives you've considered

  • Do nothing, and use the low-level API instead of the pipeline to achieve this goal (I think it should be possible; I plan to test it later). Implementing it in the pipeline would just make it easier for some users to experiment on their own.
  • Iterate to obtain just a single image per generation. This wastes GPU memory and is slower in most cases, but it might not be a problem for users who just want to experiment.

Additional context
