Add Image to Image option #16
Conversation
@Xuzzo Again, thanks for the contribution. Very nice that you spotted and fixed the

Regarding this feature, do you have a good example of it working? I only tried it quickly with the schnell model for 4 steps, and did not get any special result from it (I basically got back the input image with minor variations). Maybe it might not work as well with the distilled

Oh great, this is a very nice example! I think we should add this to the readme.

Btw, I have looked at your PR, and will share some comments later today. Thanks for the patience :)
            self.inference_steps = list(range(num_inference_steps))
            self.guidance = guidance


    class ConfigImg2Img:
More of a personal preference, but I would prefer one file per config class.
my current position is we probably don't need multiple config classes that map 1 config to 1 CLI entry point.
        def __init__(
            self,
            num_inference_steps: int = 4,
            width: int = 1024,
With the img2img option, does it make sense to also provide a height and width? I am thinking that it will always be based on the input base image?
this depends on the training resolution of Flux; my understanding (from reading forums) is that snapping the output image to the training resolutions gets better results. Worth doing some evals here, or just following evals from similar projects upstream.
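A minimal sketch of what such snapping could look like. The multiple of 16 is my assumption (Flux-style models downsample by 8 in the VAE and pack latents 2x2), and the helper name is hypothetical, not part of the mflux codebase:

```python
def snap_dimension(value: int, multiple: int = 16) -> int:
    """Round a pixel dimension to the nearest supported multiple.

    Assumption: Flux-style models downsample by 8 in the VAE and
    pack latents 2x2, so width/height are typically multiples of 16.
    """
    if value <= 0:
        raise ValueError("dimension must be positive")
    # Integer nearest-multiple rounding, with a floor of one multiple.
    return max(multiple, ((value + multiple // 2) // multiple) * multiple)
```

For img2img, the input image's dimensions could be passed through such a helper instead of asking the user for a width and height.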
        ImageUtil.save_image(image, args.output)


    if __name__ == '__main__':
Regarding main_img2img.py: I think I prefer this option instead of unifying things too much at this point (not fully sure how often this feature will be used compared to text2img), so I would keep it a bit separate for now, as you have done.
after working on my own take in anthonywu#1, my current opinion is that the config classes should be mixins, instead of RuntimeConfig being the parent object of ModelConfig, etc.
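A rough sketch of the mixin idea. All class and field names here are hypothetical illustrations, not the actual mflux API:

```python
from dataclasses import dataclass

# Hypothetical sketch: each concern is a small dataclass, and a
# concrete config composes them, rather than one config class
# inheriting from another in a single parent chain.

@dataclass
class ModelConfigMixin:
    model_name: str = "schnell"

@dataclass
class RuntimeConfigMixin:
    num_inference_steps: int = 4
    guidance: float = 3.5

@dataclass
class Img2ImgConfig(ModelConfigMixin, RuntimeConfigMixin):
    init_image_strength: float = 0.3
```

Each CLI entry point could then declare exactly the mixins it needs, avoiding both a deep inheritance chain and a proliferation of near-identical config classes.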
            image_latents = self._pack_latents(image_latents, runtime_config.height, runtime_config.width)
            latents = runtime_config.sigmas[config.init_timestep] * noise + (1.0 - runtime_config.sigmas[config.init_timestep]) * image_latents

            return self._generate_from_latents(latents, prompt, runtime_config)
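The core of this img2img initialization is a linear interpolation between pure noise and the encoded input image at the starting sigma. A standalone NumPy sketch of just that blend (the shapes, helper name, and seeding are illustrative, not the mflux API):

```python
import numpy as np

def blend_latents(image_latents: np.ndarray, sigma_start: float, seed: int = 0) -> np.ndarray:
    """latents = sigma * noise + (1 - sigma) * image_latents.

    sigma_start near 1.0 -> mostly noise (large changes to the image);
    sigma_start near 0.0 -> mostly the input image (minor variations).
    """
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(image_latents.shape).astype(image_latents.dtype)
    return sigma_start * noise + (1.0 - sigma_start) * image_latents
```

This also suggests why schnell with few steps mostly returned the input image earlier in the thread: starting late in the schedule means a low sigma, so the input term dominates the blend.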
This is also a bit more of a personal opinion, but at this point I think I would even prefer a separate flux_img2img.py file that holds the new class. Right now, I would also not subclass it, and accept some code duplication for the img2img case. I guess the thing I like is for the standard text2img workflow to be as straightforward as possible, and I also kind of like being able to view the whole generate_image method at once, without it delegating to another method like _generate_from_latents. This would mean some more code duplication right now, but I think it is OK at the moment.

Based on your img2img logic, I have made a branch on top of this one showing what this could look like. To be clear, since you have added the extra logic for img2img, I think we should continue on your current branch here, but make some changes similar to how it looks in my other branch. Would be nice to hear your opinions on this, and also let me know whether you have the time to do it; otherwise I can add some commits on top of your branch here :)
No problem for me. Feel free to push directly here
Any progress on this?
I have a draft PR that reconstructs this PR using some WIP refactoring code. Open to review: https://github.com/anthonywu/mflux/pull/1/files. The interface is ready. The image generation is buggy; I think I'm a "few lines of code" away from getting it to work, I just ran out of time today. The structure of the re-implementation is worth a preview, though, for anyone interested in helping expedite this.
Thanks to @Xuzzo's reference implementation here, I found a bug fix for my PR over at anthonywu#1.
mflux-generate --model dev --init-image cake.png --init-image-strength 0.25 --steps 12 --prompt "cake, blue frosting, bananas and cherries, blue flower pot with yellow sunflowers in the background, plated with gold leafs"
The above result was achieved with dev and 10+ steps. In contrast, I'm finding that schnell with a low number of steps like 4 isn't sufficient to alter the image meaningfully. Unsure whether image to image should be recommended for schnell.
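One plausible way an init-image-strength flag could map onto the step schedule, sketched below. This is purely illustrative, based on common img2img implementations, and not necessarily how mflux computes init_timestep:

```python
def strength_to_start_step(strength: float, num_steps: int) -> int:
    """Map init-image strength in [0, 1] to the denoising step to start from.

    Illustrative only: skip the first (1 - strength) fraction of the
    schedule, so higher strength leaves more denoising steps and
    makes bigger changes to the input image.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be in [0, 1]")
    return min(num_steps - 1, round((1.0 - strength) * num_steps))
```

Under this mapping, strength 0.25 with 12 steps starts at step 9, leaving 3 denoising steps; with schnell's 4 steps only a single step would remain, which would be consistent with schnell barely altering the image.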
Thanks @anthonywu for moving this forward. I'm closing this PR and leaving all development to #77.