add PAG support for SD Controlnet Img2Img #8864

Bhavay-2001 · 2024-07-14T17:59:11Z

What does this PR do?

Part of #8710

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline?
Did you read our philosophy doc (important for complex PRs)?
Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

Bhavay-2001 · 2024-07-15T10:14:59Z

Hi @a-r-r-o-w, I am trying out this code to produce samples with StableDiffusionControlNetPagImg2ImgPipeline but it's giving me some error. Can you pls check this code sample once?

import numpy as np
import torch
import cv2
from PIL import Image

from diffusers import AutoPipelineForImage2Image, ControlNetModel
from diffusers.utils import load_image

# download an image
image = load_image(
    "https://hf.co/datasets/hf-internal-testing/diffusers-images/resolve/main/sd_controlnet/hf-logo.png"
)
image = np.array(image)

# get canny image
image = cv2.Canny(image, 100, 200)
image = image[:, :, None]
image = np.concatenate([image, image, image], axis=2)
canny_image = Image.fromarray(image)

# load control net and stable diffusion v1-5
controlnet = ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16)
pipe = AutoPipelineForImage2Image.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16, variant="fp16", enable_pag=True
).to("cuda")

# generate image
generator = torch.manual_seed(0)
image_out = pipe(
    "aerial view, a futuristic research complex in a bright foggy jungle, hard lighting",
    num_inference_steps=20,
    generator=generator,
    guidance_scale=2.0,
    image=canny_image,
    pag_scale=3.0,
).images[0]

Error - ValueError: AutoPipeline can't find a pipeline linked to StableDiffusionControlNetPAGPipeline for stable-diffusion-controlnet-pag.

I think this error occurs mainly because the changes I have made haven't been integrated into the library yet, so that's why it is showing me this. Can you pls help me out here?

a-r-r-o-w

It is starting to look better and has many of the correct PAG-related changes as compared to previous PR. Thanks!

Some thoughts and corrections:

The goal of an image-to-image controlnet pipeline is to take in a text prompt, a control image, and an input image, and generate a new image with similar style and structure. The example code that you provided doesn't make use of an input image, and the control image (canny_image) is incorrectly passed as the input image via image=.
Your code doesn't have to be integrated in the library for it to work. You can install your own diffusers branch with your changes using pip install -e . in the root diffusers directory.
Please take a look at the implementation of the original StableDiffusionControlNetImg2ImgPipeline here and use the example code present there:

diffusers/src/diffusers/pipelines/controlnet/pipeline_controlnet_img2img.py

Line 48 in bbd2f9d

EXAMPLE_DOC_STRING = """

Make sure your PAG implementation is runnable with AutoPipelineForImage2Image.from_pretrained(..., controlnet=controlnet, enable_pag=True).
Ensure that things like IP Adapters work because the code path for processing ip adapter image/embeds is significantly altered after introducing PAG-related changes. Also ensure that single controlnet as well as multiple controlnets work. The Diffusers documentation/PRs will have abundant examples of how to make this possible.
It is not feasible for us to help with debugging unfortunately. Most, if not all, the pipelines here should be runnable in a free-tier colab so please try to debug it there. Make sure to enable optimizations and run in fp16 if you're facing OOM. We can only assist with a final check unless the observed behaviour is really bizarre. Clearly, there are a few bugs that are obvious from first glance but I'd be happy to help once this has reached a more complete state.

a-r-r-o-w · 2024-07-15T10:26:12Z

src/diffusers/pipelines/pag/pipeline_pag_controlnet_sd_img2img.py

+                latent_model_input = (
+                    torch.cat([latents] * (prompt_embeds.shape[0] // latents.shape[0]))
+                    if self.do_classifier_free_guidance
+                    else latents
+                )


This is incorrect. Please refer to other PAG pipelines to see how it's done.
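
A likely reading of this comment (an assumption, since the thread does not spell it out) is that the `if self.do_classifier_free_guidance ... else latents` guard is the problem: with PAG enabled, the prompt embeddings gain an extra perturbed branch even when classifier-free guidance is off, so the latents still need to be repeated. The sketch below uses plain integers in place of tensors; `guidance_branches`, `latent_copies`, and `guarded_copies` are hypothetical helper names, not diffusers functions.

```python
def guidance_branches(do_cfg: bool, do_pag: bool) -> int:
    """Number of batch copies the denoiser sees per sample (assumed layout).

    One conditional branch, plus one unconditional branch for classifier-free
    guidance, plus one perturbed branch for perturbed-attention guidance.
    """
    return 1 + int(do_cfg) + int(do_pag)


def latent_copies(prompt_batch: int, latent_batch: int) -> int:
    """Copies of the latents needed so their batch matches the prompt embeddings."""
    if latent_batch <= 0 or prompt_batch % latent_batch != 0:
        raise ValueError("prompt batch must be a positive multiple of latent batch")
    return prompt_batch // latent_batch


def guarded_copies(prompt_batch: int, latent_batch: int, do_cfg: bool) -> int:
    """The guarded variant from the diff: repeat only when CFG is on."""
    return latent_copies(prompt_batch, latent_batch) if do_cfg else 1


latent_batch = 1
for do_cfg, do_pag in [(True, True), (False, True), (True, False)]:
    prompt_batch = latent_batch * guidance_branches(do_cfg, do_pag)
    needed = latent_copies(prompt_batch, latent_batch)
    guarded = guarded_copies(prompt_batch, latent_batch, do_cfg)
    print(f"cfg={do_cfg} pag={do_pag}: needed={needed} guarded={guarded}")
```

Under these assumptions the guard only diverges in the PAG-without-CFG case, where it leaves the latents at batch 1 while the prompt embeddings have batch 2.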

a-r-r-o-w · 2024-07-15T10:27:09Z

src/diffusers/pipelines/pag/pipeline_pag_controlnet_sd_img2img.py

+        added_cond_kwargs = (
+            {"image_embeds": image_embeds}
+            if ip_adapter_image is not None or ip_adapter_image_embeds is not None
+            else None
+        )


This is incorrect because image_embeds is no longer assigned anywhere. It must be ip_adapter_image_embeds because that is what's prepared above. Please refer to other PAG PRs carefully.
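
The fix described here amounts to passing the variable that was actually prepared. A minimal sketch, assuming the dictionary key stays `image_embeds` (as in the diff) and only the value changes; `build_added_cond_kwargs` is a hypothetical helper used for illustration, not a function in diffusers.

```python
from typing import Any, Optional


def build_added_cond_kwargs(
    ip_adapter_image: Optional[Any],
    ip_adapter_image_embeds: Optional[Any],
) -> Optional[dict]:
    """Return the extra conditioning dict, or None when no IP-Adapter input is given.

    By this point `ip_adapter_image_embeds` is assumed to hold the prepared
    embeddings, whether they came from an image or were passed in directly.
    """
    if ip_adapter_image is not None or ip_adapter_image_embeds is not None:
        # The key is still "image_embeds"; the value must be the prepared variable.
        return {"image_embeds": ip_adapter_image_embeds}
    return None


prepared = ["embeds-for-adapter-0"]  # placeholder standing in for a list of tensors
print(build_added_cond_kwargs(None, prepared))
print(build_added_cond_kwargs(None, None))
```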

src/diffusers/pipelines/pag/pipeline_pag_controlnet_sd_img2img.py

Bhavay-2001 · 2024-07-16T10:35:41Z

Hi, I made the changes that you suggested. I tried to debug the code, and it turns out that the value of the controlnet_conditioning_scale parameter is a list instead of a float. I tried to find out why this is the case but could not. It works fine with the non-PAG pipeline, though.

Bhavay-2001 · 2024-07-25T17:48:59Z

Hi @tolgacangoz, I know this is not related to you, but could you pls look through this PR once? I don't know why this is happening, but before the pipeline calls the self.check_inputs function, the value of the parameter controlnet_conditioning_scale is what we set it to.

But inside check_inputs, the value of the parameter changes. I cannot figure out where the code goes wrong. Could you pls help me figure it out?

Thanks

a-r-r-o-w

heading to bed and will do a deeper review later, but this might be the bug. callback and callback steps were removed so no need to check them here

src/diffusers/pipelines/pag/pipeline_pag_controlnet_sd_img2img.py
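
One way the symptom reported above could arise (an assumption; the thread does not show the final diff) is a mismatch between the parameters `check_inputs` declares and the arguments the caller passes positionally. If the caller stops passing `callback_steps` but the function still declares it, every later positional argument shifts by one slot. The function below is a hypothetical stand-in with a shortened parameter list, not the real `check_inputs`.

```python
def check_inputs_stale(prompt, image, callback_steps, controlnet_conditioning_scale=1.0,
                       control_guidance_start=0.0):
    """Hypothetical stand-in that still declares the removed `callback_steps`."""
    return {
        "callback_steps": callback_steps,
        "controlnet_conditioning_scale": controlnet_conditioning_scale,
        "control_guidance_start": control_guidance_start,
    }


prompt, image = "a prompt", "an image"
scale = 0.8             # the float the user set
guidance_start = [0.0]  # assumed to already be a list when inputs are checked

# Caller updated to drop callback_steps, but still passing positionally:
seen = check_inputs_stale(prompt, image, scale, guidance_start)
print(seen["callback_steps"])                 # the float lands in the stale slot
print(seen["controlnet_conditioning_scale"])  # the list lands where a float is expected

# Passing by keyword makes such a mismatch fail loudly instead of shifting values.
try:
    check_inputs_stale(prompt, image, controlnet_conditioning_scale=scale,
                       control_guidance_start=guidance_start)
except TypeError as err:
    print("TypeError:", err)
```

Keeping the declared parameters and the call site in sync, or calling with keywords, avoids this kind of silent shift.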

a-r-r-o-w · 2024-07-28T11:19:45Z

thanks Bhavay, awesome work! it looks good to me now and i don't see anything incorrect from a glance. could you post some results with the reproducible code too please, like #8861? no need for an ablation of layer-wise applying PAG but general outputs for same seed/prompt, different pag and cfg scales would be cool

i think you might need to run make style and make fix-copies for the tests to pass
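
A comparison like the one requested can be organised as a small grid in which the seed and prompt stay fixed and only the two scales vary. The sketch below only builds the run configurations; the scale values, the seed, and the `RunConfig` and `build_grid` names are illustrative assumptions, and the pipeline call itself is left out.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class RunConfig:
    prompt: str
    seed: int
    guidance_scale: float
    pag_scale: float


def build_grid(prompt, seed, guidance_scales, pag_scales):
    """Every (guidance_scale, pag_scale) pair with the same prompt and seed."""
    return [RunConfig(prompt, seed, g, p) for g, p in product(guidance_scales, pag_scales)]


grid = build_grid(
    prompt="aerial view, a futuristic research complex in a bright foggy jungle, hard lighting",
    seed=0,
    guidance_scales=[1.0, 2.0, 7.5],  # illustrative values only
    pag_scales=[0.0, 3.0, 5.0],       # illustrative values only
)
for cfg in grid:
    # Re-create the generator from cfg.seed before each run so results are comparable.
    print(f"seed={cfg.seed} guidance_scale={cfg.guidance_scale} pag_scale={cfg.pag_scale}")
```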

HuggingFaceDocBuilderDev · 2024-07-28T11:21:33Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Bhavay-2001 · 2024-07-29T07:55:23Z

I tried running this example in a kaggle notebook and I am getting the error with controlnet input. I am trying to figure out the error there.

a-r-r-o-w · 2024-07-29T07:57:29Z

unable to access, could you make it public?

Bhavay-2001 · 2024-07-29T08:32:30Z

I made it public. If you still face an issue, I'll upload the notebook here.

a-r-r-o-w · 2024-07-29T08:55:46Z

i can access it now. few suggestions on how to debug:

it says that the dimensions do not match in dim 0. this means it expected a batch size of 3 but only got 2. it is quite easy to see why. when you call prepare_control_image to prepare the controlnet image, it performs torch.cat([image] * 2) if classifier-free guidance is enabled - in this case, it is. for perturbed attention guidance, the expected shape is 3 - which you have done for other embeddings and inputs required
you will have to prepare the control_image accordingly as well. we have many PRs related to PAG open already - all that's required is to thoroughly understand the code. please take a good look at: https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/pag/pipeline_pag_controlnet_sd.py#L1193
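
The batch mismatch described above can be reproduced with plain lists standing in for tensors. This is a sketch under assumptions: it takes a control batch that was already doubled for classifier-free guidance, reduces it to one copy, and re-expands it to one copy per guidance branch. `expand_control_batch` is a hypothetical helper; the merged PAG ControlNet pipeline linked above is the reference for the actual code.

```python
def expand_control_batch(control_batch: list, do_cfg: bool, do_pag: bool) -> list:
    """Repeat the control images once per guidance branch.

    `control_batch` is assumed to have already been doubled when CFG is on,
    mirroring the torch.cat([image] * 2) step mentioned in the comment.
    """
    if do_cfg:
        # Undo the CFG doubling so we start from a single copy.
        half = len(control_batch) // 2
        control_batch = control_batch[:half]
    branches = 1 + int(do_cfg) + int(do_pag)  # cond + uncond (CFG) + perturbed (PAG)
    return control_batch * branches


single = ["canny_0"]
doubled = single * 2  # what a CFG-only preparation step would return

print(len(doubled))                                    # 2: too small for CFG + PAG
print(len(expand_control_batch(doubled, True, True)))  # 3: matches the other inputs
print(len(expand_control_batch(single, False, True)))  # 2: PAG without CFG
```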

it is hard for us to help with debugging every small issue because it is just not viable and there isn't enough time. in the previous version of this PR and this PR, many basic changes for PAG were missing. i understand navigating a large codebase could be difficult, but in this case the changes were as simple as doing a diff between old pipelines that we have, and the new PAG pipelines that have been merged - to find all the required changes that need to be made. that said, not all required changes can be pin-pointed like this, otherwise how would someone wanting to learn the codebase or contribute, learn? maybe after applying the change mentioned here, it works, or maybe it does not since something else is missing. there are many other folks opening PAG PRs too who've been successful in integrating it in a short time - so I'd recommend taking a good look at the changes and really understanding it FWIW.

a-r-r-o-w · 2024-08-30T11:11:27Z

Hi @Bhavay-2001, thanks for your work here. I think only a few more changes are required here and it should be good to merge :) Could you address the comment above and apply the relevant changes?

github-actions · 2024-09-29T15:04:16Z

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

yiyixuxu · 2024-12-03T04:06:04Z

hi @Bhavay-2001
it looks like we are really close to finishing :)
let us know if you'll be able to complete it, if not, we can ask others to help:)

Bhavay-2001 and others added 15 commits April 5, 2024 15:35

Create diffusers.yml

11a4491

Merge branch 'huggingface:main' into main

7811185

Merge branch 'huggingface:main' into main

eadf7e8

Merge branch 'huggingface:main' into main

f15e6de

Merge branch 'huggingface:main' into main

b2860e4

Merge branch 'huggingface:main' into main

1d07a99

Added PAG Pipeline for SD_Controlnet_img2img

4a4d355

Updated pag.md

0127897

Updated src/diffusers/__init__.py

a53f3cb

Delete diffusers.yml

dee5c51

Updated pipelines/__init__.py

5a0cd05

:wq

18718a6

Merge branch 'SD_ControlNet_PAG_Img2Img' of https://github.com/Bhavay-2001/diffusers into SD_ControlNet_PAG_Img2Img :wq :wq! !wq :wq! :!wq :q! :wq

Updated auto_pipeline

d4ebdad

Updated pag/__init__

a68f93d

Updated dummy_torch_and_transformers_objects.py

1f5c0e8

Bhavay-2001 changed the title ~~Sd control net pag img2 img~~ add PAG support for SD Controlnet Img2Img Jul 14, 2024

Bhavay-2001 mentioned this pull request Jul 14, 2024

add PAG support for SD Controlnet Img2Img #8810

Closed

6 tasks

a-r-r-o-w reviewed Jul 15, 2024

Bhavay-2001 added 2 commits July 15, 2024 23:18

Updated the file

d14cf46

Updated the file

8913567

a-r-r-o-w reviewed Jul 26, 2024

Bhavay-2001 and others added 4 commits July 28, 2024 16:04

Removed callbacks

b4a27f7

Merge branch 'main' into SD_ControlNet_PAG_Img2Img

0002117

Removed callbacks

602991d

Merge branch 'SD_ControlNet_PAG_Img2Img' of https://github.com/Bhavay-2001/diffusers into SD_ControlNet_PAG_Img2Img

8dda261

Tried changes for tests to pass

fae6fea

yiyixuxu added the PAG label Sep 4, 2024

yiyixuxu mentioned this pull request Sep 20, 2024

Add PAG support to SD1.5 #8710

Closed

6 tasks

github-actions bot added the stale Issues that haven't received updates label Sep 29, 2024

yiyixuxu removed the stale Issues that haven't received updates label Sep 30, 2024

github-actions bot added the stale Issues that haven't received updates label Oct 26, 2024

a-r-r-o-w removed the stale Issues that haven't received updates label Oct 27, 2024

Merge branch 'main' into SD_ControlNet_PAG_Img2Img

d6894d8

SahilCarterr mentioned this pull request Nov 10, 2024

PAG for StableDiffusionControlNetImg2ImgPipeline #9886

Open

github-actions bot added the stale Issues that haven't received updates label Nov 20, 2024

yiyixuxu added close-to-merge and removed stale Issues that haven't received updates labels Dec 3, 2024
