Inpainting affects non-transparent parts of the image #211

Open
SaladDays831 opened this issue Jul 17, 2023 · 18 comments

@SaladDays831

Hi! :)
I'm testing the new inpainting functionality that has recently been pushed to the main branch.

I'm using the Stable Diffusion 1.5 model converted with this command:

python -m python_coreml_stable_diffusion.torch2coreml --convert-unet --convert-text-encoder --convert-vae-decoder --convert-vae-encoder --convert-safety-checker --model-version "runwayml/stable-diffusion-v1-5" --unet-support-controlnet --quantize-nbits 6 --attention-implementation SPLIT_EINSUM_V2 --convert-controlnet "lllyasviel/sd-controlnet-canny" --bundle-resources-for-swift-cli -o "/path/to/save"

and the already converted InPaint-SE model from here.

I'm also using macOS Preview to erase everything in the image except my face, leaving those areas transparent, like so:

The resulting image kinda uses my face, but messes it up, while I was expecting the face to remain unchanged.

This is happening on iPadOS using the main branch of this package, and also on the latest version of MochiDiffusion.

I don't think this is intended. In Automatic1111, when using InPaint + Canny, I get good results where the face remains unchanged.

@ynagatomo

That's strange. In my experiment, it worked fine.
[Screenshot 2023-07-17 at 18:33:58]

@SaladDays831
Author

Hmm, thanks @ynagatomo. I'll try to convert the same inpaint model you're using myself.

@SaladDays831
Author

No changes with the newly converted model.
I also tested inpainting just the face (instead of everything but the face) - it works OK-ish. I still get some noise/corruption outside the inpainted area (your example also has some minor color changes). Maybe it's just less visible in your example because it's a painting rather than a photo? 🤔

When using the Automatic1111 WebUI and inpainting everything except the face (as in my example), the face remains unchanged. In cases like this, even the slightest deformation of the person's face results in a total mess :(

@ynagatomo

At least the masking feature for InPainting added by the PR is working. We may need to adjust the parameters and models. :)

@jrittvo

jrittvo commented Jul 17, 2023

I think the process may be sensitive to the base model being used, for some reason. When I use a given base model to generate the input image, and then that same base model (and the same seed when possible) for the ControlNet inpaint run, I get far fewer anomalies. I don't understand why that would be, but it seems to be that way for me.

@atiorh
Collaborator

atiorh commented Jul 18, 2023

Hey @SaladDays831! I checked out A1111's in-painting UI after seeing this issue. There are a lot of additional knobs that are built around the core in-painting functionality in order to make it work better for certain use cases. Some examples for these knobs are:

  • Masked only vs. whole picture mode (masked only zooms into the region to preserve details better)
  • Mask blur (for blending)
  • Mask padding (dilation)

None of this is implemented in our ControlNet support today, but I expect we will gradually support some of it through PRs.
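
For reference, mask blur and mask padding are plain image-space operations on the mask itself, so in principle they could be applied as a preprocessing step before the pipeline runs. A minimal Core Image sketch of that idea (the function name and default radii are illustrative, not part of this package's API):

```swift
import CoreImage
import CoreGraphics

/// Softens and grows a grayscale mask before inpainting.
/// `blurRadius` approximates A1111's "mask blur"; `dilationRadius` approximates "mask padding".
func preprocessMask(_ mask: CGImage, blurRadius: Double = 4, dilationRadius: Double = 8) -> CGImage? {
    let extent = CIImage(cgImage: mask).extent
    var image = CIImage(cgImage: mask)

    // Dilation: grow the masked (white) region outward by a few pixels.
    if let dilate = CIFilter(name: "CIMorphologyMaximum") {
        dilate.setValue(image, forKey: kCIInputImageKey)
        dilate.setValue(dilationRadius, forKey: kCIInputRadiusKey)
        image = dilate.outputImage ?? image
    }

    // Blur: feather the mask edge so the inpainted region blends into its surroundings.
    if let blur = CIFilter(name: "CIGaussianBlur") {
        blur.setValue(image, forKey: kCIInputImageKey)
        blur.setValue(blurRadius, forKey: kCIInputRadiusKey)
        image = blur.outputImage?.cropped(to: extent) ?? image
    }

    return CIContext().createCGImage(image, from: extent)
}
```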

@SaladDays831
Author

Hi @atiorh :)
Thanks for looking into this!

I didn't thoroughly test the difference, but there are two ways to do inpainting in A1111. All the settings you mentioned are present in the img2img -> inpaint tab (and you don't need a CN model for that, from what I see).

For my tests, I just used the imported inpainting model in the ControlNet section of the txt2img tab, which looks like the "core" inpainting functionality. It doesn't have all these fancy settings, plus I can test the same model version I'm trying to use with this package, and it works as expected (it doesn't change the un-inpainted parts at all).
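
For what it's worth, one general way to guarantee the unmasked region stays pixel-identical is to composite the original image back over the generated result outside the mask after generation. A minimal Core Image sketch of that idea (not something this package or A1111's ControlNet path is confirmed to do, just an illustration of the technique):

```swift
import CoreImage
import CoreGraphics

/// Pastes the original pixels back wherever `keepMask` is white (the un-inpainted
/// region), so untouched areas stay bit-identical to the input image.
func restoreUnmaskedRegion(original: CGImage, generated: CGImage, keepMask: CGImage) -> CGImage? {
    guard let blend = CIFilter(name: "CIBlendWithMask") else { return nil }
    blend.setValue(CIImage(cgImage: original), forKey: kCIInputImageKey)             // shown where mask is white
    blend.setValue(CIImage(cgImage: generated), forKey: kCIInputBackgroundImageKey)  // shown where mask is black
    blend.setValue(CIImage(cgImage: keepMask), forKey: kCIInputMaskImageKey)
    guard let output = blend.outputImage else { return nil }
    return CIContext().createCGImage(output, from: output.extent)
}
```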

@TimYao18

TimYao18 commented Aug 2, 2023

Hi, I tried adding a Starting Image for InPaint with SD1.5_cn, but it seems to have no effect on the resulting output image. I'm not sure if this is the correct behavior.

@jrittvo

jrittvo commented Aug 2, 2023

What commands or app are you using? You need to provide some details before anyone can begin to help. Does your starting image have a transparent area to indicate which region should be inpainted?

@TimYao18

TimYao18 commented Aug 2, 2023

I use both the Swift diffusers package and MochiDiffusion. I just tried the Swift CLI, and the starting image has no effect on the result there either.

Perhaps I didn't make myself clear. What I meant is that the results remain the same whether I include the Starting Image or not.

The images are below.
[starting image]
[masked image as ControlNet input]

@jrittvo

jrittvo commented Aug 2, 2023

At the moment the InPaint ControlNet is broken in Mochi Diffusion. At least half the time, it is ignoring the masked input image. I have a build that appears to fix the problem, but I don't know if my builds can run on other people's machines because it is not an Apple notarized app. If you would like to try it, this is the download link: https://huggingface.co/jrrjrr/Playground/blob/main/Mochi%20Diffusion%20(macOS%2013).dmg

When you say "Starting Image", does that mean you are trying to use 2 images? A masked image to define the inpaint area and a second image that you want to fill the masked area? Can you explain a little more how you are setting it all up in either of your two methods?

@jrittvo

jrittvo commented Aug 2, 2023

In the Swift CLI, ControlNet InPaint only uses --controlnet-inputs. You can't also use the --image argument; --image is for Image2Image.

This is the command I use (with my paths) for ControlNet:

swift run StableDiffusionSample "a photo of a cat" --seed 12 --guidance-scale 8.0 --step-count 24 --image-count 1 --scheduler dpmpp --compute-units cpuAndGPU --resource-path ../models/sd-5x7 --controlnet InPaint-5x7 --controlnet-inputs ../input/cat-5x7.png --output-path ../images

@TimYao18

TimYao18 commented Aug 2, 2023

I set 2 images, as shown in the MochiDiffusion screenshot here.

The starting image is defined in the PipelineConfiguration:
/// Starting image for image2image or in-painting
public var startingImage: CGImage? = nil

I don't know whether inpainting needs the starting image; I thought inpainting might reference the starting image to fix things up.

I apologize for causing some confusion.

@jrittvo

jrittvo commented Aug 2, 2023

ControlNet InPaint in Mochi only uses one input image: the masked image. The text prompt says what to put in the masked area. The upper spot for an input image in Mochi only gets used with Image2Image; it has no effect on ControlNet.
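
In terms of the PipelineConfiguration snippet quoted above, that would roughly look like the sketch below. It assumes the configuration exposes a controlNetInputs array mirroring the Swift CLI's --controlnet-inputs flag; treat the property names here as a sketch rather than confirmed API.

```swift
import StableDiffusion
import CoreGraphics

// Rough sketch: for ControlNet InPaint, the masked image is passed as a ControlNet
// input, while startingImage stays nil because it only drives Image2Image.
func inpaint(with pipeline: StableDiffusionPipeline, maskedImage: CGImage) throws -> [CGImage?] {
    var config = StableDiffusionPipeline.Configuration(prompt: "Woman in flower print blouse")
    config.startingImage = nil               // ignored for ControlNet InPaint
    config.controlNetInputs = [maskedImage]  // masked image goes here (assumed property name)
    config.seed = 1371478925
    config.stepCount = 24
    config.guidanceScale = 8.0
    return try pipeline.generateImages(configuration: config, progressHandler: { _ in true })
}
```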

This is with my test build. Remember, the build that downloads from the Mochi GitHub is presently broken for most ControlNets.

[Screencap]

@jrittvo

jrittvo commented Aug 2, 2023

And yes, this is all very confusing because it is not explained well with visual examples anywhere. That is something Mochi needs to improve on.

@jrittvo

jrittvo commented Aug 2, 2023

In this example, everything is masked except the face. The text prompt says to put a "suit of armor" wherever there is mask.

@jrittvo

jrittvo commented Aug 2, 2023

Masked image:
[mask-blouse-5x5]

Prompt: Woman in flower print blouse
[Woman with flower print blouse 10 1371478925]
[Woman with flower print blouse 12 1371478927]

@jrittvo

jrittvo commented Aug 2, 2023

When I have used it in the Swift CLI, it is the same inputs and logic. The Python CLI pipeline may be different.
