Add Multi-ControlNet pipeline #2556

Closed
@takuma104

Description


Discussed in #2331. I have created a proof of concept (PoC) that supports multiple ControlNets, called Multi-ControlNet, based on the StableDiffusionControlNetPipeline. Any feedback is appreciated.
https://github.com/takuma104/diffusers/tree/multi_controlnet

The idea of using multiple ControlNets and summing their outputs together was proposed in #2331, but at the time the name Multi-ControlNet was not yet common. Among other implementations, I'm referring to @Mikubill's pioneering work in this field, sd-webui-controlnet.
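As a rough illustration of the core idea from #2331 (not the actual diffusers code; the function name and list-based residuals below are hypothetical, standing in for the per-block residual tensors), each ControlNet emits a set of residuals, and the residuals from all ControlNets are summed elementwise before being added to the UNet's skip connections:

```python
def combine_controlnet_outputs(per_controlnet_residuals):
    """Hypothetical sketch: sum the residuals produced by each ControlNet.

    per_controlnet_residuals is a list with one entry per ControlNet; each
    entry is a list of residuals (plain floats here, tensors in practice).
    """
    combined = per_controlnet_residuals[0]
    for residuals in per_controlnet_residuals[1:]:
        combined = [a + b for a, b in zip(combined, residuals)]
    return combined
```

In practice a per-ControlNet weighting can also be applied before the sum, which is how relative strengths of the conditions are balanced.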

Currently, I'm considering opening a PR for this as a community pipeline, and have placed the files in examples/community.

The difference from pipeline_stable_diffusion_controlnet.py is as follows.
takuma104/diffusers@1b0f135...multi_controlnet

Modification points:

  • Added a new ControlNetProcessor class; one instance is specified per ControlNet, and image preprocessing has been moved into it.
  • The controlnet is no longer specified in the pipeline constructor.
  • Multiple ControlNetProcessors can be passed to the pipeline's __call__() method (there is no limit on the number).
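A minimal sketch of what a ControlNetProcessor along these lines might look like (hypothetical names and signatures, not the PoC's actual code; the optional conditioning_scale parameter is my illustrative addition):

```python
class ControlNetProcessor:
    """Hypothetical sketch: pairs one ControlNet model with its conditioning
    image, so the pipeline itself no longer owns a single controlnet."""

    def __init__(self, controlnet, image, conditioning_scale=1.0):
        self.controlnet = controlnet
        self.image = image  # preprocessing of this image would live here
        self.conditioning_scale = conditioning_scale

    def __call__(self, sample, timestep, encoder_hidden_states):
        # Run this processor's ControlNet and scale its residuals; the
        # pipeline would then sum the residuals across all processors.
        down, mid = self.controlnet(
            sample, timestep, encoder_hidden_states, self.image
        )
        return (
            [d * self.conditioning_scale for d in down],
            mid * self.conditioning_scale,
        )
```

Because each processor is self-contained, the pipeline's __call__() can simply iterate over an arbitrary-length list of them.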

Usage Example:

  • Please refer to the main() function in the py file for details.
import torch
from diffusers import ControlNetModel
from diffusers.utils import load_image
# StableDiffusionMultiControlNetPipeline and ControlNetProcessor are defined
# in the community pipeline file on the branch linked above.

pipe = StableDiffusionMultiControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", safety_checker=None, torch_dtype=torch.float16
).to("cuda")
pipe.enable_xformers_memory_efficient_attention()

controlnet_canny = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16
).to("cuda")
controlnet_pose = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-openpose", torch_dtype=torch.float16
).to("cuda")

canny_left = load_image("https://huggingface.co/takuma104/controlnet_dev/resolve/main/vermeer_left.png")
pose_right = load_image("https://huggingface.co/takuma104/controlnet_dev/resolve/main/pose_right.png")

image = pipe(
    prompt="best quality, extremely detailed",
    negative_prompt="monochrome, lowres, bad anatomy, worst quality, low quality",
    processors=[
        ControlNetProcessor(controlnet_canny, canny_left),
        ControlNetProcessor(controlnet_pose, pose_right),
    ],
    generator=torch.Generator(device="cpu").manual_seed(0),
    num_inference_steps=30,
    width=512,
    height=512,
).images[0]
image.save("canny_left_right.png")

Generated Example:

[Images: Control Image 1, Control Image 2, Generated result]
