Discussed in #2331. I created a PoC that supports multiple ControlNets, called Multi-ControlNet, based on the `StableDiffusionControlNetPipeline`. Any feedback is appreciated.

https://github.com/takuma104/diffusers/tree/multi_controlnet

The idea of using multiple ControlNets and adding their outputs together was proposed in #2331, but at the time the name Multi-ControlNet was not common. Among other implementations, I'm referring to @Mikubill's pioneering work in this field, sd-webui-controlnet.
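Roughly, the core "add their outputs together" idea looks like this. This is a minimal sketch, not the exact code in the branch: the helper name is made up, and it assumes the standard `diffusers` `ControlNetModel` call signature.

```python
def sum_controlnet_residuals(controlnets, cond_tensors, sample, t, text_emb):
    """Hypothetical helper: run each ControlNet on the same latents/timestep
    and sum the residuals element-wise."""
    down_total, mid_total = None, None
    for controlnet, cond in zip(controlnets, cond_tensors):
        # With return_dict=False, ControlNetModel returns
        # (down_block_res_samples, mid_block_res_sample).
        down, mid = controlnet(
            sample,
            t,
            encoder_hidden_states=text_emb,
            controlnet_cond=cond,
            return_dict=False,
        )
        if down_total is None:
            down_total, mid_total = list(down), mid
        else:
            down_total = [a + b for a, b in zip(down_total, down)]
            mid_total = mid_total + mid
    return down_total, mid_total
```

The summed residuals are then fed to the UNet through its `down_block_additional_residuals` and `mid_block_additional_residual` arguments, the same way the single-ControlNet pipeline does.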
Currently, I'm considering opening a PR as a community pipeline and have placed the files in `examples/community`.
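If it does land as a community pipeline, loading should presumably work through the standard `custom_pipeline` mechanism; note that the pipeline identifier below is an assumption, since it depends on the final file name.

```python
import torch
from diffusers import DiffusionPipeline

# "stable_diffusion_multi_controlnet" is a hypothetical identifier; the real
# one depends on the final file name under examples/community.
pipe = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    custom_pipeline="stable_diffusion_multi_controlnet",
    torch_dtype=torch.float16,
).to("cuda")
```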
The difference from `pipeline_stable_diffusion_controlnet.py` is as follows:
takuma104/diffusers@1b0f135...multi_controlnet
Modification points:
- Created a new `ControlNetProcessor` class; one instance is specified for each ControlNet to be processed. Image preprocessing was also moved into this class. (A sketch of its shape follows this list.)
- `controlnet` is no longer specified in the pipeline constructor.
- Multiple `ControlNetProcessor`s can be specified in the pipeline's `__call__()` method (there is no limit to the number).
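A minimal sketch of what `ControlNetProcessor` looks like, inferred from the usage example below; the actual class in the branch also handles the image preprocessing mentioned above:

```python
from diffusers import ControlNetModel


class ControlNetProcessor:
    # Sketch only: pairs one ControlNetModel with its conditioning image.
    def __init__(self, controlnet: ControlNetModel, image):
        self.controlnet = controlnet
        self.image = image  # raw PIL image; preprocessing happens internally

    def __call__(self, sample, timestep, encoder_hidden_states, controlnet_cond):
        # Returns (down_block_res_samples, mid_block_res_sample) for this
        # ControlNet; the pipeline sums these across all processors.
        return self.controlnet(
            sample,
            timestep,
            encoder_hidden_states=encoder_hidden_states,
            controlnet_cond=controlnet_cond,
            return_dict=False,
        )
```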
Usage Example:
- Please refer to the `main()` function in the .py file for details.
```python
import torch
from diffusers import ControlNetModel
from diffusers.utils import load_image

# StableDiffusionMultiControlNetPipeline and ControlNetProcessor are defined
# in the community pipeline file on the multi_controlnet branch.

pipe = StableDiffusionMultiControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", safety_checker=None, torch_dtype=torch.float16
).to("cuda")
pipe.enable_xformers_memory_efficient_attention()

controlnet_canny = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16
).to("cuda")
controlnet_pose = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-openpose", torch_dtype=torch.float16
).to("cuda")

canny_left = load_image("https://huggingface.co/takuma104/controlnet_dev/resolve/main/vermeer_left.png")
pose_right = load_image("https://huggingface.co/takuma104/controlnet_dev/resolve/main/pose_right.png")

image = pipe(
    prompt="best quality, extremely detailed",
    negative_prompt="monochrome, lowres, bad anatomy, worst quality, low quality",
    processors=[
        ControlNetProcessor(controlnet_canny, canny_left),
        ControlNetProcessor(controlnet_pose, pose_right),
    ],
    generator=torch.Generator(device="cpu").manual_seed(0),
    num_inference_steps=30,
    width=512,
    height=512,
).images[0]
image.save("canny_left_right.png")
```
Generate Example:
- Model: andite/anything-v4.0
- Prompt: `best quality, extremely detailed, cowboy shot`
- Negative prompt: `cowboy, monochrome, lowres, bad anatomy, worst quality, low quality`
- Seed: 19 (cherry-picked)
- Pose & canny control images generated with "Character bones that look like Openpose for blender _ Ver_4.7 Depth+Canny"
| Control Image 1 | Control Image 2 | Generated |
| --- | --- | --- |
| (image) | (image) | (image) |
| (none) | (image) | (image) |
| (image) | (none) | (image) |