[Kolors] Add IP Adapter #8901

Merged · 14 commits into huggingface:main · Jul 26, 2024

Conversation

@asomoza (Member) commented Jul 19, 2024

What does this PR do?

Adds the newly released IP Adapter to the Kolors pipelines.

Note: I'll add img2img support after the initial review and after #8856 is merged.

How to test

T2I

```python
import torch
from transformers import CLIPVisionModelWithProjection

from diffusers import DPMSolverMultistepScheduler, KolorsPipeline
from diffusers.utils import load_image

image_encoder = CLIPVisionModelWithProjection.from_pretrained(
    "Kwai-Kolors/Kolors-IP-Adapter-Plus",
    subfolder="image_encoder",
    low_cpu_mem_usage=True,
    torch_dtype=torch.float16,
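    # refs/pr/4 pins the open Hub PR that, at the time of writing, provides these files in diffusers format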
    revision="refs/pr/4",
)

pipe = KolorsPipeline.from_pretrained(
    "Kwai-Kolors/Kolors-diffusers", image_encoder=image_encoder, torch_dtype=torch.float16, variant="fp16"
).to("cuda")
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config, use_karras_sigmas=True)

pipe.load_ip_adapter(
    "Kwai-Kolors/Kolors-IP-Adapter-Plus",
    subfolder="",
    weight_name="ip_adapter_plus_general.safetensors",
    revision="refs/pr/4",
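    # None: the image encoder was already passed to the pipeline above, so skip loading it again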
    image_encoder_folder=None,
)
pipe.enable_model_cpu_offload()

ipa_image = load_image("https://huggingface.co/datasets/OzzyGT/testing-resources/resolve/main/kolors/cat_square.png")

image = pipe(
    prompt="best quality, high quality",
    negative_prompt="",
    guidance_scale=6.5,
    num_inference_steps=25,
    ip_adapter_image=ipa_image,
).images[0]

image.save("kolors_ipa_result.png")
source result 1 result 2
cat_square kolors_20240718215114_3149038746 kolors_20240718215201_2410121219

IMG2IMG

```python
import math

import torch
from transformers import CLIPVisionModelWithProjection

from diffusers import DPMSolverMultistepScheduler, KolorsImg2ImgPipeline
from diffusers.utils import load_image

image_encoder = CLIPVisionModelWithProjection.from_pretrained(
    "Kwai-Kolors/Kolors-IP-Adapter-Plus",
    subfolder="image_encoder",
    low_cpu_mem_usage=True,
    torch_dtype=torch.float16,
    revision="refs/pr/4",
)

pipe = KolorsImg2ImgPipeline.from_pretrained(
    "Kwai-Kolors/Kolors-diffusers", image_encoder=image_encoder, torch_dtype=torch.float16, variant="fp16"
).to("cuda")
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config, use_karras_sigmas=True)

pipe.load_ip_adapter(
    "Kwai-Kolors/Kolors-IP-Adapter-Plus",
    subfolder="",
    weight_name="ip_adapter_plus_general.safetensors",
    revision="refs/pr/4",
    image_encoder_folder=None,
)
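# lower the IP Adapter influence so the text prompt still drives the result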
pipe.set_ip_adapter_scale(0.4)

pipe.enable_model_cpu_offload()

source_image = load_image(
    "https://huggingface.co/datasets/OzzyGT/testing-resources/resolve/main/kolors/capyrabbit.png?download=true"
)
ipa_image = load_image(
    "https://huggingface.co/datasets/OzzyGT/testing-resources/resolve/main/kolors/ip_image.png?download=true"
)

prompt = "a capybara wearing sunglasses. In the background of the image there are trees, poles, grass and other objects. At the bottom of the object there is the road., 8k, highly detailed."

strength = 0.5
steps = 25
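# img2img runs only ~num_inference_steps * strength denoising steps,
# so scale the count up to keep ~25 effective steps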
num_inference_steps = math.ceil(steps / strength)

image = pipe(
    prompt=prompt,
    image=source_image,
    negative_prompt="",
    guidance_scale=6.5,
    num_inference_steps=num_inference_steps,
    strength=strength,
    ip_adapter_image=ipa_image,
).images[0]

image.save("kolors_img2img_ipa_result.png")
original ip image result style
capyrabbit ip_image 20240720025802_1449721953 20240720031546_3711734297

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@yiyixuxu

asomoza requested a review from yiyixuxu on July 19, 2024 02:05
@HuggingFaceDocBuilderDev commented:

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@yiyixuxu (Collaborator) left a comment

Also, don't forget to set it back in unload_ip_adapter:

def unload_ip_adapter(self):

asomoza requested a review from yiyixuxu on July 20, 2024 07:24
@yiyixuxu (Collaborator) left a comment

thank you!

asomoza requested a review from stevhliu on July 26, 2024 11:00
@asomoza (Member, Author) commented Jul 26, 2024

@stevhliu can you please review the documentation?

@stevhliu (Member) left a comment

Very nice! Just a few comments to improve clarity 😄

asomoza merged commit 73acebb into huggingface:main on Jul 26, 2024 · 15 checks passed
asomoza deleted the kolors-ip-adapter branch on July 26, 2024 18:25
@e1ijah1 commented Oct 10, 2024

Hello, and thank you for this excellent PR! I'm currently working with a model that's encountering memory issues when loading the IP-Adapter on a single GPU. I noticed that the load_ip_adapter method doesn't seem to support specifying different devices directly.
I'm wondering if you could provide some guidance on how to effectively utilize multiple GPUs when loading the model and IP-Adapter?

@asomoza (Member, Author) commented Oct 10, 2024

AFAIK we don't have a way to keep the IP Adapter on a device separate from the model; what you can do is compute the image embeddings before inference.

You can also split the pipeline modules across different devices; this guide should apply to this use case as well.

You can even compute the text embeddings separately, as shown in that guide, so the bigger win here is freeing the text encoder's VRAM rather than moving the IP Adapter to another device.
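
For reference, here is a minimal sketch of the embeddings-first approach. It assumes the Kolors pipeline exposes the same prepare_ip_adapter_image_embeds helper as the SDXL pipelines it is modeled on, and it reuses pipe and ipa_image from the T2I example above; treat it as an illustration rather than the definitive method:

```python
import torch

# Pre-compute the IP Adapter image embeddings so the image encoder
# does not need to stay in VRAM during denoising.
image_embeds = pipe.prepare_ip_adapter_image_embeds(
    ip_adapter_image=ipa_image,
    ip_adapter_image_embeds=None,
    device="cuda",
    num_images_per_prompt=1,
    do_classifier_free_guidance=True,  # guidance_scale > 1 in the examples above
)

# Drop the encoder and reclaim its memory; it is never called again
# because the embeddings are passed directly.
pipe.image_encoder = None
torch.cuda.empty_cache()

image = pipe(
    prompt="best quality, high quality",
    negative_prompt="",
    guidance_scale=6.5,
    num_inference_steps=25,
    ip_adapter_image_embeds=image_embeds,  # embeddings instead of the raw image
).images[0]
```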

sayakpaul pushed a commit that referenced this pull request Dec 23, 2024
* initial draft

* apply suggestions

* fix failing test

* added ipa to img2img

* add docs

* apply suggestions