feat: support cache for LongCat-Image #602

Merged
DefTruth merged 11 commits into vipshop:main from e1ijah1:feat/longcat-image
Dec 23, 2025

@e1ijah1
Contributor

@e1ijah1 e1ijah1 commented Dec 22, 2025

Signed-off-by: elijah <f1renze.142857@gmail.com>
@DefTruth
Member

@e1ijah1 Hi~ Can you share the image results w/ or w/o cache?

@DefTruth
Member

Thanks for your contribution!

Signed-off-by: elijah <f1renze.142857@gmail.com>
@e1ijah1
Contributor Author

e1ijah1 commented Dec 22, 2025

> @e1ijah1 Hi~ Can you share the image results w/ or w/o cache?

Below is longcat_image_edit.1024x1024.C0_Q0_NONE.png, generated by: python3 generate.py generate longcat_image_edit

[Image: longcat_image_edit 1024x1024 C0_Q0_NONE]

Summary:

INFO 12-22 06:37:00 [base.py:557] ----------------------------------------------------------------------------------------------------
INFO 12-22 06:37:00 [base.py:342] 🤖 Example Init Config Summary:
INFO 12-22 06:37:00 [base.py:360] - Model: /data/LongCat-Image-Edit/ + meituan-longcat/LongCat-Image-Edit
INFO 12-22 06:37:00 [base.py:360] - Task Type: IE2I - Image Editing to Image
INFO 12-22 06:37:00 [base.py:360] - Torch Dtype: torch.bfloat16
INFO 12-22 06:37:00 [base.py:360] - LoRA Weights: None
INFO 12-22 06:37:00 [base.py:196] 🤖 Example Input Summary:
INFO 12-22 06:37:00 [base.py:196] - prompt: Turn the cat into a dog
INFO 12-22 06:37:00 [base.py:196] - negative_prompt:
INFO 12-22 06:37:00 [base.py:196] - guidance_scale: 4.5
INFO 12-22 06:37:00 [base.py:196] - num_inference_steps: 50
INFO 12-22 06:37:00 [base.py:196] - image: Single Image (1024x1024)
INFO 12-22 06:37:00 [base.py:196] - generator: device cpu, seed 0
INFO 12-22 06:37:00 [base.py:259] 🤖 Example Output Summary:
INFO 12-22 06:37:00 [base.py:270] - Model: longcat_image_edit
INFO 12-22 06:37:00 [base.py:270] - Optimization: C0_Q0_NONE
INFO 12-22 06:37:00 [base.py:270] - Load Time: 0.99s
INFO 12-22 06:37:00 [base.py:270] - Warmup Time: 64.54s
INFO 12-22 06:37:00 [base.py:270] - Inference Time: 54.15s
INFO 12-22 06:37:01 [base.py:227] Image saved to longcat_image_edit.1024x1024.C0_Q0_NONE.png
INFO 12-22 06:37:01 [base.py:568] ----------------------------------------------------------------------------------------------------

Below is longcat_image_edit.1024x1024.C0_Q0_DBCache_F1B0_W8I1M0MC3_R0.24_CFG1_T0O0_S32.png, generated by: python3 generate.py generate longcat_image_edit --cache

[Image: longcat_image_edit 1024x1024 C0_Q0_DBCache_F1B0_W8I1M0MC3_R0.24_CFG1_T0O0_S32]

Summary:

INFO 12-22 06:43:32 [base.py:557] ----------------------------------------------------------------------------------------------------
INFO 12-22 06:43:32 [base.py:342] 🤖 Example Init Config Summary:
INFO 12-22 06:43:32 [base.py:360] - Model: /data/LongCat-Image-Edit/ + meituan-longcat/LongCat-Image-Edit
INFO 12-22 06:43:32 [base.py:360] - Task Type: IE2I - Image Editing to Image
INFO 12-22 06:43:32 [base.py:360] - Torch Dtype: torch.bfloat16
INFO 12-22 06:43:32 [base.py:360] - LoRA Weights: None
INFO 12-22 06:43:32 [base.py:196] 🤖 Example Input Summary:
INFO 12-22 06:43:32 [base.py:196] - prompt: Turn the cat into a dog
INFO 12-22 06:43:32 [base.py:196] - negative_prompt:
INFO 12-22 06:43:32 [base.py:196] - guidance_scale: 4.5
INFO 12-22 06:43:32 [base.py:196] - num_inference_steps: 50
INFO 12-22 06:43:32 [base.py:196] - image: Single Image (1024x1024)
INFO 12-22 06:43:32 [base.py:196] - generator: device cpu, seed 0
INFO 12-22 06:43:32 [base.py:259] 🤖 Example Output Summary:
INFO 12-22 06:43:32 [base.py:270] - Model: longcat_image_edit
INFO 12-22 06:43:32 [base.py:270] - Optimization: C0_Q0_DBCache_F1B0_W8I1M0MC3_R0.24_CFG1_T0O0_S32
INFO 12-22 06:43:32 [base.py:270] - Load Time: 0.99s
INFO 12-22 06:43:32 [base.py:270] - Warmup Time: 35.16s
INFO 12-22 06:43:32 [base.py:270] - Inference Time: 22.47s
INFO 12-22 06:43:32 [base.py:227] Image saved to longcat_image_edit.1024x1024.C0_Q0_DBCache_F1B0_W8I1M0MC3_R0.24_CFG1_T0O0_S32.png
INFO 12-22 06:43:32 [base.py:568] ----------------------------------------------------------------------------------------------------
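
To make the comparison between the two runs above easier to read, here is a minimal sketch that pulls the timing lines out of the two summaries and computes the speedup. The helper names `parse_times` and `speedup` are hypothetical and not part of the repository; the log lines are copied from the summaries above.

```python
import re

# Matches summary lines such as "- Inference Time: 54.15s".
TIME_LINE = re.compile(r"- (\w+) Time: ([\d.]+)s")


def parse_times(log_text):
    """Return {"Load": ..., "Warmup": ..., "Inference": ...} in seconds."""
    return {name: float(sec) for name, sec in TIME_LINE.findall(log_text)}


def speedup(baseline, optimized, key="Inference"):
    """Ratio of baseline time to optimized time for the given phase."""
    return baseline[key] / optimized[key]


# Timing lines from the run without cache (C0_Q0_NONE).
no_cache_log = """
INFO 12-22 06:37:00 [base.py:270] - Load Time: 0.99s
INFO 12-22 06:37:00 [base.py:270] - Warmup Time: 64.54s
INFO 12-22 06:37:00 [base.py:270] - Inference Time: 54.15s
"""

# Timing lines from the run with --cache (DBCache).
cache_log = """
INFO 12-22 06:43:32 [base.py:270] - Load Time: 0.99s
INFO 12-22 06:43:32 [base.py:270] - Warmup Time: 35.16s
INFO 12-22 06:43:32 [base.py:270] - Inference Time: 22.47s
"""

no_cache = parse_times(no_cache_log)
cache = parse_times(cache_log)

print(f"Inference speedup: {speedup(no_cache, cache):.2f}x")
print(f"Warmup speedup: {speedup(no_cache, cache, key='Warmup'):.2f}x")
```

For LongCat-Image-Edit this works out to roughly a 2.41x faster inference pass with the cache enabled, from a single run with seed 0.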

BTW, I'm still downloading the LongCat-Image weights. I'll post the generated images here later.

@e1ijah1
Contributor Author

e1ijah1 commented Dec 23, 2025

> @e1ijah1 Hi~ Can you share the image results w/ or w/o cache?

The image generated without cache:

[Image: longcat_image 1024x1024 C0_Q0_NONE]

Summary:

INFO 12-22 18:40:26 [base.py:557] ----------------------------------------------------------------------------------------------------
INFO 12-22 18:40:26 [base.py:342] 🤖 Example Init Config Summary:
INFO 12-22 18:40:26 [base.py:360] - Model: /data/LongCat-Image + meituan-longcat/LongCat-Image
INFO 12-22 18:40:26 [base.py:360] - Task Type: T2I - Text to Image
INFO 12-22 18:40:26 [base.py:360] - Torch Dtype: torch.bfloat16
INFO 12-22 18:40:26 [base.py:360] - LoRA Weights: None
INFO 12-22 18:40:26 [base.py:196] 🤖 Example Input Summary:
INFO 12-22 18:40:26 [base.py:196] - prompt: A young Asian woman wearing a yellow knit sweater paired with a white necklace. Her hands rest on her knees, with a serene expression. The background features a rough brick wall, with warm afternoon sunlight casting upon her, creating a tranquil and cozy atmosphere. The shot uses a medium-distance perspective, highlighting her demeanor and the details of her attire. Soft lighting illuminates her face, emphasizing her facial features and the texture of her accessories, adding depth and warmth to the image. The overall composition is simple and elegant, with the brick wall's texture complementing the interplay of sunlight and shadows, showcasing the character's grace and composure.
INFO 12-22 18:40:26 [base.py:196] - height: 1024
INFO 12-22 18:40:26 [base.py:196] - width: 1024
INFO 12-22 18:40:26 [base.py:196] - guidance_scale: 4.5
INFO 12-22 18:40:26 [base.py:196] - num_inference_steps: 50
INFO 12-22 18:40:26 [base.py:196] - generator: device cpu, seed 0
INFO 12-22 18:40:26 [base.py:259] 🤖 Example Output Summary:
INFO 12-22 18:40:26 [base.py:270] - Model: longcat_image
INFO 12-22 18:40:26 [base.py:270] - Optimization: C0_Q0_NONE
INFO 12-22 18:40:26 [base.py:270] - Load Time: 0.98s
INFO 12-22 18:40:26 [base.py:270] - Warmup Time: 37.35s
INFO 12-22 18:40:26 [base.py:270] - Inference Time: 26.83s
INFO 12-22 18:40:26 [base.py:227] Image saved to longcat_image.1024x1024.C0_Q0_NONE.png
INFO 12-22 18:40:26 [base.py:568] ----------------------------------------------------------------------------------------------------

The image generated with cache:

[Image: longcat_image 1024x1024 C0_Q0_DBCache_F1B0_W8I1M0MC3_R0.24_CFG1_T0O0_S31]

Summary:

INFO 12-22 18:45:37 [base.py:557] ----------------------------------------------------------------------------------------------------
INFO 12-22 18:45:37 [base.py:342] 🤖 Example Init Config Summary:
INFO 12-22 18:45:37 [base.py:360] - Model: /data/LongCat-Image + meituan-longcat/LongCat-Image
INFO 12-22 18:45:37 [base.py:360] - Task Type: T2I - Text to Image
INFO 12-22 18:45:37 [base.py:360] - Torch Dtype: torch.bfloat16
INFO 12-22 18:45:37 [base.py:360] - LoRA Weights: None
INFO 12-22 18:45:37 [base.py:196] 🤖 Example Input Summary:
INFO 12-22 18:45:37 [base.py:196] - prompt: A young Asian woman wearing a yellow knit sweater paired with a white necklace. Her hands rest on her knees, with a serene expression. The background features a rough brick wall, with warm afternoon sunlight casting upon her, creating a tranquil and cozy atmosphere. The shot uses a medium-distance perspective, highlighting her demeanor and the details of her attire. Soft lighting illuminates her face, emphasizing her facial features and the texture of her accessories, adding depth and warmth to the image. The overall composition is simple and elegant, with the brick wall's texture complementing the interplay of sunlight and shadows, showcasing the character's grace and composure.
INFO 12-22 18:45:37 [base.py:196] - height: 1024
INFO 12-22 18:45:37 [base.py:196] - width: 1024
INFO 12-22 18:45:37 [base.py:196] - guidance_scale: 4.5
INFO 12-22 18:45:37 [base.py:196] - num_inference_steps: 50
INFO 12-22 18:45:37 [base.py:196] - generator: device cpu, seed 0
INFO 12-22 18:45:37 [base.py:259] 🤖 Example Output Summary:
INFO 12-22 18:45:37 [base.py:270] - Model: longcat_image
INFO 12-22 18:45:37 [base.py:270] - Optimization: C0_Q0_DBCache_F1B0_W8I1M0MC3_R0.24_CFG1_T0O0_S31
INFO 12-22 18:45:37 [base.py:270] - Load Time: 0.98s
INFO 12-22 18:45:37 [base.py:270] - Warmup Time: 24.92s
INFO 12-22 18:45:37 [base.py:270] - Inference Time: 13.51s
INFO 12-22 18:45:37 [base.py:227] Image saved to longcat_image.1024x1024.C0_Q0_DBCache_F1B0_W8I1M0MC3_R0.24_CFG1_T0O0_S31.png
INFO 12-22 18:45:37 [base.py:568] ----------------------------------------------------------------------------------------------------
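
The optimization tags in the output file names pack several settings into one string. As a rough aid for comparing them, the sketch below splits a tag into letter/number fields and diffs two tags. The `parse_tag` helper is hypothetical, and the code deliberately does not interpret what each field means, since the thread does not define them.

```python
import re

# A field is a run of letters followed by a number, e.g. "F1", "MC3", "R0.24".
PAIR = re.compile(r"([A-Za-z]+)(\d+(?:\.\d+)?)")


def parse_tag(tag):
    """Split an optimization tag into {field_name: value} without interpreting it."""
    fields = {}
    for token in tag.split("_"):
        pairs = PAIR.findall(token)
        if "".join(name + value for name, value in pairs) != token:
            # Token without a numeric suffix (e.g. "DBCache", "NONE"): keep as a flag.
            fields[token] = None
            continue
        for name, value in pairs:
            fields[name] = float(value) if "." in value else int(value)
    return fields


# Tags copied from the two cached runs in this thread.
edit_tag = parse_tag("C0_Q0_DBCache_F1B0_W8I1M0MC3_R0.24_CFG1_T0O0_S32")
t2i_tag = parse_tag("C0_Q0_DBCache_F1B0_W8I1M0MC3_R0.24_CFG1_T0O0_S31")

# Fields whose values differ between the two runs.
diff = {k: (edit_tag[k], t2i_tag.get(k)) for k in edit_tag if edit_tag[k] != t2i_tag.get(k)}
print(diff)
```

Run on the two tags above, the only field that differs between the LongCat-Image-Edit and LongCat-Image cached runs is the trailing `S` value (32 versus 31); every other field is identical.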

Member

@DefTruth DefTruth left a comment

LGTM~ Thanks for your contribution!

@DefTruth DefTruth changed the title from "feat: support cache for LongCat-Image & LongCat-Image-Edit" to "feat: support cache for LongCat-Image" on Dec 23, 2025
@DefTruth DefTruth merged commit ec65b19 into vipshop:main on Dec 23, 2025
2 checks passed