Description
System Info
diffusers 0.32.0.dev0
torch 2.4.1+cu121
python 3.10.14
Information
- The official example scripts
- My own modified scripts
Reproduction
I ran CogVideoX1.5-5B-I2V and CogVideoX-5B-I2V through CogVideoXImageToVideoPipeline from diffusers.
For CogVideoX-5B-I2V: width = 720, height = 480, num_frames = 49, num_inference_steps = 50.
For CogVideoX1.5-5B-I2V: width = 1360, height = 768, num_frames = 77, num_inference_steps = 50.
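For anyone trying to reproduce this, the two runs can be sketched roughly as below. This is a minimal sketch, not the exact script used: the Hub repository IDs, the bfloat16 dtype, the CUDA device, and the `generate` helper with its `image_path`, `prompt`, and `output_path` arguments are assumptions for illustration. Only the resolution, frame count, and step count come from the settings above.

```python
# Per-model settings taken from the report; the "repo" values are assumed Hub IDs.
SETTINGS = {
    "CogVideoX-5B-I2V": {
        "repo": "THUDM/CogVideoX-5b-I2V",  # assumption
        "width": 720,
        "height": 480,
        "num_frames": 49,
        "num_inference_steps": 50,
    },
    "CogVideoX1.5-5B-I2V": {
        "repo": "THUDM/CogVideoX1.5-5B-I2V",  # assumption
        "width": 1360,
        "height": 768,
        "num_frames": 77,
        "num_inference_steps": 50,
    },
}


def generate(model_name, image_path, prompt, output_path):
    """Hypothetical helper: run one image-to-video generation and save it."""
    # Heavy imports are kept local so the settings can be inspected without a GPU.
    import torch
    from diffusers import CogVideoXImageToVideoPipeline
    from diffusers.utils import export_to_video, load_image

    call_kwargs = dict(SETTINGS[model_name])
    repo = call_kwargs.pop("repo")

    # dtype and device are assumptions; the report does not state them.
    pipe = CogVideoXImageToVideoPipeline.from_pretrained(
        repo, torch_dtype=torch.bfloat16
    )
    pipe.to("cuda")

    image = load_image(image_path)
    frames = pipe(image=image, prompt=prompt, **call_kwargs).frames[0]
    export_to_video(frames, output_path)
    return frames
```

Calling `generate("CogVideoX1.5-5B-I2V", "input.png", "A man walking in the road", "out.mp4")` would correspond to the failing case, with `input.png` standing in for the conditioning image.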
The generated videos of CogVideoX-5B-I2V are good.
In the videos generated by CogVideoX1.5-5B-I2V, the brightness of the first few frames does not match the input image, and the later part of the video is blurry and temporally inconsistent.
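The brightness mismatch could be quantified rather than judged by eye. The sketch below is one possible way to do that, assuming the frames and the input image are available as RGB arrays (or PIL images, which `np.asarray` accepts) at the same value scale. The helper names `mean_luma` and `brightness_drift` are hypothetical, and the Rec. 601 luma weights are a choice made here, not something the pipeline prescribes.

```python
import numpy as np

# Rec. 601 luma weights for R, G, B (an assumed choice of brightness measure).
LUMA_WEIGHTS = np.array([0.299, 0.587, 0.114])


def mean_luma(frame):
    """Average brightness of one frame (H x W grayscale or H x W x C color)."""
    arr = np.asarray(frame, dtype=np.float64)
    if arr.ndim == 3:
        # Use the first three channels and ignore alpha if present.
        arr = arr[..., :3] @ LUMA_WEIGHTS
    return float(arr.mean())


def brightness_drift(reference, frames):
    """Per-frame mean luma minus the mean luma of the reference image."""
    ref = mean_luma(reference)
    return [mean_luma(f) - ref for f in frames]
```

Values close to zero for the first few frames would indicate that the video starts at the same brightness as the input image, and a large offset would confirm the mismatch described above.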
The result of CogVideoX-5B-I2V:
A.man.walking.in.the.road._480_720_49_1.0_output.mp4
The result of CogVideoX1.5-5B-I2V:
A.man.walking.in.the.road._768_1360_77_output.mp4
Expected behavior
The brightness of videos generated by CogVideoX1.5-5B-I2V should be consistent with the input image.
