# Denoise the inverted latent with deterministic DDIM (eta=0)
video_reconstruct = diffusion.ddim_sample_loop(
    noise=noised_vid_feat,
    model=model.eval(),
    model_kwargs=model_kwargs,
    guide_scale=cfg.guide_scale,
    ddim_timesteps=cfg.ddim_timesteps,
    eta=0.0)
# Undo the latent scaling before decoding
video_reconstruct = 1. / cfg.scale_factor * video_reconstruct
video_reconstruct = rearrange(video_reconstruct, 'b c f h w -> (b f) c h w')
# Decode the frames in chunks to limit memory use
chunk_size = min(cfg.decoder_bs, video_reconstruct.shape[0])
video_reconstruct_list = torch.chunk(video_reconstruct, video_reconstruct.shape[0]//chunk_size, dim=0)
decode_reconstruct = []
for vd_data in video_reconstruct_list:
    gen_frames = autoencoder.decode(vd_data)
    decode_reconstruct.append(gen_frames)
video_reconstruct = torch.cat(decode_reconstruct, dim=0)
video_reconstruct = rearrange(video_reconstruct, '(b f) c h w -> b c f h w', b=1)
save_i2vgen_video_safe(local_path, video_reconstruct.cpu(), captions, cfg.mean, cfg.std, text_size)
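A side note on the chunked decode above, offered as a sketch rather than a diagnosis of the collapsed output: `torch.chunk` takes the number of chunks, so when the frame count is not a multiple of the batch size the chunks can come out larger than `cfg.decoder_bs`. `torch.split` takes the chunk size directly. The tensor shape and the `decoder_bs` value below are made up for illustration.

```python
import torch

frames = torch.zeros(10, 4, 8, 8)  # 10 hypothetical latent frames
decoder_bs = 4                     # stand-in for cfg.decoder_bs

chunk_size = min(decoder_bs, frames.shape[0])

# Same call pattern as the snippet above: the second argument is a chunk COUNT.
by_chunk = torch.chunk(frames, frames.shape[0] // chunk_size, dim=0)

# torch.split takes the chunk SIZE, so no piece exceeds decoder_bs.
by_split = torch.split(frames, chunk_size, dim=0)

chunk_sizes = [c.shape[0] for c in by_chunk]
split_sizes = [s.shape[0] for s in by_split]
print(chunk_sizes, split_sizes)
```

With 10 frames and a batch size of 4, the `torch.chunk` call yields two chunks of 5 frames, while `torch.split` yields chunks of 4, 4 and 2. This only affects memory use during decoding, not the decoded values.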
The original video is:
However, the reconstructed video is completely collapsed:
I suspect the problem is related to scale_factor, since the result looks better when I set cfg.scale_factor=1.0:
Looking forward to your reply, thank you very much!
I have tried DDIM inversion on the ModelScope T2V model, but I get abnormal results.
My procedure is as follows:
1) Get the original video latent:
2) Obtain the noise latent by DDIM inversion:
3) Reconstruct the original video:
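For readers unfamiliar with the procedure, the invert-then-reconstruct round trip in steps 2 and 3 can be sketched on a scalar toy problem. This is not the ModelScope code: `toy_eps` is a hypothetical stand-in for the denoiser (the exact noise predictor when the data is a unit-variance Gaussian), the linear beta schedule is an assumption, and all function names are invented for illustration.

```python
import math

def make_alphas_cumprod(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for an assumed linear beta schedule."""
    out, prod = [], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        prod *= 1.0 - beta
        out.append(prod)
    return out

def toy_eps(x, abar):
    """Hypothetical denoiser: exact noise prediction for unit-variance Gaussian data."""
    return math.sqrt(1.0 - abar) * x

def ddim_step(x, abar_from, abar_to):
    """One deterministic (eta = 0) DDIM update between two noise levels."""
    eps = toy_eps(x, abar_from)
    x0_pred = (x - math.sqrt(1.0 - abar_from) * eps) / math.sqrt(abar_from)
    return math.sqrt(abar_to) * x0_pred + math.sqrt(1.0 - abar_to) * eps

def invert_then_reconstruct(x0, num_ddim_steps, abars):
    """Run DDIM inversion up the noise ladder, then DDIM sampling back down."""
    stride = len(abars) // num_ddim_steps
    ladder = [abars[t] for t in range(0, len(abars), stride)]
    x = x0
    # Inversion: step from low noise to high noise, reusing the current eps.
    for a_cur, a_next in zip(ladder[:-1], ladder[1:]):
        x = ddim_step(x, a_cur, a_next)
    noise = x
    # Sampling: step back from high noise to low noise.
    rev = ladder[::-1]
    for a_cur, a_prev in zip(rev[:-1], rev[1:]):
        x = ddim_step(x, a_cur, a_prev)
    return noise, x

abars = make_alphas_cumprod()
for steps in (10, 50, 1000):
    noise, rec = invert_then_reconstruct(1.0, steps, abars)
    print(steps, abs(rec - 1.0))
```

In this toy the reconstruction error shrinks as the number of DDIM steps grows, because inversion evaluates the noise prediction at the current point rather than at the (unknown) next one, and that approximation improves with smaller steps. A real video model adds further sources of mismatch that this sketch does not cover.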