Description
When I tried using the code at https://github.com/THUDM/CogVideo/blob/main/tools/convert_weight_sat2hf.py with modified parameters (based on the comments in the code, since I wanted to convert the SAT version of Tora's i2v to the diffusers version), I only changed a few argument defaults:

- `--num_layers=42`
- `--num_attention_heads=48`
- `--use_rotary_positional_embeddings=True`
- `--scaling_factor=0.7`
- `--snr_shift_scale=1.0`
- `--i2v=True`
- `--version=1.0` (I also tried `--version=1.5`)
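For reference, the flags above correspond to an invocation along these lines. This is a sketch: the checkpoint-path flag names (`--transformer_ckpt`, `--vae_ckpt`, `--output_path`) and the paths themselves are placeholders assumed here, not copied from the actual script.

```shell
# Hypothetical invocation sketch of the SAT -> diffusers conversion script;
# path flag names and paths are placeholders, only the listed flags were changed.
python tools/convert_weight_sat2hf.py \
    --transformer_ckpt /path/to/tora_i2v_sat_checkpoint.pt \
    --vae_ckpt /path/to/vae_checkpoint.pt \
    --output_path ./tora-i2v-diffusers \
    --num_layers=42 \
    --num_attention_heads=48 \
    --use_rotary_positional_embeddings=True \
    --scaling_factor=0.7 \
    --snr_shift_scale=1.0 \
    --i2v=True \
    --version=1.0
```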
After running the code, I encountered the following errors:
- Size mismatch for `patch_embed.proj.weight`: copying a parameter with shape `torch.Size([3072, 32, 2, 2])` from checkpoint while current model expects shape `torch.Size([3072, 128])`.
- Size mismatch for `proj_out.weight`: copying a parameter with shape `torch.Size([64, 3072])` from checkpoint while current model expects `torch.Size([128, 3072])`.
- Size mismatch for `proj_out.bias`: copying a parameter with shape `torch.Size([64])` from checkpoint while current model expects `torch.Size([128])`.

A similar issue was also mentioned here: #26 (comment).
Could you provide suggestions on how to resolve this problem? Additionally, if converting i2v to the diffusers version is necessary for execution, then aside from the transformer checkpoint (`transformer_ckpt`), do the existing VAE checkpoint (`vae_ckpt`) and text encoder also require conversion using this script? Or does Alibaba plan to release an official diffusers-compatible version of i2v? Looking forward to your reply.
Indeed, the weight file for the I2V diffusers version is not available. However, the i2v_pipeline code is available, and diffusers-version/inference.py supports I2V. One needs to convert the SAT-version weights to the diffusers version; you can refer to https://github.com/THUDM/CogVideo/blob/main/tools/convert_weight_sat2hf.py in https://github.com/THUDM/CogVideo/#tools to do this.
Originally posted by @zenmequmingzia in #30