Commit: Release code
wileewang committed Mar 31, 2024
1 parent 93e492c commit 91b94f5
Showing 41 changed files with 6,802 additions and 6,499 deletions.
2 changes: 1 addition & 1 deletion .gitignore
@@ -159,5 +159,5 @@ cython_debug/
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/

resources/
scripts/
outputs/
201 changes: 0 additions & 201 deletions LICENSE

This file was deleted.

105 changes: 104 additions & 1 deletion README.md
@@ -1 +1,104 @@
# MotionInversion

## Motion Inversion for Video Customization

[Luozhou Wang](https://wileewang.github.io/), [Guibao Shen](), [Yixun Liang](https://yixunliang.github.io/), [Xin Tao](http://www.xtao.website/), Pengfei Wan, Di Zhang, [Yijun Li](https://yijunmaverick.github.io/), [Yingcong Chen](https://www.yingcong.me)

HKUST(GZ), HKUST, Kuaishou Technology, Adobe Research.


We present a novel approach to motion customization in video generation, addressing the largely unexplored problem of representing motion within video generative models. Recognizing the unique challenges posed by the spatiotemporal nature of video, our method introduces **Motion Embeddings**, a set of explicit, temporally coherent one-dimensional embeddings derived from a given video. These embeddings are designed to integrate seamlessly with the temporal transformer modules of video diffusion models, modulating self-attention computations across frames without compromising spatial integrity. Furthermore, we identify a **Temporal Discrepancy** in video generative models, referring to variations in how different motion modules process temporal relationships between frames, and we leverage this understanding to optimize the integration of our motion embeddings.


<h4>Customize the motion of your video with fewer than 1M parameters in under 10 minutes.</h4>
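To make the idea concrete, below is a minimal sketch of how per-frame one-dimensional embeddings might be injected into a temporal attention block. This is an illustration under assumed shapes and module names, not the repository's actual implementation:

```python
import torch
import torch.nn as nn

class MotionEmbedding(nn.Module):
    """Sketch: one learnable 1-D embedding per frame, added to the hidden
    states seen by a temporal self-attention layer. The shapes and the
    injection point are assumptions for illustration."""

    def __init__(self, num_frames: int, dim: int):
        super().__init__()
        # (frames, dim): an explicit, temporally coherent set of embeddings.
        self.embed = nn.Parameter(torch.zeros(num_frames, dim))

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # hidden_states: (batch * spatial, frames, dim). Broadcasting the
        # embedding over the first axis modulates attention across frames
        # while leaving the spatial dimensions untouched.
        return hidden_states + self.embed[None, : hidden_states.shape[1]]
```

Only the embeddings are optimized while the diffusion model stays frozen, which is what keeps the trainable parameter count small.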

## 📰 News

* **[2024.03.31]** We have released the project page, arXiv paper, and training code.

## 🚧 Todo List
* [x] Released code for the UNet3D model (ZeroScope, ModelScope).
<!-- * [ ] Release detailed guidance for training and inference.
* [ ] Release Gradio demo. -->
* [ ] Release code for the Sora-like model (Open-Sora, Latte).



## Contents

* [Installation](#installation)
* [Training](#training)
* [Inference](#inference)
* [Acknowledgement](#acknowledgement)
* [Citation](#citation)

<!-- * [Motion Embeddings Hub](#motion-embeddings-hub) -->

## Installation

```bash
# install torch
pip install torch torchvision

# install diffusers and transformers
pip install diffusers==0.26.3 transformers==4.27.4
```
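Optionally, a quick check that PyTorch was installed with CUDA support:

```bash
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```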


## Training

To start training, first download the [ZeroScope](https://huggingface.co/cerspense/zeroscope_v2_576w) weights and specify their path in the config file. Then run the following command to begin training:

```bash
python train.py --config ./configs/train_config.yaml
```
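If you prefer to fetch the ZeroScope weights programmatically, here is a minimal sketch using `huggingface_hub` (assuming it is installed; point the model path in `train_config.yaml` at the returned directory):

```python
from huggingface_hub import snapshot_download

# Downloads the ZeroScope checkpoint to the local Hugging Face cache
# and prints the directory to reference in the training config.
local_dir = snapshot_download(repo_id="cerspense/zeroscope_v2_576w")
print(local_dir)
```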

Stay tuned for guidance on training other models and advanced usage!

## Inference

```bash
python inference.py --config ./configs/inference_config.yaml
```
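For reference, plain ZeroScope inference with `diffusers` looks roughly like the sketch below; the repository's `inference.py` additionally loads the trained motion embeddings specified in `inference_config.yaml`. The prompt and arguments here are illustrative:

```python
import torch
from diffusers import TextToVideoSDPipeline
from diffusers.utils import export_to_video

# Base ZeroScope inference without motion embeddings, for comparison.
pipe = TextToVideoSDPipeline.from_pretrained(
    "cerspense/zeroscope_v2_576w", torch_dtype=torch.float16
).to("cuda")

# frames[0] selects the first (and only) video in the batch.
frames = pipe("a corgi running on the beach", num_frames=24).frames[0]
export_to_video(frames, "sample.mp4")
```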

We will also provide a Gradio application in this repository.


## Acknowledgement

* [MotionDirector](https://github.com/showlab/MotionDirector): We follow their loss design and their techniques for reducing computational cost.
* [ZeroScope](https://huggingface.co/cerspense/zeroscope_v2_576w): One of the pretrained video checkpoints used in our main paper.
* [AnimateDiff](https://github.com/guoyww/animatediff/): One of the pretrained video checkpoints used in our main paper.
* [Latte](https://github.com/Vchitect/Latte): A video generation model with a similar architecture to Sora.
* [Open-Sora](https://github.com/hpcaitech/Open-Sora): A video generation model with a similar architecture to Sora.

We are grateful for their exceptional work and generous contribution to the open-source community.

## Citation

5 changes: 5 additions & 0 deletions dataset/__init__.py
@@ -0,0 +1,5 @@
from .cached_dataset import CachedDataset
from .image_dataset import ImageDataset
from .single_video_dataset import SingleVideoDataset
from .video_folder_dataset import VideoFolderDataset
from .video_json_dataset import VideoJsonDataset
17 changes: 17 additions & 0 deletions dataset/cached_dataset.py
@@ -0,0 +1,17 @@
import os

import torch
from torch.utils.data import Dataset

from utils.dataset_utils import *

class CachedDataset(Dataset):
    """Serves pre-computed latents saved as .pt files in a cache directory."""

    def __init__(self, cache_dir: str = ''):
        self.cache_dir = cache_dir
        self.cached_data_list = self.get_files_list()

    def get_files_list(self):
        # Collect every cached tensor file, sorted for deterministic ordering.
        tensors_list = [f"{self.cache_dir}/{x}" for x in os.listdir(self.cache_dir) if x.endswith('.pt')]
        return sorted(tensors_list)

    def __len__(self):
        return len(self.cached_data_list)

    def __getitem__(self, index):
        # Load the cached latent directly onto the first GPU.
        cached_latent = torch.load(self.cached_data_list[index], map_location='cuda:0')
        return cached_latent
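For context, a dataset like this would typically be consumed through a standard `DataLoader`; a usage sketch (not code from this commit), with a placeholder cache directory:

```python
from torch.utils.data import DataLoader

dataset = CachedDataset(cache_dir='./cache')  # hypothetical path
loader = DataLoader(dataset, batch_size=1, shuffle=True)
for cached_latent in loader:
    ...  # feed the pre-computed latents to the training step
```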