Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
0 parents, commit f9a0340
Showing 56 changed files with 5,167 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
tmp | ||
vox1 | ||
vox2 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,182 @@ | ||
Copyright © Niantic, Inc. 2018. Patent Pending. | ||
|
||
All rights reserved. | ||
|
||
|
||
|
||
================================================================================ | ||
|
||
|
||
|
||
This Software is licensed under the terms of the following Monodepth2 license | ||
which allows for non-commercial use only. For any other use of the software not | ||
covered by the terms of this license, please contact partnerships@nianticlabs.com | ||
|
||
|
||
|
||
================================================================================ | ||
|
||
|
||
|
||
Monodepth v2 License | ||
|
||
|
||
This Agreement is made by and between the Licensor and the Licensee as | ||
defined and identified below. | ||
|
||
|
||
1. Definitions. | ||
|
||
In this Agreement (“the Agreement”) the following words shall have the | ||
following meanings: | ||
|
||
"Authors" shall mean C. Godard, O. Mac Aodha, M. Firman, G. Brostow | ||
"Licensee" Shall mean the person or organization agreeing to use the | ||
Software in accordance with these terms and conditions. | ||
"Licensor" shall mean Niantic Inc., a company organized and existing under | ||
the laws of Delaware, whose principal place of business is at 1 Ferry Building, | ||
Suite 200, San Francisco, 94111. | ||
"Software" shall mean the MonoDepth v2 Software uploaded by Licensor to the | ||
GitHub repository at https://github.com/nianticlabs/monodepth2 | ||
on May 2nd 2019 in source code or object code form and any | ||
accompanying documentation as well as any modifications or additions uploaded | ||
to the same GitHub repository by Licensor. | ||
|
||
|
||
2. License. | ||
|
||
2.1 The Licensor has all necessary rights to grant a license under: (i) | ||
copyright and rights in the nature of copyright subsisting in the Software; and | ||
(ii) certain patent rights resulting from a patent application filed by the | ||
Licensor in the United States in connection with the Software. The Licensor | ||
grants the Licensee for the duration of this Agreement, a free of charge, | ||
non-sublicenseable, non-exclusive, non-transferable copyright and patent | ||
license (in consequence of said patent application) to use the Software for | ||
non-commercial purpose only, including teaching and research at educational | ||
institutions and research at not-for-profit research institutions in accordance | ||
with the provisions of this Agreement. Non-commercial use expressly excludes | ||
any profit-making or commercial activities, including without limitation sale, | ||
license, manufacture or development of commercial products, use in | ||
commercially-sponsored research, use at a laboratory or other facility owned or | ||
controlled (whether in whole or in part) by a commercial entity, provision of | ||
consulting service, use for or on behalf of any commercial entity, and use in | ||
research where a commercial party obtains rights to research results or any | ||
other benefit. Any use of the Software for any purpose other than | ||
non-commercial research shall automatically terminate this License. | ||
|
||
|
||
2.2 The Licensee is permitted to make modifications to the Software | ||
provided that any distribution of such modifications is in accordance with | ||
Clause 3. | ||
|
||
2.3 Except as expressly permitted by this Agreement and save to the | ||
extent and in the circumstances expressly required to be permitted by law, the | ||
Licensee is not permitted to rent, lease, sell, offer to sell, or loan the | ||
Software or its associated documentation. | ||
|
||
|
||
3. Redistribution and modifications | ||
|
||
3.1 The Licensee may reproduce and distribute copies of the Software, with | ||
or without modifications, in source format only and only to this same GitHub | ||
repository , and provided that any and every distribution is accompanied by an | ||
unmodified copy of this License and that the following copyright notice is | ||
always displayed in an obvious manner: Copyright © Niantic, Inc. 2018. All | ||
rights reserved. | ||
|
||
|
||
3.2 In the case where the Software has been modified, any distribution must | ||
include prominent notices indicating which files have been changed. | ||
|
||
3.3 The Licensee shall cause any work that it distributes or publishes, | ||
that in whole or in part contains or is derived from the Software or any part | ||
thereof (“Work based on the Software”), to be licensed as a whole at no charge | ||
to all third parties entitled to a license to the Software under the terms of | ||
this License and on the same terms provided in this License. | ||
|
||
|
||
4. Duration. | ||
|
||
This Agreement is effective until the Licensee terminates it by destroying | ||
the Software, any Work based on the Software, and its documentation together | ||
with all copies. It will also terminate automatically if the Licensee fails to | ||
abide by its terms. Upon automatic termination the Licensee agrees to destroy | ||
all copies of the Software, Work based on the Software, and its documentation. | ||
|
||
|
||
5. Disclaimer of Warranties. | ||
|
||
The Software is provided as is. To the maximum extent permitted by law, | ||
Licensor provides no warranties or conditions of any kind, either express or | ||
implied, including without limitation, any warranties or condition of title, | ||
non-infringement or fitness for a particular purpose. | ||
|
||
|
||
6. LIMITATION OF LIABILITY. | ||
|
||
IN NO EVENT SHALL THE LICENSOR AND/OR AUTHORS BE LIABLE FOR ANY DIRECT, | ||
INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY OR CONSEQUENTIAL DAMAGES (INCLUDING | ||
BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, | ||
DATA OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF | ||
LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE | ||
OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF | ||
ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. | ||
|
||
|
||
7. Indemnity. | ||
|
||
The Licensee shall indemnify the Licensor and/or Authors against all third | ||
party claims that may be asserted against or suffered by the Licensor and/or | ||
Authors and which relate to use of the Software by the Licensee. | ||
|
||
|
||
8. Intellectual Property. | ||
|
||
8.1 As between the Licensee and Licensor, copyright and all other | ||
intellectual property rights subsisting in or in connection with the Software | ||
and supporting information shall remain at all times the property of the | ||
Licensor. The Licensee shall acquire no rights in any such material except as | ||
expressly provided in this Agreement. | ||
|
||
8.2 No permission is granted to use the trademarks or product names of the | ||
Licensor except as required for reasonable and customary use in describing the | ||
origin of the Software and for the purposes of abiding by the terms of Clause | ||
3.1. | ||
|
||
8.3 The Licensee shall promptly notify the Licensor of any improvement or | ||
new use of the Software (“Improvements”) in sufficient detail for Licensor to | ||
evaluate the Improvements. The Licensee hereby grants the Licensor and its | ||
affiliates a non-exclusive, fully paid-up, royalty-free, irrevocable and | ||
perpetual license to all Improvements for non-commercial academic research and | ||
teaching purposes upon creation of such improvements. | ||
|
||
8.4 The Licensee grants an exclusive first option to the Licensor to be | ||
exercised by the Licensor within three (3) years of the date of notification of | ||
an Improvement under Clause 8.3 to use any the Improvement for commercial | ||
purposes on terms to be negotiated and agreed by Licensee and Licensor in good | ||
faith within a period of six (6) months from the date of exercise of the said | ||
option (including without limitation any royalty share in net income from such | ||
commercialization payable to the Licensee, as the case may be). | ||
|
||
|
||
9. Acknowledgements. | ||
|
||
The Licensee shall acknowledge the Authors and use of the Software in the | ||
publication of any work that uses, or results that are achieved through, the | ||
use of the Software. The following citation shall be included in the | ||
acknowledgement: “Digging Into Self-Supervised Monocular Depth Estimation, | ||
by C. Godard, O. Mac Aodha, M. Firman, G. Brostow, arXiv:1806.01260”. | ||
|
||
|
||
10. Governing Law. | ||
|
||
This Agreement shall be governed by, construed and interpreted in | ||
accordance with English law and the parties submit to the exclusive | ||
jurisdiction of the English courts. | ||
|
||
|
||
11. Termination. | ||
|
||
Upon termination of this Agreement, the licenses granted hereunder will | ||
terminate and Sections 5, 6, 7, 8, 9, 10 and 11 shall survive any termination | ||
of this Agreement. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,110 @@ | ||
|
||
## :book: The Face Depth Network of "Depth-Aware Generative Adversarial Network for Talking Head Video Generation" (CVPR 2022) | ||
<p align="center"> | ||
<small>:fire: If DaGAN is helpful in your photos/projects, please help to :star: it or recommend it to your friends. Thanks:fire:</small> | ||
</p> | ||
|
||
> [[Paper](https://arxiv.org/abs/2203.06605)]   [[Project Page](https://harlanhong.github.io/publications/dagan.html)]   [[Demo](https://huggingface.co/spaces/HarlanHong/DaGAN)]   [[Poster Video](https://www.youtube.com/watch?v=nahsJNjWzGo&t=1s)]<br> | ||
<!-- > [Fa-Ting Hong](https://harlanhong.github.io), [Longhao Zhang](https://dblp.org/pid/236/7382.html), [Li Shen](https://scholar.google.co.uk/citations?user=ABbCaxsAAAAJ&hl=en), [Dan Xu](https://www.danxurgb.net) <br> --> | ||
<!-- > The Hong Kong University of Science and Technology, Alibaba Cloud --> | ||
> [Fa-Ting Hong](https://harlanhong.github.io), [Longhao Zhang](), [Li Shen](), [Dan Xu](https://www.danxurgb.net) <br> | ||
> The Hong Kong University of Science and Technology | ||
### Cartoon Sample | ||
https://user-images.githubusercontent.com/19970321/162151632-0195292f-30b8-4122-8afd-9b1698f1e4fe.mp4 | ||
|
||
### Human Sample | ||
https://user-images.githubusercontent.com/19970321/162151327-f2930231-42e3-40f2-bfca-a88529599f0f.mp4 | ||
|
||
### Image Dataset | ||
<p align="center"> | ||
<img src="assets/pointcloud.jpg"> | ||
</p> | ||
|
||
|
||
|
||
## :wrench: Dependencies and Installation | ||
|
||
- Python >= 3.7 (we recommend [Anaconda](https://www.anaconda.com/download/#linux) or [Miniconda](https://docs.conda.io/en/latest/miniconda.html)) | ||
- [PyTorch >= 1.7](https://pytorch.org/) | ||
- Optional: NVIDIA GPU + [CUDA](https://developer.nvidia.com/cuda-downloads) | ||
- Optional: Linux | ||
|
||
### ⚙️ Setup | ||
|
||
1. Clone repo | ||
|
||
```bash | ||
git clone https://github.com/harlanhong/DaGAN-Head.git | ||
cd DaGAN-Head | ||
``` | ||
|
||
2. Install dependent packages | ||
|
||
```bash | ||
conda install pytorch=0.4.1 torchvision=0.2.1 -c pytorch | ||
pip install tensorboardX==1.4 | ||
conda install opencv=3.3.1 # just needed for evaluation | ||
``` | ||
Alternatively, you can use the [DaGAN](https://github.com/harlanhong/CVPR2022-DaGAN) environment directly. | ||
## :zap: Quick Inference | ||
|
||
### Pre-trained checkpoint | ||
The pre-trained checkpoint of the face depth network and our DaGAN checkpoints can be found at the following link: [OneDrive](https://hkustconnect-my.sharepoint.com/:f:/g/personal/fhongac_connect_ust_hk/EjfeXuzwo3JMn7s0oOPN_q0B81P5Wgu_kbYJAh7uSAKS2w?e=KaQcPk). | ||
|
||
**Inference!** | ||
To run a demo, download a checkpoint and run the following command to predict scaled disparity for a single image: | ||
|
||
|
||
```shell | ||
python test_simple.py --image_path assets/test_image.jpg --model_name tmp/You_Model/models/weights_19 | ||
``` | ||
|
||
## :computer: Training | ||
|
||
|
||
### Datasets | ||
|
||
1) **VoxCeleb**. Please follow the instructions at https://github.com/AliaksandrSiarohin/video-preprocessing. | ||
|
||
### Train on VoxCeleb | ||
To train a model on a specific dataset, run: | ||
```shell | ||
CUDA_VISIBLE_DEVICES=0 python train.py --batch_size 2 --height 256 --width 256 --dataset vox --sample_num 100000 --model_name taking_head_10w --data_path vox2 | ||
``` | ||
### Training on your own dataset | ||
You can train on a custom monocular or stereo dataset by writing a new dataloader class which inherits from `MonoDataset` – see the `CELEBDataset` class in `datasets/celeb_dataset.py` for an example. | ||
## ⏳ Training | ||
By default, models and TensorBoard event files are saved to `~/tmp/<model_name>`. | ||
This can be changed with the `--log_dir` flag. | ||
**Monocular training:** | ||
```shell | ||
CUDA_VISIBLE_DEVICES=1 python train.py --model_name mono_model_in --png --batch_size 12 | ||
``` | ||
|
||
|
||
|
||
## :scroll: Acknowledgement | ||
|
||
Our Face-Depth-Network implementation is borrowed from [Monodepth2](https://github.com/nianticlabs/monodepth2). We thank the authors of Monodepth2 for making their code publicly available. | ||
|
||
## :scroll: BibTeX | ||
|
||
``` | ||
@inproceedings{hong2022depth, | ||
title={Depth-Aware Generative Adversarial Network for Talking Head Video Generation}, | ||
author={Hong, Fa-Ting and Zhang, Longhao and Shen, Li and Xu, Dan}, | ||
booktitle={IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)}, | ||
year={2022} | ||
} | ||
``` | ||
|
||
### :e-mail: Contact | ||
|
||
If you have any questions, please email `fhongac@cse.ust.hk`. |
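The inference command in the README outputs scaled disparity. As a minimal sketch of how a network output in [0, 1] could be mapped to disparity and depth, the following uses the convention from Monodepth2, on which this code is based. The function name and the `min_depth`/`max_depth` values are illustrative assumptions, not values confirmed for this repository.

```python
from typing import Tuple


def disp_to_depth(disp: float, min_depth: float, max_depth: float) -> Tuple[float, float]:
    """Map a sigmoid output in [0, 1] to (scaled disparity, depth)."""
    min_disp = 1.0 / max_depth  # farthest depth gives the smallest disparity
    max_disp = 1.0 / min_depth  # nearest depth gives the largest disparity
    scaled_disp = min_disp + (max_disp - min_disp) * disp
    depth = 1.0 / scaled_disp
    return scaled_disp, depth


if __name__ == "__main__":
    # Assumed depth range, for illustration only.
    scaled, depth = disp_to_depth(0.5, min_depth=0.1, max_depth=100.0)
    print(f"scaled disparity={scaled:.3f}, depth={depth:.3f}")
```

Because depth is the reciprocal of the scaled disparity, the result is only defined up to the assumed depth range; a monocular model trained this way does not give metric depth on its own.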
Binary file not shown.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,82 @@ | ||
import glob | ||
import os | ||
import random | ||
# Each output line has the form '<video_dir> <frame_index>'; roughly a third of each video's frames are sampled. | ||
def produce_txt_list(): | ||
    videos = glob.glob('/data/fhongac/origDataset/Voxceleb2/vox2_train_frames/mp4/*/*/*') | ||
    train_num = 0 | ||
    test_num = 0 | ||
    trainfile = open('./celeb2/train_files.txt', 'w', encoding='UTF-8') | ||
    train = True  # toggle manually: True writes the train list, False the test list | ||
    testfile = open('./celeb2/test_file.txt', 'w', encoding='UTF-8') | ||
    for v in videos: | ||
        print(v) | ||
        if train: | ||
            imgs = os.listdir(v) | ||
            num = len(imgs) | ||
            if num <= 2: | ||
                continue | ||
            imglist = range(2, num) | ||
            choice = random.sample(imglist, min(len(imglist), num // 3))  # never oversample | ||
            for ch in choice: | ||
                string = v + ' ' + str(ch) + '\n' | ||
                trainfile.write(string) | ||
                train_num += 1 | ||
        else: | ||
            imgs = os.listdir(v) | ||
            num = len(imgs) | ||
            if num <= 2: | ||
                continue | ||
            imglist = range(2, num) | ||
            choice = random.sample(imglist, min(len(imglist), num // 3))  # never oversample | ||
            for ch in choice: | ||
                string = v + ' ' + str(ch) + '\n' | ||
                testfile.write(string) | ||
                test_num += 1 | ||
            if test_num > 1000:  # stop once enough test samples are written | ||
                testfile.close() | ||
                exit(0) | ||
    trainfile.close() | ||
    testfile.close() | ||
|
||
def produce_txt_list_celeb1(): | ||
    train_num = 0 | ||
    test_num = 0 | ||
    train = True  # toggle manually: True writes the train list, False the validation list | ||
    if train: | ||
        videos = glob.glob('/data/fhongac/origDataset/vox1_frames/train/*') | ||
        trainfile = open('./celeb1/train_files.txt', 'w', encoding='UTF-8') | ||
    else: | ||
        videos = glob.glob('/data/fhongac/origDataset/vox1_frames/test/*') | ||
        testfile = open('./celeb1/val_files.txt', 'w', encoding='UTF-8') | ||
    for v in videos: | ||
        print(v) | ||
        if train: | ||
            imgs = os.listdir(v) | ||
            num = len(imgs) | ||
            if num <= 2: | ||
                continue | ||
            imglist = range(2, num - 1) | ||
            # With num == 3 the range is empty, so cap the sample size to avoid a ValueError. | ||
            choice = random.sample(imglist, min(len(imglist), num // 3)) | ||
            for ch in choice: | ||
                string = v + ' ' + str(ch) + '\n' | ||
                trainfile.write(string) | ||
                train_num += 1 | ||
        else: | ||
            imgs = os.listdir(v) | ||
            num = len(imgs) | ||
            if num <= 2: | ||
                continue | ||
            imglist = range(2, num - 1) | ||
            choice = random.sample(imglist, min(len(imglist), num // 3)) | ||
            for ch in choice: | ||
                string = v + ' ' + str(ch) + '\n' | ||
                testfile.write(string) | ||
                test_num += 1 | ||
            if test_num > 1000:  # stop once enough validation samples are written | ||
                testfile.close() | ||
                exit(0) | ||
    (trainfile if train else testfile).close()  # only one of the two files is opened | ||
|
||
if __name__ == '__main__': | ||
    produce_txt_list_celeb1() |
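Each line written by the script above has the form `<video_dir> <frame_index>`. A minimal sketch of how such a line could be read back is shown here; `parse_split_line` is a hypothetical helper that is not part of the repository, and the example path is made up for illustration.

```python
from typing import Tuple


def parse_split_line(line: str) -> Tuple[str, int]:
    """Split '<video_dir> <frame_index>' into its two fields."""
    # Split on the last space so directory names containing spaces survive.
    video_dir, frame_index = line.rstrip("\n").rsplit(" ", 1)
    return video_dir, int(frame_index)


if __name__ == "__main__":
    # Hypothetical path, for illustration only.
    print(parse_split_line("/data/vox1_frames/train/id10001 17\n"))
```

Splitting from the right is a design choice: the frame index is always the last field, so the directory part can be kept verbatim.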
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
from .celeb_dataset import CELEBRAWDataset,CELEBFlipRAWDataset |
Binary file not shown.