A Python toolbox for face pose augmentation based on 3DDFA [2] and the authors' original Matlab code. Compared to the Matlab version, we fixed a few bugs and improved the algorithm to get better correspondence between the 2D landmarks on the original image and the landmarks on the warped image. This tool was used in our previous work on pose-invariant lip-reading [1]. We kindly request that you cite both [1] and [2] if you use this tool in your research.
- NumPy:
$pip3 install numpy
- SciPy:
$pip3 install scipy
- PyTorch:
$pip3 install torch torchvision
- OpenCV:
$pip3 install opencv-python
- cmake:
$pip3 install cmake
- igraph:
$pip3 install igraph
- matplotlib:
$pip3 install matplotlib
- shapely:
$pip3 install shapely
- cython:
$pip3 install cython
- ibug.face_detection: See this repository for details: https://github.com/hhj1897/face_detection.
- ibug.face_alignment: See this repository for details: https://github.com/hhj1897/face_alignment.
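Before moving on, it can help to check which of the prerequisites listed above are already importable. The following is a minimal sketch using only the standard library. The import names are assumptions derived from the package names (for example, opencv-python is imported as `cv2`), `missing_modules` is a hypothetical helper that is not part of this toolbox, and cmake is left out because it is a build tool rather than a runtime import.

```python
import importlib.util

# Assumed import names for the packages listed above; some differ from the
# pip package name (opencv-python is imported as cv2).
REQUIRED_MODULES = [
    "numpy", "scipy", "torch", "torchvision", "cv2", "igraph",
    "matplotlib", "shapely", "Cython",
    "ibug.face_detection", "ibug.face_alignment",
]


def missing_modules(names):
    """Return the subset of `names` that cannot be located for import."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised when a parent package (e.g. `ibug`) is absent.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    absent = missing_modules(REQUIRED_MODULES)
    print("Missing:", ", ".join(absent) if absent else "none")
```

Anything reported as missing can then be installed with the corresponding command from the list.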
git clone --recurse-submodules https://github.com/hhj1897/face_pose_augmentation.git
cd face_pose_augmentation
pip install -r requirements.txt
pip install -e .
Please install ibug.face_detection and ibug.face_alignment before running this test.
python face_pose_augmentation_test.py [-i webcam_index]
Please install ibug.face_detection and ibug.face_alignment before running this test.
python face_pose_augmentation_main.py -i "samples/images" -o "samples/outputs" -y -20
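To generate several yaw offsets in one go, the command above could be wrapped in a small driver. This is only a sketch: it uses just the `-i`, `-o` and `-y` flags shown above, `build_commands` is a hypothetical helper, and the one-sub-folder-per-yaw output layout is an assumption (it is not confirmed here whether the script creates missing output folders).

```python
import shlex
import sys


def build_commands(yaws, input_dir="samples/images", output_root="samples/outputs"):
    """Build one argv list per yaw angle for face_pose_augmentation_main.py."""
    commands = []
    for yaw in yaws:
        # Hypothetical layout: one output sub-folder per yaw value.
        output_dir = f"{output_root}/yaw_{yaw:+d}"
        commands.append([
            sys.executable, "face_pose_augmentation_main.py",
            "-i", input_dir, "-o", output_dir, "-y", str(yaw),
        ])
    return commands


if __name__ == "__main__":
    for cmd in build_commands([-20, -40]):
        # Print the command; pass it to subprocess.run(cmd, check=True) to execute.
        print(shlex.join(cmd))
```

Printing the commands first makes it easy to review them before running anything.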
from ibug.face_pose_augmentation import TDDFAPredictor, FacePoseAugmentor
# Instantiate 3DDFA
tddfa = TDDFAPredictor(device='cuda:0')
# Create the face pose augmentor
augmentor = FacePoseAugmentor()
# Fit 3DMM to the face specified by the 68 2D landmarks.
tddfa_result = TDDFAPredictor.decode(tddfa(image, landmarks, rgb=False))[0]
# Perform pose augmentation
# Note:
# 1. delta_poses should be an Nx3 array, each row giving the delta pitch,
# delta yaw, and delta roll of a target pose. This tool should only be
# used to increase the rotation of the face, as it cannot hallucinate
# occluded texture when used to frontalise the face.
# 2. landmarks should be a 68x2 array, storing the coordinates of the 68
# 2D landmarks. This is optional. When this argument is set to None,
# the function will try to infer 2D landmarks from the vertices on the
# 3D mesh.
# 3. This function returns a list of dictionaries, each element storing
# the warping result at a target pose, including the warped image, the
# correspondence map, and landmarks in different styles.
augmentation_results = augmentor(image, tddfa_result, delta_poses, landmarks)
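As an illustration of note 1, the sketch below builds an Nx3 `delta_poses` array and flags rows that would reduce the rotation of the face. Both helpers are hypothetical and not part of this toolbox. The check is one possible reading of "increase the rotation" (no angle moves closer to zero), and it assumes the current pose and the deltas use the same angle unit; the unit expected by the augmentor is not stated here and should be confirmed against the test scripts.

```python
import numpy as np


def make_delta_poses(yaw_steps, pitch=0.0, roll=0.0):
    """Stack (delta pitch, delta yaw, delta roll) rows into an Nx3 array."""
    yaw_steps = np.asarray(yaw_steps, dtype=float).reshape(-1)
    delta_poses = np.zeros((yaw_steps.size, 3))
    delta_poses[:, 0] = pitch
    delta_poses[:, 1] = yaw_steps
    delta_poses[:, 2] = roll
    return delta_poses


def increases_rotation(current_pose, delta_poses):
    """True for each row whose deltas never shrink |pitch|, |yaw| or |roll|."""
    current = np.asarray(current_pose, dtype=float).reshape(1, 3)
    target = current + np.asarray(delta_poses, dtype=float)
    return np.all(np.abs(target) >= np.abs(current), axis=1)


if __name__ == "__main__":
    # Assumed current pose (pitch, yaw, roll): the face is already turned to -15.
    deltas = make_delta_poses([-10, 10])
    print(increases_rotation([0.0, -15.0, 0.0], deltas))
```

Rows flagged as False move the face back towards frontal, which is the case the note warns about, so they are better filtered out before calling the augmentor.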
[1] Shiyang Cheng, Pingchuan Ma, Georgios Tzimiropoulos, Stavros Petridis, Adrian Bulat, Jie Shen, and Maja Pantic. "Towards pose-invariant lip-reading." In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2020), pp. 4357-4361.
[2] Xiangyu Zhu, Xiaoming Liu, Zhen Lei, and Stan Z. Li. "Face alignment in full pose range: A 3D total solution." IEEE Transactions on Pattern Analysis and Machine Intelligence 41, no. 1 (2017): 78-92.
[3] Jianzhu Guo, Xiangyu Zhu, Yang Yang, Fan Yan, Zhen Lei, and Stan Z. Li. "Towards Fast, Accurate and Stable 3D Dense Face Alignment." In Proceedings of the European Conference on Computer Vision (ECCV 2020), pp. 152-168.