1-) TensorFlowTTS models: to create speech from text. You can find the library in this link, but I recommend using my version of TensorFlowTTS and replacing any git clone with my link, as it solves a couple of problems: link
2-) Thin-Plate Spline Motion Model: to transfer the motion of a video onto the image, so the output video can have any motion you want. You can find the library in this link
3-) Wav2Lip model: to lip-sync the video to the new audio. It is hard to train because of the lack of available datasets (solved). You can find the library in this link
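
To make the way the three models fit together more concrete, here is a minimal sketch of how they could be chained. It is not the code from this repo: the stage order (speech first, then animation, then lip sync), the `PipelineStages` and `run_pipeline` names, and the stub functions are all assumptions made for illustration. In a real run, each stub would be replaced by a call into TensorFlowTTS, the Thin-Plate Spline Motion Model, and Wav2Lip respectively.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PipelineStages:
    """Hypothetical container for the three stages; names are illustrative only."""
    synthesize_speech: Callable[[str], str]   # text -> path of generated audio
    animate_image: Callable[[str, str], str]  # (image, driving video) -> path of animated video
    lip_sync: Callable[[str, str], str]       # (video, audio) -> path of final video


def run_pipeline(stages: PipelineStages, text: str,
                 image_path: str, driving_video_path: str) -> str:
    """Chain the stages in the assumed order and return the final video path."""
    audio_path = stages.synthesize_speech(text)
    animated_path = stages.animate_image(image_path, driving_video_path)
    return stages.lip_sync(animated_path, audio_path)


# Stub stages so the sketch runs without any model installed.
def fake_tts(text: str) -> str:
    return "speech.wav"


def fake_animate(image_path: str, driving_video_path: str) -> str:
    return "animated.mp4"


def fake_lip_sync(video_path: str, audio_path: str) -> str:
    return f"synced({video_path}+{audio_path})"


stub_stages = PipelineStages(fake_tts, fake_animate, fake_lip_sync)
result = run_pipeline(stub_stages, "Hello", "face.png", "driving.mp4")
print(result)
```

Passing the stages in as plain callables keeps the sketch independent of any particular library version, so one model can be swapped without touching the others.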
Although this is meant as a report, you may find parts of the code useful; the Kaggle code in particular will help you a lot with training problems (please refer to the Kaggle folder).
I also recommend opening it in Colab to see the results, but you will not be able to reproduce them for now.
I will enhance the usability of this code soon; until then, I recommend the Kaggle part as the best one to re-use.