📝 TO-DO List:

URGENT REQUIREMENTS

  • Change the mask used in seamless cloning and test the result
  • setup.bat / setup.sh
    • create venv
    • install requirements inside venv
  • CodeFormer arch initialization
  • Documentation
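
The setup item above calls for `setup.bat` / `setup.sh` scripts. As a rough cross-platform sketch of the same two steps (create a venv, install requirements inside it), the Python standard library can do both. The directory name `venv`, the file name `requirements.txt`, and the helper names `venv_python` and `setup` are assumptions made for illustration, not names taken from the project.

```python
import os
import subprocess
import venv
from pathlib import Path


def venv_python(env_dir: Path) -> Path:
    """Return the interpreter path inside a virtual environment.

    Windows venvs place the interpreter under Scripts/, POSIX ones under bin/.
    """
    if os.name == "nt":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def setup(env_dir: Path = Path("venv"),
          requirements: Path = Path("requirements.txt")) -> None:
    """Create the venv and install requirements inside it (hypothetical helper)."""
    # with_pip=True bootstraps pip into the new environment.
    venv.create(env_dir, with_pip=True)
    # Run pip through the venv's own interpreter so packages land in the venv.
    subprocess.run(
        [str(venv_python(env_dir)), "-m", "pip", "install", "-r", str(requirements)],
        check=True,
    )


if __name__ == "__main__":
    setup()
```

Calling pip through the environment's interpreter avoids having to activate the venv first, which is the part that usually differs between `.bat` and `.sh` scripts.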

PREPROCESS

  • Add a directory check at the start of inference.
  • Optimize preprocessing.
  • Clear RAM after no_face_filter.
  • Make face coordinates reusable:
    • Save facial coordinates as a .npy file.
    • Alter the code to also include eye coordinates.
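
One possible shape for the reusable-coordinates item, assuming each detected face is stored as an `(x1, y1, x2, y2)` box per frame. The box layout and the helper names `save_face_coords` and `load_face_coords` are assumptions for illustration; the project's actual detector output may differ.

```python
from pathlib import Path

import numpy as np


def save_face_coords(path, boxes):
    """Save per-frame face boxes to a .npy file.

    `boxes` is a sequence of (x1, y1, x2, y2) tuples, one per frame
    (an assumed layout, not the project's confirmed format).
    """
    arr = np.asarray(boxes, dtype=np.int32)
    if arr.ndim != 2 or arr.shape[1] != 4:
        raise ValueError("expected an array of shape (num_frames, 4)")
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    np.save(path, arr)


def load_face_coords(path):
    """Return the cached boxes, or None if no cache file exists yet."""
    path = Path(path)
    if not path.is_file():
        return None
    return np.load(path)
```

A caller could try `load_face_coords` first and run face detection only when it returns `None`, so later stages reuse the same boxes instead of detecting again.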

IMPROVING GAN UPSCALING

  • Merge the data pipeline with the preprocessor:
    • Remove the need to recrop, realign and rewarp the image.

IMPROVING WAV2LIP

  • Merge all data pipelines:
    • Remove the need to recrop, realign, renormalize, etc.
    • Devise a way to keep frames without a face in the video.
      • Understand mel spectrograms and how the wav2lip model works.
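
One way the "keep frames without a face" item could work, sketched under assumptions: only frames with a detected face are sent through the model, and the results are written back at their original positions so faceless frames pass through unchanged. The function name `process_keeping_faceless` and the `process_batch` callback signature are hypothetical. The original frame indices are passed along so a caller could pick the matching audio chunks by index.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Box = Tuple[int, int, int, int]  # assumed (x1, y1, x2, y2) layout


def process_keeping_faceless(
    frames: Sequence,
    boxes: Sequence[Optional[Box]],
    process_batch: Callable[[List, List[Box], List[int]], List],
) -> List:
    """Process only frames that have a face; leave the others untouched.

    `boxes[i]` is None when no face was detected in `frames[i]`.
    `process_batch(face_frames, face_boxes, indices)` is a hypothetical
    callback standing in for the lip-sync step; it must return one output
    frame per input frame, in the same order.
    """
    if len(frames) != len(boxes):
        raise ValueError("frames and boxes must have the same length")

    # Indices of frames where a face was found.
    face_idx = [i for i, box in enumerate(boxes) if box is not None]

    processed = process_batch(
        [frames[i] for i in face_idx],
        [boxes[i] for i in face_idx],
        face_idx,
    )
    if len(processed) != len(face_idx):
        raise ValueError("process_batch must return one frame per input frame")

    # Start from the original frames and overwrite only the processed ones.
    output = list(frames)
    for i, frame in zip(face_idx, processed):
        output[i] = frame
    return output
```

Because the output keeps the original frame count and order, the video stays aligned with the audio track even when some frames are skipped by the model.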

OPTIONAL

  • Gradio UI
    • A tab for video, audio and output.
    • A tab for image, audio and output.

FURTHER IMPROVEMENTS

  • Inference without restorer
  • Model Improvement
  • Implement no_face_filter too

COLAB NOTEBOOK

  • Make it intuitive with proper instructions.
  • Optimize Inference.
  • Implement Checks.

FUTURE PLANS

  • Per-face and per-audio lip sync using face recognition.
  • A separate tab for TTS.