Talking-Face Research Papers (With GPT Analysis)

Updated on 2025.01.25

Current Search Keywords: Talking Face, Talking Head, Visual Dubbing, Face Generation, Lip Sync, Talker, Portrait, Talking Video, Head Synthesis, Face Reenactment, Wav2Lip, Talking Avatar, Lip Generation, Lip-Synchronization, Portrait Animation, Facial Animation, Lip Expert

If you have any other keywords, please feel free to let us know :)
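
As an illustration of how a keyword list like this can be used to filter paper titles, here is a minimal sketch. It is an assumption for illustration only, not the repository's actual scrape code: the function names `normalize` and `matched_keywords` are hypothetical, and only a few of the keywords above are included.

```python
import re

# A few of the search keywords listed above (subset chosen for illustration).
KEYWORDS = [
    "Talking Face",
    "Talking Head",
    "Lip Sync",
    "Talker",
    "Portrait",
    "Face Reenactment",
]


def normalize(text: str) -> str:
    """Lowercase and collapse hyphens/whitespace so 'Talking-Head' matches 'Talking Head'."""
    return re.sub(r"[\s\-]+", " ", text.lower()).strip()


def matched_keywords(title: str, keywords=KEYWORDS) -> list:
    """Return the keywords that occur as substrings of the normalized title."""
    norm_title = normalize(title)
    return [kw for kw in keywords if normalize(kw) in norm_title]


if __name__ == "__main__":
    for title in (
        "VQTalker: Towards Multilingual Talking Avatars through Facial Motion Tokenization",
        "Talking-Heads Attention",
        "The Korrontea Data Modeling",
    ):
        print(title, "->", matched_keywords(title))
```

Loose substring matching of this kind also accepts titles such as "Talking-Heads Attention", which is about transformer attention rather than faces, so a stricter filter may be worth considering when proposing new keywords.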

We now support paper analysis by large language models. You can view this feature by clicking the Paper Analysis link below. Currently, we are experimenting with Claude.ai or Moonshot AI. The goal is to help everyone quickly skim the latest research papers.

Recent Trends (by AI)

Based on the provided snippets, these are the five most prominent keywords, together with the key themes, methodologies, findings, and shifts in perspective synthesized from the papers:

1. One-shot Talking Face Generation: The concept of generating realistic talking faces from a single image is a recurring theme across multiple papers. Techniques like NeRFFaceSpeech and AniTalker emphasize creating lifelike animations using minimal input data. These methods leverage generative models and audio-driven dynamics to produce natural-looking facial movements. The key challenge addressed is achieving high-quality synthesis while preserving identity and visual details.

2. Lip Synchronization and Audio-Visual Correlation: Ensuring accurate lip synchronization with corresponding audio is critical in talking face generation. Papers like "Audio-Visual Speech Representation Expert" and SwapTalk focus on synchronizing lip movements with audio while maintaining the visual quality of the generated faces. The methodologies involve advanced neural networks and latent space manipulation to enhance synchronization and minimize artifacts.

3. Real-time Rendering and Efficiency: The need for fast and efficient rendering is highlighted in works such as GSTalker. This model utilizes deformable Gaussian splatting to enable real-time audio-driven face generation. The emphasis is on reducing training time and improving rendering speeds without compromising the quality of the generated faces. This shift towards real-time applications reflects the growing demand for practical and scalable solutions in various domains.

4. Multimodal Emotion Representation: EMOPortraits introduces the integration of emotional expressions into talking face avatars. This approach enhances the realism and expressiveness of generated faces by incorporating emotion-driven dynamics. The methodology involves multimodal inputs and cross-driving synthesis, where avatars are animated with different emotional states, addressing the challenge of creating more engaging and lifelike digital avatars.

5. Identity Preservation and Customization: Maintaining the unique identity of the subject while generating talking faces is a crucial aspect explored in SwapTalk and AniTalker. These papers propose innovative solutions for identity-decoupled motion encoding and one-shot customization. The goal is to create personalized talking faces that retain the distinct features of the original subject, enabling applications in personalized media and communication.

Overall, the interconnectedness among these papers highlights a trend towards achieving higher realism, efficiency, and customization in talking face generation. The field is moving towards developing more practical and scalable solutions that can be applied in real-time scenarios, with an increasing focus on emotional expressiveness and identity preservation. Researchers are exploring advanced neural network architectures, generative models, and multimodal approaches to push the boundaries of what's possible in this rapidly evolving domain.

>>>> Each Paper Analysis (by AI) <<<<

Web Page (Scrape Code)
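
Each entry in the lists below follows the pattern `date, title, first author et.al., Paper: URL`, optionally followed by `, Code: URL`. For readers who want to process the list programmatically, here is a minimal parsing sketch. The `Entry` type and `parse_entry` function are hypothetical names introduced for illustration and are not part of the repository's scrape code; the sketch assumes that author names contain no commas.

```python
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    date: str
    title: str
    first_author: str
    paper_url: str
    code_url: Optional[str]


def parse_entry(line: str) -> Entry:
    """Parse one list entry into its fields."""
    # Split off the optional trailing ", Code: <url>" field first.
    head, sep, code_url = line.strip().rpartition(", Code: ")
    if not sep:  # no code link: rpartition put the whole line in the last slot
        head, code_url = line.strip(), None
    head, _, paper_url = head.rpartition(", Paper: ")
    date, _, rest = head.partition(", ")
    # Titles may contain commas, so take the author from the right-hand side.
    title, _, author = rest.rpartition(", ")
    if author.endswith(" et.al."):
        author = author[: -len(" et.al.")]
    return Entry(date, title, author, paper_url, code_url)


if __name__ == "__main__":
    sample = (
        "2017-07-18, You said that?, Joon Son Chung et.al., "
        "Paper: http://arxiv.org/abs/1705.02966"
    )
    print(parse_entry(sample))
```

Splitting from the right keeps titles that contain commas intact, since only the date, author, and URL fields have a fixed position.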

Table of Contents

Talking Face
Image Animation

Talking Face

2023-12-13, uTalk: Bridging the Gap Between Humans and AI, Hussam Azzuni et.al., Paper: http://arxiv.org/abs/2310.02739
2023-05-09, Zero-shot personalized lip-to-speech synthesis with face image based voice control, Zheng-Yan Sheng et.al., Paper: http://arxiv.org/abs/2305.14359
2017-07-18, You said that?, Joon Son Chung et.al., Paper: http://arxiv.org/abs/1705.02966
2024-03-27, X-Portrait: Expressive Portrait Animation with Hierarchical Motion Attention, You Xie et.al., Paper: http://arxiv.org/abs/2403.15931
2021-05-07, Write-a-speaker: Text-based Emotional and Rhythmic Talking-head Generation, Lincheng Li et.al., Paper: http://arxiv.org/abs/2104.07995, Code: https://github.com/FuxiVirtualHuman/Write-a-Speaker
2020-05-07, What comprises a good talking-head video generation?: A Survey and Benchmark, Lele Chen et.al., Paper: http://arxiv.org/abs/2005.03201, Code: https://github.com/lelechen63/talking-head-generation-survey
2023-12-07, VividTalk: One-Shot Audio-Driven Talking Head Generation Based on 3D Hybrid Prior, Xusen Sun et.al., Paper: http://arxiv.org/abs/2312.01841
2022-07-22, Visual Speech-Aware Perceptual 3D Facial Expression Reconstruction from Videos, Panagiotis P. Filntisis et.al., Paper: http://arxiv.org/abs/2207.11094, Code: https://github.com/filby89/spectre
2014-09-03, Visual Speech Recognition, Ahmad B. A. Hassanat et.al., Paper: http://arxiv.org/abs/1409.1411
2018-05-24, VisemeNet: Audio-Driven Animator-Centric Speech Animation, Yang Zhou et.al., Paper: http://arxiv.org/abs/1805.09488
2022-07-20, VisageSynTalk: Unseen Speaker Video-to-Speech Synthesis via Speech-Visage Feature Selection, Joanna Hong et.al., Paper: http://arxiv.org/abs/2206.07458
2024-03-22, Virbo: Multimodal Multilingual Avatar Video Generation in Digital Marketing, Juan Zhang et.al., Paper: http://arxiv.org/abs/2403.11700
2022-11-27, VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild, Kun Cheng et.al., Paper: http://arxiv.org/abs/2211.14758
2025-01-07, VideoAnydoor: High-fidelity Video Object Insertion with Precise Motion Control, Yuanpeng Tu et.al., Paper: http://arxiv.org/abs/2501.01427
2021-10-26, ViDA-MAN: Visual Dialog with Digital Humans, Tong Shen et.al., Paper: http://arxiv.org/abs/2110.13384
2023-08-11, Versatile Face Animator: Driving Arbitrary 3D Facial Avatar in RGBD Space, Haoyu Wang et.al., Paper: http://arxiv.org/abs/2308.06076, Code: https://github.com/why986/VFA
2023-12-18, VectorTalker: SVG Talking Face Generation with Progressive Vectorisation, Hao Hu et.al., Paper: http://arxiv.org/abs/2312.11568
2023-04-24, VR Facial Animation for Immersive Telepresence Avatars, Andre Rochow et.al., Paper: http://arxiv.org/abs/2304.12051
2024-12-18, VQTalker: Towards Multilingual Talking Avatars through Facial Motion Tokenization, Tao Liu et.al., Paper: http://arxiv.org/abs/2412.09892
2023-08-11, VAST: Vivify Your Talking Avatar via Zero-Shot Expressive Facial Style Transfer, Liyang Chen et.al., Paper: http://arxiv.org/abs/2308.04830
2024-04-16, VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time, Sicheng Xu et.al., Paper: http://arxiv.org/abs/2404.10667
2024-11-29, V2SFlow: Video-to-Speech Generation with Speech Decomposition and Rectified Flow, Jeongsoo Choi et.al., Paper: http://arxiv.org/abs/2411.19486
2012-09-22, Using multimodal speech production data to evaluate articulatory animation for audiovisual speech synthesis, Ingmar Steiner et.al., Paper: http://arxiv.org/abs/1209.4982
2022-05-27, Unsupervised Voice-Face Representation Learning by Cross-Modal Prototype Contrast, Boqing Zhu et.al., Paper: http://arxiv.org/abs/2204.14057, Code: https://github.com/cocoxili/cmpc
2023-09-01, Unsupervised Learning of Style-Aware Facial Animation from Real Acting Performances, Wolfgang Paier et.al., Paper: http://arxiv.org/abs/2306.10006
2024-07-17, Universal Facial Encoding of Codec Avatars from VR Headsets, Shaojie Bai et.al., Paper: http://arxiv.org/abs/2407.13038
2024-08-01, UniTalker: Scaling up Audio-Driven 3D Facial Animation through A Unified Model, Xiangyu Fan et.al., Paper: http://arxiv.org/abs/2408.00762
2021-08-12, UniFaceGAN: A Unified Framework for Temporally Consistent Facial Video Editing, Meng Cao et.al., Paper: http://arxiv.org/abs/2108.05650
2024-12-26, UniAvatar: Taming Lifelike Audio-Driven Talking Head Generation with Comprehensive Motion and Lighting Control, Wenzhang Sun et.al., Paper: http://arxiv.org/abs/2412.19860
2022-04-03, Txt2Vid: Ultra-Low Bitrate Compression of Talking-Head Videos via Text, Pulkit Tandon et.al., Paper: http://arxiv.org/abs/2106.14014, Code: https://github.com/tpulkit/txt2vid
2022-04-06, Transformer-S2A: Robust and Efficient Speech-to-Animation, Liyang Chen et.al., Paper: http://arxiv.org/abs/2111.09771
2023-12-23, TransFace: Unit-Based Audio-Visual Speech Synthesizer for Talking Head Translation, Xize Cheng et.al., Paper: http://arxiv.org/abs/2312.15197
2024-12-16, Towards a Universal Synthetic Video Detector: From Face or Background Manipulations to Fully AI-Generated Content, Rohit Kundu et.al., Paper: http://arxiv.org/abs/2412.12278
2024-04-07, Towards a Simultaneous and Granular Identity-Expression Control in Personalized Face Generation, Renshuai Liu et.al., Paper: http://arxiv.org/abs/2401.01207
2022-01-17, Towards Realistic Visual Dubbing with Heterogeneous Sources, Tianyi Xie et.al., Paper: http://arxiv.org/abs/2201.06260
2022-10-04, Towards MOOCs for Lipreading: Using Synthetic Talking Heads to Train Humans in Lipreading at Scale, Aditya Agarwal et.al., Paper: http://arxiv.org/abs/2208.09796
2018-11-22, Towards Highly Accurate and Stable Face Alignment for High-Resolution Videos, Ying Tai et.al., Paper: http://arxiv.org/abs/1811.00342, Code: https://github.com/tyshiwo/FHR_alignment
2020-03-01, Towards Automatic Face-to-Face Translation, Prajwal K R et.al., Paper: http://arxiv.org/abs/2003.00418, Code: https://github.com/Rudrabha/LipGAN
2023-08-24, ToonTalker: Cross-Domain Face Reenactment, Yuan Gong et.al., Paper: http://arxiv.org/abs/2308.12866
2024-10-15, Titanic Calling: Low Bandwidth Video Conference from the Titanic Wreck, Fevziye Irem Eyiokur et.al., Paper: http://arxiv.org/abs/2410.11434
2022-02-22, Thinking the Fusion Strategy of Multi-reference Face Reenactment, Takuya Yashima et.al., Paper: http://arxiv.org/abs/2202.10758
2022-03-29, Thin-Plate Spline Motion Model for Image Animation, Jian Zhao et.al., Paper: http://arxiv.org/abs/2203.14367, Code: https://github.com/yoyo-nb/thin-plate-spline-motion-model
2024-07-24, The impact of differences in facial features between real speakers and 3D face models on synthesized lip motions, Rabab Algadhy et.al., Paper: http://arxiv.org/abs/2407.17253
2023-10-23, The Self 2.0: How AI-Enhanced Self-Clones Transform Self-Perception and Improve Presentation Skills, Qingxiao Zheng et.al., Paper: http://arxiv.org/abs/2310.15112
2008-12-16, The Korrontea Data Modeling, Emmanuel Bouix et.al., Paper: http://arxiv.org/abs/0812.2988
2024-06-24, The Effects of Embodiment and Personality Expression on Learning in LLM-based Educational Agents, Sinan Sonlu et.al., Paper: http://arxiv.org/abs/2407.10993
2023-09-18, That's What I Said: Fully-Controllable Talking Face Generation, Youngjoon Jang et.al., Paper: http://arxiv.org/abs/2304.03275
2022-01-22, Text2Video: Text-driven Talking-head Video Synthesis with Personalized Phoneme-Pose Dictionary, Sibo Zhang et.al., Paper: http://arxiv.org/abs/2104.14631
2022-05-31, Text/Speech-Driven Full-Body Animation, Wenlin Zhuang et.al., Paper: http://arxiv.org/abs/2205.15573
2023-08-12, Text-to-Video: a Two-stage Framework for Zero-shot Identity-agnostic Talking-head Generation, Zhichao Wang et.al., Paper: http://arxiv.org/abs/2308.06457, Code: https://github.com/zhichaowang970201/text-to-video
2024-01-18, Text-driven Talking Face Synthesis by Reprogramming Audio-driven Models, Jeongsoo Choi et.al., Paper: http://arxiv.org/abs/2306.16003
2024-07-20, Text-based Talking Video Editing with Cascaded Conditional Diffusion, Bo Han et.al., Paper: http://arxiv.org/abs/2407.14841
2019-06-04, Text-based Editing of Talking-head Video, Ohad Fried et.al., Paper: http://arxiv.org/abs/1906.01524
2024-04-23, TalkingGaussian: Structure-Persistent 3D Talking Head Synthesis via Gaussian Splatting, Jiahe Li et.al., Paper: http://arxiv.org/abs/2404.15264
2025-01-17, TalkingEyes: Pluralistic Speech-Driven 3D Eye Gaze Animation, Yixiang Zhuang et.al., Paper: http://arxiv.org/abs/2501.09921
2020-07-16, Talking-head Generation with Rhythmic Head Motion, Lele Chen et.al., Paper: http://arxiv.org/abs/2007.08547, Code: https://github.com/lelechen63/Talking-head-Generation-with-Rhythmic-Head-Motion
2020-03-05, Talking-Heads Attention, Noam Shazeer et.al., Paper: http://arxiv.org/abs/2003.02436, Code: https://github.com/zygmuntz/hyperband
2024-06-13, Talking Heads: Understanding Inter-layer Communication in Transformer Language Models, Jack Merullo et.al., Paper: http://arxiv.org/abs/2406.09519
2023-11-30, Talking Head(?) Anime from a Single Image 4: Improved Model and Its Distillation, Pramook Khungurn et.al., Paper: http://arxiv.org/abs/2311.17409
2022-09-09, Talking Head from Speech Audio using a Pre-trained Image Generator, Mohammed M. Alghamdi et.al., Paper: http://arxiv.org/abs/2209.04252
2022-12-07, Talking Head Generation with Probabilistic Audio-to-Visual Diffusion Priors, Zhentao Yu et.al., Paper: http://arxiv.org/abs/2212.04248
2021-10-19, Talking Head Generation with Audio and Speech Related Facial Action Units, Sen Chen et.al., Paper: http://arxiv.org/abs/2110.09951
2022-04-27, Talking Head Generation Driven by Speech-Related Facial Action Units and Audio- Based on Multimodal Representation Fusion, Sen Chen et.al., Paper: http://arxiv.org/abs/2204.12756
2022-05-13, Talking Face Generation with Multilingual TTS, Hyoung-Kyu Song et.al., Paper: http://arxiv.org/abs/2205.06421
2019-07-25, Talking Face Generation by Conditional Recurrent Adversarial Network, Yang Song et.al., Paper: http://arxiv.org/abs/1804.04786, Code: https://github.com/susanqq/Talking_Face_Generation
2019-04-23, Talking Face Generation by Adversarially Disentangled Audio-Visual Representation, Hang Zhou et.al., Paper: http://arxiv.org/abs/1807.07860
2024-08-25, TalkLoRA: Low-Rank Adaptation for Speech-Driven Animation, Jack Saunders et.al., Paper: http://arxiv.org/abs/2408.13714
2023-04-01, TalkCLIP: Talking Head Generation with Text-Guided Expressive Speaking Styles, Yifeng Ma et.al., Paper: http://arxiv.org/abs/2304.00334
2024-03-29, Talk3D: High-Fidelity Talking Portrait Synthesis via Personalized 3D Generative Prior, Jaehoon Ko et.al., Paper: http://arxiv.org/abs/2403.20153, Code: https://github.com/KU-CVLAB/Talk3D
2024-10-18, Takin-ADA: Emotion Controllable Audio-Driven Animation with Canonical and Landmark Loss Optimization, Bin Lin et.al., Paper: http://arxiv.org/abs/2410.14283
2025-01-06, Takeaways from Applying LLM Capabilities to Multiple Conversational Avatars in a VR Pilot Study, Mykola Maslych et.al., Paper: http://arxiv.org/abs/2501.00168
2024-04-13, THQA: A Perceptual Quality Assessment Database for Talking Heads, Yingjie Zhou et.al., Paper: http://arxiv.org/abs/2404.09003, Code: https://github.com/zyj-2000/thqa
2023-11-28, THInImg: Cross-modal Steganography for Presenting Talking Heads in Images, Lin Zhao et.al., Paper: http://arxiv.org/abs/2311.17177
2024-10-14, TALK-Act: Enhance Textural-Awareness for 2D Speaking Avatar Reenactment with Diffusion Model, Jiazhi Guan et.al., Paper: http://arxiv.org/abs/2410.10696
2023-11-08, Synthetic Speaking Children -- Why We Need Them and How to Make Them, Muhammad Ali Farooq et.al., Paper: http://arxiv.org/abs/2311.06307
2023-03-24, Synthesizing Photorealistic Virtual Humans Through Cross-modal Disentanglement, Siddarth Ravichandran et.al., Paper: http://arxiv.org/abs/2209.01320
2024-12-01, Synergizing Motion and Appearance: Multi-Scale Compensatory Codebooks for Talking Head Video Generation, Shuling Zhao et.al., Paper: http://arxiv.org/abs/2412.00719
2022-11-03, SyncTalkFace: Talking Face Generation with Precise Lip-Syncing via Audio-Lip Memory, Se Jin Park et.al., Paper: http://arxiv.org/abs/2211.00924
2023-11-29, SyncTalk: The Devil is in the Synchronization for Talking Head Synthesis, Ziqiao Peng et.al., Paper: http://arxiv.org/abs/2311.17590, Code: https://github.com/ZiqiaoPeng/SyncTalk
2024-05-09, SwapTalk: Audio-Driven Talking Face Generation with One-Shot Customization in Latent Space, Zeren Zhang et.al., Paper: http://arxiv.org/abs/2405.05636
2024-03-26, Superior and Pragmatic Talking Face Generation with Teacher-Student Framework, Chao Liang et.al., Paper: http://arxiv.org/abs/2403.17883
2024-03-15, StyleTalker: One-shot Style-based Audio-driven Talking Head Video Generation, Dongchan Min et.al., Paper: http://arxiv.org/abs/2208.10922
2023-06-10, StyleTalk: One-shot Talking Head Generation with Controllable Speaking Styles, Yifeng Ma et.al., Paper: http://arxiv.org/abs/2301.01081, Code: https://github.com/fuxivirtualhuman/styletalk
2024-09-14, StyleTalk++: A Unified Framework for Controlling the Speaking Styles of Talking Heads, Suzhen Wang et.al., Paper: http://arxiv.org/abs/2409.09292
2023-05-09, StyleSync: High-Fidelity Generalized and Personalized Lip Sync in Style-based Generator, Jiazhi Guan et.al., Paper: http://arxiv.org/abs/2305.05445
2022-09-27, StyleMask: Disentangling the Style Space of StyleGAN2 for Neural Face Reenactment, Stella Bounareli et.al., Paper: http://arxiv.org/abs/2209.13375, Code: https://github.com/stelabou/stylemask
2024-02-12, StyleLipSync: Style-based Personalized Lip-sync Video Generation, Taekyung Ki et.al., Paper: http://arxiv.org/abs/2305.00521
2022-03-17, StyleHEAT: One-Shot High-Resolution Editable Talking Face Generation via Pre-trained StyleGAN, Fei Yin et.al., Paper: http://arxiv.org/abs/2203.04036, Code: https://github.com/FeiiYin/StyleHEAT
2024-02-21, StyleDubber: Towards Multi-Scale Style Learning for Movie Dubbing, Gaoxiang Cong et.al., Paper: http://arxiv.org/abs/2402.12636
2023-05-01, StyleAvatar: Real-time Photo-realistic Portrait Avatar from a Single Video, Lizhen Wang et.al., Paper: http://arxiv.org/abs/2305.00942, Code: https://github.com/lizhenwangt/styleavatar
2024-03-12, Style2Talker: High-Resolution Talking Head Generation with Emotion Style and Art Style, Shuai Tan et.al., Paper: http://arxiv.org/abs/2403.06365
2024-08-10, Style-Preserving Lip Sync via Audio-Aware Style Reference, Weizhi Zhong et.al., Paper: http://arxiv.org/abs/2408.05412
2023-03-22, Style Transfer for 2D Talking Head Animation, Trong-Thang Pham et.al., Paper: http://arxiv.org/abs/2303.09799, Code: https://github.com/aioz-ai/audiodrivenstyletransfer
2023-12-11, Study of Non-Verbal Behavior in Conversational Agents, Camila Vicari Maccari et.al., Paper: http://arxiv.org/abs/2312.06530
2021-10-07, Streaming Transformer Transducer Based Speech Recognition Using Non-Causal Convolution, Yangyang Shi et.al., Paper: http://arxiv.org/abs/2110.05241
2020-11-21, Stochastic Talking Face Generation Using Latent Distribution Matching, Ravindra Yadav et.al., Paper: http://arxiv.org/abs/2011.10727, Code: https://github.com/ry85/Stochastic-Talking-Face-Generation-Using-Latent-Distribution-Matching
2022-01-21, Stitch it in Time: GAN-Based Facial Editing of Real Videos, Rotem Tzaban et.al., Paper: http://arxiv.org/abs/2201.08361, Code: https://github.com/rotemtzaban/STIT
2024-10-31, Stereo-Talker: Audio-driven 3D Human Synthesis with Prior-Guided Mixture-of-Experts, Xiang Deng et.al., Paper: http://arxiv.org/abs/2410.23836
2022-08-29, StableFace: Analyzing and Improving Motion Stability for Talking Face Generation, Jun Ling et.al., Paper: http://arxiv.org/abs/2208.13717
2024-09-26, Stable Video Portraits, Mirela Ostrek et.al., Paper: http://arxiv.org/abs/2409.18083
2024-04-08, SphereHead: Stable 3D Full-head Synthesis with Spherical Tri-plane Representation, Heyuan Li et.al., Paper: http://arxiv.org/abs/2404.05680
2021-07-10, Speech2Video: Cross-Modal Distillation for Speech to Video Generation, Shijing Si et.al., Paper: http://arxiv.org/abs/2107.04806
2023-09-09, Speech2Lip: High-fidelity Speech to Lip Generation by Learning from a Short Video, Xiuzhe Wu et.al., Paper: http://arxiv.org/abs/2309.04814, Code: https://github.com/cvmi-lab/speech2lip
2020-02-19, Speech-driven facial animation using polynomial fusion of features, Triantafyllos Kefalas et.al., Paper: http://arxiv.org/abs/1912.05833
2018-03-20, Speech-Driven Facial Reenactment Using Conditional Generative Adversarial Networks, Seyed Ali Jalalifar et.al., Paper: http://arxiv.org/abs/1803.07461
2021-11-29, Speech Drives Templates: Co-Speech Gesture Synthesis with Learned Templates, Shenhan Qian et.al., Paper: http://arxiv.org/abs/2108.08020, Code: https://github.com/shenhanqian/speechdrivestemplates
2021-07-21, Speech Driven Talking Face Generation from a Single Image and an Emotion Condition, Sefik Emre Eskimez et.al., Paper: http://arxiv.org/abs/2008.03592, Code: https://github.com/eeskimez/emotalkingface
1994-06-01, Speech Dialogue with Facial Displays: Multimodal Human-Computer Conversation, Katashi Nagao et.al., Paper: http://arxiv.org/abs/cmp-lg/9406002
2020-08-04, Speaker dependent acoustic-to-articulatory inversion using real-time MRI of the vocal tract, Tamás Gábor Csapó et.al., Paper: http://arxiv.org/abs/2008.02098, Code: https://github.com/BME-SmartLab/speech2mri
2020-06-20, Speaker Independent and Multilingual/Mixlingual Speech-Driven Talking Head Generation Using Phonetic Posteriorgrams, Huirong Huang et.al., Paper: http://arxiv.org/abs/2006.11610
2022-10-13, Sparse in Space and Time: Audio-visual Synchronisation with Trainable Selectors, Vladimir Iashin et.al., Paper: http://arxiv.org/abs/2210.07055, Code: https://github.com/v-iashin/sparsesync
2024-11-25, Sonic: Shifting Focus to Global Audio Perception in Portrait Animation, Xiaozhong Ji et.al., Paper: http://arxiv.org/abs/2411.16331
2021-08-06, SofGAN: A Portrait Image Generator with Dynamic Styling, Anpei Chen et.al., Paper: http://arxiv.org/abs/2007.03780, Code: https://github.com/apchenstu/sofgan
2021-04-07, Single Source One Shot Reenactment using Weighted motion From Paired Feature Points, Soumya Tripathy et.al., Paper: http://arxiv.org/abs/2104.03117
2023-12-08, SingingHead: A Large-scale 4D Dataset for Singing Head Animation, Sijing Wu et.al., Paper: http://arxiv.org/abs/2312.04369
2009-12-03, Sequential Clustering based Facial Feature Extraction Method for Automatic Creation of Facial Models from Orthogonal Views, Alireza Ghahari et.al., Paper: http://arxiv.org/abs/0912.0600
2023-08-30, SelfTalk: A Self-Supervised Commutative Training Diagram to Comprehend 3D Talking Faces, Ziqiao Peng et.al., Paper: http://arxiv.org/abs/2306.10799, Code: https://github.com/psyai-net/SelfTalk_release
2022-01-24, Selective Listening by Synchronizing Speech with Lips, Zexu Pan et.al., Paper: http://arxiv.org/abs/2106.07150, Code: https://github.com/zexupan/reentry
2024-09-05, SegTalker: Segmentation-based Talking Face Generation with Mask-guided Local Editing, Lingyu Xiong et.al., Paper: http://arxiv.org/abs/2409.03605
2020-09-02, Seeing wake words: Audio-visual Keyword Spotting, Liliane Momeni et.al., Paper: http://arxiv.org/abs/2009.01225
2023-03-29, Seeing What You Said: Talking Face Generation Guided by a Lip Reading Expert, Jiadong Wang et.al., Paper: http://arxiv.org/abs/2303.17480, Code: https://github.com/sxjdwang/talklip
2024-03-19, ScanTalk: 3D Talking Heads from Unregistered Scans, Federico Nocentini et.al., Paper: http://arxiv.org/abs/2403.10942
2024-03-13, Say Anything with Any Style, Shuai Tan et.al., Paper: http://arxiv.org/abs/2403.06363
2024-08-18, S^3D-NeRF: Single-Shot Speech-Driven Neural Radiance Field for High Fidelity Talking Head Synthesis, Dongze Li et.al., Paper: http://arxiv.org/abs/2408.09347
2024-09-05, SVP: Style-Enhanced Vivid Portrait Talking Head Diffusion Model, Weipeng Tan et.al., Paper: http://arxiv.org/abs/2409.03270
2022-12-07, SPACE: Speech-driven Portrait Animation with Controllable Expression, Siddharth Gururani et.al., Paper: http://arxiv.org/abs/2211.09809
2020-10-05, SMILE: Semantically-guided Multi-attribute Image and Layout Editing, Andrés Romero et.al., Paper: http://arxiv.org/abs/2010.02315, Code: https://github.com/affromero/SMILE
2024-12-04, SINGER: Vivid Audio-driven Singing Video Generation with Multi-scale Spectral Diffusion Model, Yan Li et.al., Paper: http://arxiv.org/abs/2412.03430
2024-01-25, SAiD: Speech-driven Blendshape Facial Animation with Diffusion, Inkyu Park et.al., Paper: http://arxiv.org/abs/2401.08655, Code: https://github.com/yunik1004/said
2023-07-03, RobustL2S: Speaker-Specific Lip-to-Speech Synthesis exploiting Self-Supervised Representations, Neha Sahipjohn et.al., Paper: http://arxiv.org/abs/2307.01233
2020-12-14, Robust One Shot Audio to Video Generation, Neeraj Kumar et.al., Paper: http://arxiv.org/abs/2012.07842
2022-09-07, Restructurable Activation Networks, Kartikeya Bhardwaj et.al., Paper: http://arxiv.org/abs/2208.08562, Code: https://github.com/arm-software/ml-restructurable-activation-networks
2022-07-20, Responsive Listening Head Generation: A Benchmark Dataset and Baseline, Mohan Zhou et.al., Paper: http://arxiv.org/abs/2112.13548
2024-02-26, Resolution-Agnostic Neural Compression for High-Fidelity Portrait Video Conferencing via Implicit Radiance Fields, Yifei Li et.al., Paper: http://arxiv.org/abs/2402.16599
2023-05-22, RenderMe-360: A Large Digital Asset Library and Benchmarks Towards High-fidelity Head Avatars, Dongwei Pan et.al., Paper: http://arxiv.org/abs/2305.13353, Code: https://github.com/renderme-360/renderme-360
2023-06-08, ReliableSwap: Boosting General Face Swapping Via Reliable Supervision, Ge Yuan et.al., Paper: http://arxiv.org/abs/2306.05356, Code: https://github.com/ygtxr1997/reliableswap
2018-07-29, ReenactGAN: Learning to Reenact Faces via Boundary Transfer, Wayne Wu et.al., Paper: http://arxiv.org/abs/1807.11079, Code: https://github.com/wywu/ReenactGAN
2024-08-01, Reenact Anything: Semantic Video Motion Transfer Using Motion-Textual Inversion, Manuel Kansy et.al., Paper: http://arxiv.org/abs/2408.00458
2019-06-14, Realistic Speech-Driven Facial Animation with GANs, Konstantinos Vougioukas et.al., Paper: http://arxiv.org/abs/1906.06337
2020-03-29, Realistic Face Reenactment via Self-Supervised Disentangling of Identity and Pose, Xianfang Zeng et.al., Paper: http://arxiv.org/abs/2003.12957
2024-06-26, RealTalk: Real-time and Realistic Audio-driven Face Generation with 3D Facial Prior-guided Identity Alignment Network, Xiaozhong Ji et.al., Paper: http://arxiv.org/abs/2406.18284
2024-03-23, Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis, Zhenhui Ye et.al., Paper: http://arxiv.org/abs/2401.08503, Code: https://github.com/yerfor/Real3DPortrait
2021-03-05, Real-time RGBD-based Extended Body Pose Estimation, Renat Bashirov et.al., Paper: http://arxiv.org/abs/2103.03663, Code: https://github.com/rmbashirov/rgbd-kinect-pose
2024-12-18, Real-time One-Step Diffusion-based Expressive Portrait Videos Generation, Hanzhong Guo et.al., Paper: http://arxiv.org/abs/2412.13479, Code: https://github.com/Guohanzhong/OSA-LCM
2022-11-22, Real-time Neural Radiance Talking Portrait Synthesis via Audio-spatial Decomposition, Jiaxiang Tang et.al., Paper: http://arxiv.org/abs/2211.12368
2024-10-24, Real-time 3D-aware Portrait Video Relighting, Ziqi Cai et.al., Paper: http://arxiv.org/abs/2410.18355, Code: https://github.com/GhostCai/PortraitRelighting
2019-10-19, Real-Time Lip Sync for Live 2D Animation, Deepali Aneja et.al., Paper: http://arxiv.org/abs/1910.08685, Code: https://github.com/deepalianeja/CharacterLipSync2D
2020-08-04, Real-Time Cleaning and Refinement of Facial Animation Signals, Eloïse Berson et.al., Paper: http://arxiv.org/abs/2008.01332
2024-07-12, Real Face Video Animation Platform, Xiaokai Chen et.al., Paper: http://arxiv.org/abs/2407.18955
2024-08-06, ReSyncer: Rewiring Style-based Generator for Unified Audio-Visually Synced Facial Performer, Jiazhi Guan et.al., Paper: http://arxiv.org/abs/2408.03284
2010-03-01, Re-verification of a Lip Synchronization Protocol using Robust Reachability, Piotr Kordy et.al., Paper: http://arxiv.org/abs/1003.0431
2024-06-18, RITA: A Real-time Interactive Talking Avatars Framework, Wuxinlin Cheng et.al., Paper: http://arxiv.org/abs/2406.13093
2025-01-06, RDD4D: 4D Attention-Guided Road Damage Detection And Classification, Asma Alkalbani et.al., Paper: http://arxiv.org/abs/2501.02822, Code: https://github.com/msaqib17/road_damage_detection
2023-11-06, RADIO: Reference-Agnostic Dubbing Video Synthesis, Dongyeun Lee et.al., Paper: http://arxiv.org/abs/2309.01950
2023-12-09, R2-Talker: Realistic Real-Time Talking Head Synthesis with Hash Grid Landmarks Encoding and Progressive Multilayer Conditioning, Zhiling Ye et.al., Paper: http://arxiv.org/abs/2312.05572
2019-08-20, Prosodic Phrase Alignment for Machine Dubbing, Alp Öktem et.al., Paper: http://arxiv.org/abs/1908.07226, Code: https://github.com/alpoktem/MachineDub
2022-11-26, Progressive Disentangled Representation Learning for Fine-Grained Controllable Talking Head Synthesis, Duomin Wang et.al., Paper: http://arxiv.org/abs/2211.14506, Code: https://github.com/Dorniwang/PD-FGC-inference
2012-01-19, Progress in animation of an EMA-controlled tongue model for acoustic-visual speech synthesis, Ingmar Steiner et.al., Paper: http://arxiv.org/abs/1201.4080
2024-09-25, ProbTalk3D: Non-Deterministic Emotion Controllable Speech-Driven 3D Facial Animation Synthesis Using VQ-VAE, Sichun Wu et.al., Paper: http://arxiv.org/abs/2409.07966, Code: https://github.com/uuembodiedsocialai/probtalk3d
2023-07-09, Predictive Coding For Animation-Based Video Compression, Goluck Konuko et.al., Paper: http://arxiv.org/abs/2307.04187
2022-10-13, Pre-Avatar: An Automatic Presentation Generation Framework Leveraging Talking Avatar, Aolan Sun et.al., Paper: http://arxiv.org/abs/2210.06877
2024-09-04, PoseTalk: Text-and-Audio-based Pose Control and Motion Refinement for One-Shot Talking Head Generation, Jun Ling et.al., Paper: http://arxiv.org/abs/2409.02657
2021-04-22, Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation, Hang Zhou et.al., Paper: http://arxiv.org/abs/2104.11116, Code: https://github.com/Hangz-nju-cuhk/Talking-Face_PC-AVS
2023-02-24, Pose-Controllable 3D Facial Animation Synthesis using Hierarchical Audio-Vertex Attention, Bin Liu et.al., Paper: http://arxiv.org/abs/2302.12532
2024-12-10, PortraitTalk: Towards Customizable One-Shot Audio-to-Talking Face Generation, Fatemeh Nazarieh et.al., Paper: http://arxiv.org/abs/2412.07754
2024-05-14, PolyGlotFake: A Novel Multilingual and Multimodal DeepFake Dataset, Yang Hou et.al., Paper: http://arxiv.org/abs/2405.08838, Code: https://github.com/tobuta/PolyGlotFake
2024-12-11, PointTalk: Audio-Driven Dynamic Lip Point Cloud for 3D Gaussian-based Talking Head Synthesis, Yifan Xie et.al., Paper: http://arxiv.org/abs/2412.08504
2007-08-28, Plate-forme Magicien d'Oz pour l'étude de l'apport des ACAs à l'interaction, Jérôme Simonin et.al., Paper: http://arxiv.org/abs/0708.3740
2023-10-25, Personalized Speech-driven Expressive 3D Facial Animation Synthesis with Style Control, Elif Bozkurt et.al., Paper: http://arxiv.org/abs/2310.17011
2024-09-09, PersonaTalk: Bring Attention to Your Persona in Visual Dubbing, Longhao Zhang et.al., Paper: http://arxiv.org/abs/2409.05379
2022-08-02, Perceptual Conversational Head Generation with Regularized Driver and Enhanced Renderer, Ailin Huang et.al., Paper: http://arxiv.org/abs/2206.12837, Code: https://github.com/megvii-research/MM2022-ViCoPerceptualHeadGeneration
2024-11-26, Passive Deepfake Detection Across Multi-modalities: A Comprehensive Survey, Hong-Hanh Nguyen-Le et.al., Paper: http://arxiv.org/abs/2411.17911
2023-06-13, Parametric Implicit Face Representation for Audio-Driven Facial Reenactment, Ricong Huang et.al., Paper: http://arxiv.org/abs/2306.07579
2021-12-20, Parallel and High-Fidelity Text-to-Lip Generation, Jinglin Liu et.al., Paper: http://arxiv.org/abs/2107.06831, Code: https://github.com/Dianezzy/ParaLip
2023-08-29, Papeos: Augmenting Research Papers with Talk Videos, Tae Soo Kim et.al., Paper: http://arxiv.org/abs/2308.15224
2023-03-23, PanoHead: Geometry-Aware 3D Full-Head Synthesis in 360°, Sizhe An et.al., Paper: http://arxiv.org/abs/2303.13071
2023-12-05, PMMTalk: Speech-Driven 3D Facial Animation from Complementary Pseudo Multi-modal Features, Tianshun Han et.al., Paper: http://arxiv.org/abs/2312.02781
2023-09-13, PIAVE: A Pose-Invariant Audio-Visual Speaker Extraction Network, Qinghua Liu et.al., Paper: http://arxiv.org/abs/2309.06723
2024-07-22, PAV: Personalized Head Avatar from Unstructured Video Collection, Akin Caliskan et.al., Paper: http://arxiv.org/abs/2407.21047
2024-05-28, OpFlowTalker: Realistic and Natural Talking Face Generation via Optical Flow Guidance, Shuheng Ge et.al., Paper: http://arxiv.org/abs/2405.14709
2021-12-06, One-shot Talking Face Generation from Single-speaker Audio-Visual Correlation Learning, Suzhen Wang et.al., Paper: http://arxiv.org/abs/2112.02749
2024-02-05, One-shot Neural Face Reenactment via Finding Directions in GAN's Latent Space, Stella Bounareli et.al., Paper: http://arxiv.org/abs/2402.03553
2021-04-26, One-shot Face Reenactment Using Appearance Adaptive Normalization, Guangming Yao et.al., Paper: http://arxiv.org/abs/2102.03984
2019-08-05, One-shot Face Reenactment, Yunxuan Zhang et.al., Paper: http://arxiv.org/abs/1908.03251, Code: https://github.com/bj80heyue/Learning_One_Shot_Face_Reenactment
2024-07-12, One-Shot Pose-Driving Face Animation Platform, He Feng et.al., Paper: http://arxiv.org/abs/2407.08949
2023-04-11, One-Shot High-Fidelity Talking-Head Synthesis with Deformable Neural Radiance Field, Weichuang Li et.al., Paper: http://arxiv.org/abs/2304.05097
2021-04-02, One-Shot Free-View Neural Talking-Head Synthesis for Video Conferencing, Ting-Chun Wang et.al., Paper: http://arxiv.org/abs/2011.15126
2022-05-26, One-Shot Face Reenactment on Megapixels, Wonjun Kang et.al., Paper: http://arxiv.org/abs/2205.13368
2024-12-02, One Shot, One Talk: Whole-body Talking Avatar from a Single Image, Jun Xiang et.al., Paper: http://arxiv.org/abs/2412.01106
2021-02-19, One Shot Audio to Animated Video Generation, Neeraj Kumar et.al., Paper: http://arxiv.org/abs/2102.09737
2022-11-10, On the role of Lip Articulation in Visual Speech Perception, Zakaria Aldeneh et.al., Paper: http://arxiv.org/abs/2203.10117
2023-10-29, On the Vulnerability of DeepFake Detectors to Attacks Generated by Denoising Diffusion Models, Marija Ivanovska et.al., Paper: http://arxiv.org/abs/2307.05397
2023-03-27, OmniAvatar: Geometry-Guided Controllable 3D Head Synthesis, Hongyi Xu et.al., Paper: http://arxiv.org/abs/2303.15539
2017-12-06, ObamaNet: Photo-realistic lip-sync from text, Rithesh Kumar et.al., Paper: http://arxiv.org/abs/1801.01442
2023-03-26, OTAvatar: One-shot Talking Face Avatar with Controllable Tri-plane Rendering, Zhiyuan Ma et.al., Paper: http://arxiv.org/abs/2303.14662, Code: https://github.com/theericma/otavatar
2023-09-28, OSM-Net: One-to-Many One-shot Talking Head Generation with Spontaneous Head Motions, Jin Liu et.al., Paper: http://arxiv.org/abs/2309.16148
2023-02-16, OPT: One-shot Pose-Controllable Talking Head Generation, Jin Liu et.al., Paper: http://arxiv.org/abs/2302.08197
2023-07-19, OPHAvatars: One-shot Photo-realistic Head Avatars, Shaoxu Li et.al., Paper: http://arxiv.org/abs/2307.09153, Code: https://github.com/lsx0101/ophavatars
2021-03-20, Not made for each other- Audio-Visual Dissonance-based Deepfake Detection and Localization, Komal Chugh et.al., Paper: http://arxiv.org/abs/2005.14405, Code: https://github.com/abhinavdhall/deepfake
2023-01-20, Neural Volumetric Blendshapes: Computationally Efficient Physics-Based Facial Blendshapes, Nicolas Wagner et.al., Paper: http://arxiv.org/abs/2212.14784
2020-07-29, Neural Voice Puppetry: Audio-driven Facial Reenactment, Justus Thies et.al., Paper: http://arxiv.org/abs/1912.05566, Code: https://github.com/miu200521358/NeuralVoicePuppetryMMD
2023-12-11, Neural Text to Articulate Talk: Deep Text to Audiovisual Speech Synthesis achieving both Auditory and Photo-realism, Georgios Milis et.al., Paper: http://arxiv.org/abs/2312.06613, Code: https://github.com/g-milis/NEUTART
2019-09-06, Neural Style-Preserving Visual Dubbing, Hyeongwoo Kim et.al., Paper: http://arxiv.org/abs/1909.02518
2023-08-10, Near-realtime Facial Animation by Deep 3D Simulation Super-Resolution, Hyojoon Park et.al., Paper: http://arxiv.org/abs/2305.03216
2024-05-10, NeRFFaceSpeech: One-shot Audio-driven 3D Talking Head Synthesis via Generative Prior, Gihoon Kim et.al., Paper: http://arxiv.org/abs/2405.05749
2024-01-23, NeRF-AD: Neural Radiance Field with Attention-based Disentanglement for Talking Face Synthesis, Chongke Bi et.al., Paper: http://arxiv.org/abs/2401.12568
2023-06-12, NPVForensics: Jointing Non-critical Phonemes and Visemes for Deepfake Detection, Yu Chen et.al., Paper: http://arxiv.org/abs/2306.06885
2024-06-17, NLDF: Neural Light Dynamic Fields for Efficient 3D Talking Head Generation, Niu Guanchen et.al., Paper: http://arxiv.org/abs/2406.11259
2022-07-20, NARRATE: A Normal Assisted Free-View Portrait Stylizer, Youjia Wang et.al., Paper: http://arxiv.org/abs/2207.00974
2023-12-05, MyPortrait: Morphable Prior-Guided Personalized Portrait Generation, Bo Ding et.al., Paper: http://arxiv.org/abs/2312.02703
2024-10-16, MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting, Yue Zhang et.al., Paper: http://arxiv.org/abs/2410.10122, Code: https://github.com/tmelyralab/musetalk
2024-05-31, MunchSonic: Tracking Fine-grained Dietary Actions through Active Acoustic Sensing on Eyeglasses, Saif Mahmud et.al., Paper: http://arxiv.org/abs/2405.21004
2023-05-09, Multimodal-driven Talking Face Generation via a Unified Diffusion-based Generator, Chao Xu et.al., Paper: http://arxiv.org/abs/2305.02594
2024-10-29, Multimodal Semantic Communication for Generative Audio-Driven Video Conferencing, Haonan Tong et.al., Paper: http://arxiv.org/abs/2410.22112
2017-07-21, Multichannel Attention Network for Analyzing Visual Behavior in Public Speaking, Rahul Sharma et.al., Paper: http://arxiv.org/abs/1707.06830
2024-06-20, MultiTalk: Enhancing 3D Talking Head Generation Across Languages with Multilingual Video Dataset, Kim Sung-Bin et.al., Paper: http://arxiv.org/abs/2406.14272
2022-03-04, Multi-modality Deep Restoration of Extremely Compressed Face Videos, Xi Zhang et.al., Paper: http://arxiv.org/abs/2107.05548
2020-12-14, Multi Modal Adaptive Normalization for Audio to Video Generation, Neeraj Kumar et.al., Paper: http://arxiv.org/abs/2012.07304
2020-05-27, Modality Dropout for Improved Performance-driven Talking Faces, Ahmed Hussen Abdelaziz et.al., Paper: http://arxiv.org/abs/2005.13616
2024-07-08, MobilePortrait: Real-Time One-Shot Neural Head Avatars on Mobile Devices, Jianwen Jiang et.al., Paper: http://arxiv.org/abs/2407.05712
2025-01-09, MoEE: Mixture of Emotion Experts for Audio-Driven Portrait Animation, Huaize Liu et.al., Paper: http://arxiv.org/abs/2501.01808
2024-03-28, MoDiTalker: Motion-Disentangled Diffusion Model for High-Fidelity Talking Head Generation, Seyeon Kim et.al., Paper: http://arxiv.org/abs/2403.19144, Code: https://github.com/KU-CVLAB/MoDiTalker
2024-10-15, MimicTalk: Mimicking a personalized and expressive 3D talking face in minutes, Zhenhui Ye et.al., Paper: http://arxiv.org/abs/2410.06734
2023-12-18, Mimic: Speaking Style Disentanglement for Speech-Driven 3D Facial Animation, Hui Fu et.al., Paper: http://arxiv.org/abs/2312.10877
2024-08-28, Micro and macro facial expressions by driven animations in realistic Virtual Humans, Rubens Halbig Montanha et.al., Paper: http://arxiv.org/abs/2408.16110
2024-05-22, Metabook: An Automatically Generated Augmented Reality Storybook Interaction System to Improve Children's Engagement in Storytelling, Yibo Wang et.al., Paper: http://arxiv.org/abs/2405.13701
2023-03-27, MetaPortrait: Identity-Preserving Talking Head Generation with Fast Personalized Adaptation, Bowen Zhang et.al., Paper: http://arxiv.org/abs/2212.08062, Code: https://github.com/Meta-Portrait/MetaPortrait
2024-08-18, Meta-Learning Empowered Meta-Face: Personalized Speaking Style Adaptation for Audio-Driven 3D Talking Face Animation, Xukun Zhou et.al., Paper: http://arxiv.org/abs/2408.09357
2022-05-20, MeshTalk: 3D Face Animation from Speech using Cross-Modality Disentanglement, Alexander Richard et.al., Paper: http://arxiv.org/abs/2104.08223, Code: https://github.com/facebookresearch/meshtalk
2020-09-18, Mesh Guided One-shot Face Reenactment using Graph Convolutional Networks, Guangming Yao et.al., Paper: http://arxiv.org/abs/2008.07783
2022-05-24, Merkel Podcast Corpus: A Multimodal Dataset Compiled from 16 Years of Angela Merkel's Weekly Video Podcasts, Debjoy Saha et.al., Paper: http://arxiv.org/abs/2205.12194, Code: https://github.com/deeplsd/merkel-podcast-corpus
2023-11-20, MemoryCompanion: A Smart Healthcare Solution to Empower Efficient Alzheimer's Care Via Unleashing Generative AI, Lifei Zheng et.al., Paper: http://arxiv.org/abs/2311.14730
2023-02-27, Memory-augmented Contrastive Learning for Talking Head Generation, Jianrong Wang et.al., Paper: http://arxiv.org/abs/2302.13469, Code: https://github.com/yaxinzhao97/macl
2024-03-05, Memories are One-to-Many Mapping Alleviators in Talking Face Generation, Anni Tang et.al., Paper: http://arxiv.org/abs/2212.05005
2024-05-31, MegActor: Harness the Power of Raw Video for Vivid Portrait Animation, Shurong Yang et.al., Paper: http://arxiv.org/abs/2405.20851, Code: https://github.com/megvii-research/megfaceanimate
2024-08-27, MegActor-Σ: Unlocking Flexible Mixed-Modal Control in Portrait Animation with Diffusion Transformer, Shurong Yang et.al., Paper: http://arxiv.org/abs/2408.14975
2024-01-30, Media2Face: Co-speech Facial Animation Generation With Multi-Modality Guidance, Qingcheng Zhao et.al., Paper: http://arxiv.org/abs/2401.15687
2022-12-09, Masked Lip-Sync Prediction by Audio-Visual Contextual Exploitation in Transformers, Yasheng Sun et.al., Paper: http://arxiv.org/abs/2212.04970
2023-09-10, MaskRenderer: 3D-Infused Multi-Mask Realistic Face Reenactment, Tina Behrouzi et.al., Paper: http://arxiv.org/abs/2309.05095
2019-11-19, MarioNETte: Few-shot Face Reenactment Preserving Identity of Unseen Targets, Sungjoo Ha et.al., Paper: http://arxiv.org/abs/1911.08139
2021-02-25, MakeItTalk: Speaker-Aware Talking-Head Animation, Yang Zhou et.al., Paper: http://arxiv.org/abs/2004.12992
2024-03-25, Make-Your-Anchor: A Diffusion-based 2D Avatar Generation Framework, Ziyao Huang et.al., Paper: http://arxiv.org/abs/2403.16510, Code: https://github.com/ictmcg/make-your-anchor
2025-01-15, Make-A-Character 2: Animatable 3D Character Generation From a Single Image, Lin Liu et.al., Paper: http://arxiv.org/abs/2501.07870
2024-06-17, Make Your Actor Talk: Generalizable and High-Fidelity Lip Sync with Motion and Appearance Disentanglement, Runyi Yu et.al., Paper: http://arxiv.org/abs/2406.08096
2023-07-19, MODA: Mapping-Once Audio-driven Portrait Animation with Dual Attentions, Yunfei Liu et.al., Paper: http://arxiv.org/abs/2307.10008
2024-10-10, MMHead: Towards Fine-grained Multi-modal 3D Facial Animation, Sijing Wu et.al., Paper: http://arxiv.org/abs/2410.07757
2023-12-13, MMFace4D: A Large-Scale Multi-Modal 4D Face Dataset for Audio-Driven 3D Face Animation, Haozhe Wu et.al., Paper: http://arxiv.org/abs/2303.09797
2024-01-31, MM-TTS: Multi-modal Prompt based Style Transfer for Expressive Text-to-Speech Synthesis, Wenhao Guan et.al., Paper: http://arxiv.org/abs/2312.10687
2024-09-23, MIMAFace: Face Animation via Motion-Identity Modulated Appearance Feature Learning, Yue Han et.al., Paper: http://arxiv.org/abs/2409.15179
2024-04-03, MI-NeRF: Learning a Single Face NeRF from Multiple Identities, Aggelina Chatziagapi et.al., Paper: http://arxiv.org/abs/2403.19920
2024-12-05, MEMO: Memory-Guided Diffusion for Expressive Talking Video Generation, Longtao Zheng et.al., Paper: http://arxiv.org/abs/2412.04448
2023-03-22, MARLIN: Masked Autoencoder for facial video Representation LearnINg, Zhixi Cai et.al., Paper: http://arxiv.org/abs/2211.06627, Code: https://github.com/ControlNet/MARLIN
2024-11-29, LokiTalk: Learning Fine-Grained and Generalizable Correspondences to Enhance NeRF-based Talking Head Synthesis, Tianqi Li et.al., Paper: http://arxiv.org/abs/2411.19525
2024-07-03, LivePortrait: Efficient Portrait Animation with Stitching and Retargeting Control, Jianzhu Guo et.al., Paper: http://arxiv.org/abs/2407.03168, Code: https://github.com/KwaiVGI/LivePortrait
2021-09-24, Live Speech Portraits: Real-Time Photorealistic Talking-Head Animation, Yuanxun Lu et.al., Paper: http://arxiv.org/abs/2109.10595
2024-05-12, Listen, Disentangle, and Control: Controllable Speech-Driven Talking Head Generation, Changpeng Cai et.al., Paper: http://arxiv.org/abs/2405.07257
2024-01-28, Lips Are Lying: Spotting the Temporal Inconsistency between Audio and Visual in Lip-Syncing DeepFakes, Weifeng Liu et.al., Paper: http://arxiv.org/abs/2401.15668, Code: https://github.com/aaroncomo/lipfd
2021-06-08, LipSync3D: Data-Efficient Learning of Personalized 3D Talking Faces from Video using Pose and Lighting Normalization, Avisek Lahiri et.al., Paper: http://arxiv.org/abs/2106.04185
2017-01-30, Lip Reading Sentences in the Wild, Joon Son Chung et.al., Paper: http://arxiv.org/abs/1611.05358, Code: https://github.com/parambadiger/Lip-Reading
2024-07-26, LinguaLinker: Audio-Driven Portraits Animation with Implicit Facial Control Enhancement, Rui Zhang et.al., Paper: http://arxiv.org/abs/2407.18595
2022-10-21, Leveraging Real Talking Faces via Self-Supervision for Robust Forgery Detection, Alexandros Haliassos et.al., Paper: http://arxiv.org/abs/2201.07131, Code: https://github.com/ahaliassos/RealForensics
2024-11-24, LetsTalk: Latent Diffusion Transformer for Talking Video Synthesis, Haojie Zhang et.al., Paper: http://arxiv.org/abs/2411.16748
2024-04-02, Learning to Generate Conditional Tri-plane for 3D-aware Expression Controllable Portrait Animation, Taekyung Ki et.al., Paper: http://arxiv.org/abs/2404.00636
2024-12-18, Learning to Control an Android Robot Head for Facial Animation, Marcel Heisler et.al., Paper: http://arxiv.org/abs/2412.13641
2024-02-29, Learning a Generalized Physical Face Model From Data, Lingchen Yang et.al., Paper: http://arxiv.org/abs/2402.19477
2020-07-08, Learning Speech Representations from Raw Audio by Joint Audiovisual Self-Supervision, Abhinav Shukla et.al., Paper: http://arxiv.org/abs/2007.04134
2023-11-03, Learning Separable Hidden Unit Contributions for Speaker-Adaptive Lip-Reading, Songtao Luo et.al., Paper: http://arxiv.org/abs/2310.05058, Code: https://github.com/jinchiniao/LSHUC
2024-07-13, Learning Online Scale Transformation for Talking Head Video Generation, Fa-Ting Hong et.al., Paper: http://arxiv.org/abs/2407.09965
2023-11-30, Learning One-Shot 4D Head Avatar Synthesis using Synthetic Data, Yu Deng et.al., Paper: http://arxiv.org/abs/2311.18729
2023-07-26, Learning Landmarks Motion from Speech for Speaker-Agnostic 3D Talking Heads Generation, Federico Nocentini et.al., Paper: http://arxiv.org/abs/2306.01415, Code: https://github.com/fedenoce/s2l-s2d
2024-09-29, Learning Frame-Wise Emotion Intensity for Audio-Driven Talking-Head Generation, Jingyi Xu et.al., Paper: http://arxiv.org/abs/2409.19501
2024-02-27, Learning Dynamic Tetrahedra for High-Quality Talking Head Synthesis, Zicheng Zhang et.al., Paper: http://arxiv.org/abs/2402.17364, Code: https://github.com/zhangzc21/dyntet
2022-07-24, Learning Dynamic Facial Radiance Fields for Few-Shot Talking Head Synthesis, Shuai Shen et.al., Paper: http://arxiv.org/abs/2207.11770, Code: https://github.com/sstzal/DFRF
2023-12-19, Learning Dense Correspondence for NeRF-Based Face Reenactment, Songlin Yang et.al., Paper: http://arxiv.org/abs/2312.10422
2023-01-15, Learning Audio-Driven Viseme Dynamics for 3D Face Animation, Linchao Bao et.al., Paper: http://arxiv.org/abs/2301.06059
2021-04-29, Learned Spatial Representations for Few-shot Talking-Head Synthesis, Moustafa Meshry et.al., Paper: http://arxiv.org/abs/2104.14557
2018-07-26, Learnable PINs: Cross-Modal Embeddings for Person Identity, Arsha Nagrani et.al., Paper: http://arxiv.org/abs/1805.00833
2024-04-19, Learn2Talk: 3D Talking Face Learns from 2D Talking Face, Yixiang Zhuang et.al., Paper: http://arxiv.org/abs/2404.12888
2024-03-22, LeGO: Leveraging a Surface Deformation Network for Animatable Stylized Face Generation with One Example, Soyeon Yoon et.al., Paper: http://arxiv.org/abs/2403.15227, Code: https://github.com/kwanyun/LeGO_code
2023-08-30, Laughing Matters: Introducing Laughing-Face Generation using Diffusion Models, Antoni Bigata Casademunt et.al., Paper: http://arxiv.org/abs/2305.08854, Code: https://github.com/antonibigata/Laughing-Matters
2023-11-02, LaughTalk: Expressive 3D Talking Head Generation with Laughter, Kim Sung-Bin et.al., Paper: http://arxiv.org/abs/2311.00994
2024-12-12, LatentSync: Audio Conditioned Latent Diffusion Models for Lip Sync, Chunyu Li et.al., Paper: http://arxiv.org/abs/2412.09262, Code: https://github.com/bytedance/LatentSync
2020-11-06, Large-scale multilingual audio visual dubbing, Yi Yang et.al., Paper: http://arxiv.org/abs/2011.03530
2016-07-11, Large-Scale MIMO is Capable of Eliminating Power-Thirsty Channel Coding for Wireless Transmission of HEVC/H.265 Video, Shaoshi Yang et.al., Paper: http://arxiv.org/abs/1601.06684
2024-11-06, Large Generative Model-assisted Talking-face Semantic Communication System, Feibo Jiang et.al., Paper: http://arxiv.org/abs/2411.03876
2024-08-03, Landmark-guided Diffusion Model for High-fidelity and Temporally Coherent Talking Head Generation, Jintao Tan et.al., Paper: http://arxiv.org/abs/2408.01732
2024-10-01, LaDTalk: Latent Denoising for Synthesizing Talking Head Videos with High Frequency Details, Jian Yang et.al., Paper: http://arxiv.org/abs/2410.00990
2023-05-17, LPMM: Intuitive Pose Control for Neural Talking-Head Model via Landmark-Parameter Morphable Model, Kwangho Lee et.al., Paper: http://arxiv.org/abs/2305.10456
2021-04-07, LI-Net: Large-Pose Identity-Preserving Face Reenactment Network, Jin Liu et.al., Paper: http://arxiv.org/abs/2104.02850
2024-11-14, LES-Talker: Fine-Grained Emotion Editing for Talking Head Generation in Linear Emotion Space, Guanwen Feng et.al., Paper: http://arxiv.org/abs/2411.09268
2021-08-23, KoDF: A Large-scale Korean DeepFake Detection Dataset, Patrick Kwon et.al., Paper: http://arxiv.org/abs/2103.10094
2017-07-30, Kernel Projection of Latent Structures Regression for Facial Animation Retargeting, Christos Ouzounis et.al., Paper: http://arxiv.org/abs/1707.09629
2024-09-02, KMTalk: Speech-Driven 3D Facial Animation with Key Motion Embedding, Zhihao Xu et.al., Paper: http://arxiv.org/abs/2409.01113, Code: https://github.com/ffxzh/kmtalk
2024-09-09, KAN-Based Fusion of Dual-Domain for Audio-Driven Facial Landmarks Generation, Hoang-Son Vo-Thanh et.al., Paper: http://arxiv.org/abs/2409.05330, Code: https://github.com/sowwnn/KFusion-Dual-Domain-for-Speech-to-Landmarks
2024-01-11, Jump Cut Smoothing for Talking Heads, Xiaojuan Wang et.al., Paper: http://arxiv.org/abs/2401.04718
2024-11-20, JoyVASA: Portrait and Animal Image Animation with Diffusion-Based Audio-Driven Facial Dynamics and Head Motion Generation, Xuyang Cao et.al., Paper: http://arxiv.org/abs/2411.09209, Code: https://github.com/jdh-algo/JoyVASA
2025-01-03, JoyGen: Audio-Driven 3D Depth-Aware Talking-Face Video Editing, Qili Wang et.al., Paper: http://arxiv.org/abs/2501.01798, Code: https://github.com/JOY-MM/JoyGen
2024-10-21, Joker: Conditional 3D Head Synthesis with Extreme Facial Expressions, Malte Prinzler et.al., Paper: http://arxiv.org/abs/2410.16395
2025-01-15, Joint Learning of Depth and Appearance for Portrait Image Animation, Xinya Ji et.al., Paper: http://arxiv.org/abs/2501.08649
2024-12-18, Joint Co-Speech Gesture and Expressive Talking Face Generation using Diffusion with Adapters, Steven Hogue et.al., Paper: http://arxiv.org/abs/2412.14333, Code: https://github.com/ditzley/joint-gestures-and-face
2021-12-07, Joint Audio-Text Model for Expressive Speech-Driven 3D Facial Animation, Yingruo Fan et.al., Paper: http://arxiv.org/abs/2112.02214
2024-08-03, JambaTalk: Speech-Driven 3D Talking Head Generation Based on Hybrid Transformer-Mamba Model, Farzaneh Jafari et.al., Paper: http://arxiv.org/abs/2408.01627
2024-09-18, JEAN: Joint Expression and Audio-guided NeRF-based Talking Face Generation, Sai Tanmay Reddy Chakkera et.al., Paper: http://arxiv.org/abs/2409.12156
2021-10-22, Invertible Frowns: Video-to-Video Facial Emotion Translation, Ian Magnusson et.al., Paper: http://arxiv.org/abs/2109.08061
2020-10-12, Intuitive Facial Animation Editing Based On A Generative RNN Framework, Eloïse Berson et.al., Paper: http://arxiv.org/abs/2010.05655
2023-07-05, Interactive Conversational Head Generation, Mohan Zhou et.al., Paper: http://arxiv.org/abs/2307.02090
2021-10-16, Intelligent Video Editing: Incorporating Modern Talking Face Generation Algorithms in a Video Editor, Anchit Gupta et.al., Paper: http://arxiv.org/abs/2110.08580
2024-05-24, InstructAvatar: Text-Guided Emotion and Motion Control for Avatar Generation, Yuchi Wang et.al., Paper: http://arxiv.org/abs/2405.15758, Code: https://github.com/wangyuchi369/InstructAvatar
2023-06-05, Instruct-Video2Avatar: Video-to-Avatar Generation with Instructions, Shaoxu Li et.al., Paper: http://arxiv.org/abs/2306.02903, Code: https://github.com/lsx0101/instruct-video2avatar
2023-08-16, Instruct-NeuralTalker: Editing Audio-Driven Talking Radiance Fields with Instructions, Yuqi Sun et.al., Paper: http://arxiv.org/abs/2306.10813
2021-12-19, Initiative Defense against Facial Manipulation, Qidong Huang et.al., Paper: http://arxiv.org/abs/2112.10098, Code: https://github.com/shikiw/initiative-defense-for-deepfake
2018-11-16, Influence of visual cues on head and eye movements during listening tasks in multi-talker audiovisual environments with animated characters, Maartje M. E. Hendrikse et.al., Paper: http://arxiv.org/abs/1812.02088
2023-03-09, Improving Few-Shot Learning for Talking Face System with TTS Data Augmentation, Qi Chen et.al., Paper: http://arxiv.org/abs/2303.05322, Code: https://github.com/moon0316/t2a
2016-05-22, Improving Facial Analysis and Performance Driven Animation through Disentangling Identity and Expression, David Rim et.al., Paper: http://arxiv.org/abs/1512.08212
2024-01-26, Implicit Neural Representation for Physics-driven Actuated Soft Bodies, Lingchen Yang et.al., Paper: http://arxiv.org/abs/2401.14861
2023-04-21, Implicit Neural Head Synthesis via Controllable Local Deformation Fields, Chuhan Chen et.al., Paper: http://arxiv.org/abs/2304.11113
2023-08-18, Implicit Identity Representation Conditioned Memory Compensation Network for Talking Head video Generation, Fa-Ting Hong et.al., Paper: http://arxiv.org/abs/2307.09906, Code: https://github.com/harlanhong/iccv2023-mcnet
2022-12-30, Imitator: Personalized Speech-driven 3D Facial Animation, Balamurugan Thambiraja et.al., Paper: http://arxiv.org/abs/2301.00023
2021-10-30, Imitating Arbitrary Talking Style for Realistic Audio-Driven Talking Face Synthesis, Haozhe Wu et.al., Paper: http://arxiv.org/abs/2111.00203, Code: https://github.com/wuhaozhe/style_avatar
2025-01-09, Identity-Preserving Video Dubbing Using Motion Warping, Runzhen Liu et.al., Paper: http://arxiv.org/abs/2501.04586
2023-05-15, Identity-Preserving Talking Face Generation with Landmark and Appearance Priors, Weizhi Zhong et.al., Paper: http://arxiv.org/abs/2305.08293, Code: https://github.com/Weizhi-Zhong/IP_LAP
2020-05-25, Identity-Preserving Realistic Talking Face Generation, Sanjana Sinha et.al., Paper: http://arxiv.org/abs/2005.12318
2023-05-17, INCLG: Inpainting for Non-Cleft Lip Generation with a Multi-Task Image Processing Network, Shuang Chen et.al., Paper: http://arxiv.org/abs/2305.10589
2024-12-10, IF-MDM: Implicit Face Motion Diffusion Model for High-Fidelity Realtime Talking Head Generation, Sejong Yang et.al., Paper: http://arxiv.org/abs/2412.04000
2020-01-17, ICface: Interpretable and Controllable Face Reenactment Using GANs, Soumya Tripathy et.al., Paper: http://arxiv.org/abs/1904.01909
2023-07-20, HyperReenact: One-Shot Reenactment via Jointly Learning to Refine and Retarget Faces, Stella Bounareli et.al., Paper: http://arxiv.org/abs/2307.10797, Code: https://github.com/stelabou/hyperreenact
2023-10-15, HyperLips: Hyper Control Lips with High Resolution Decoder for Talking Face Generation, Yaosen Chen et.al., Paper: http://arxiv.org/abs/2310.05720, Code: https://github.com/semchan/HyperLips
2024-08-10, High-fidelity and Lip-synced Talking Face Synthesis via Landmark-based Diffusion Model, Weizhi Zhong et.al., Paper: http://arxiv.org/abs/2408.05416
2023-05-31, High-fidelity Generalized Emotional Talking Face Generation with Multi-modal Emotion Space Learning, Chao Xu et.al., Paper: http://arxiv.org/abs/2305.02572
2023-03-04, High-fidelity Facial Avatar Reconstruction from Monocular Video with Generative Priors, Yunpeng Bai et.al., Paper: http://arxiv.org/abs/2211.15064
2023-11-02, High-Fidelity and Freely Controllable Talking Head Video Generation, Yue Gao et.al., Paper: http://arxiv.org/abs/2304.10168
2020-03-26, High-Accuracy Facial Depth Models derived from 3D Synthetic Data, Faisal Khan et.al., Paper: http://arxiv.org/abs/2003.06211
2023-07-19, Hierarchical Semantic Perceptual Listener Head Video Generation: A High-performance Pipeline, Zhigang Chang et.al., Paper: http://arxiv.org/abs/2307.09821
2019-05-09, Hierarchical Cross-Modal Talking Face Generation with Dynamic Pixel-Wise Loss, Lele Chen et.al., Paper: http://arxiv.org/abs/1905.03820, Code: https://github.com/lelechen63/ATVGnet
2021-08-23, HeadGAN: One-shot Neural Head Synthesis and Editing, Michail Christos Doukas et.al., Paper: http://arxiv.org/abs/2012.08261
2020-05-22, Head2Head: Video-based Neural Head Synthesis, Mohammad Rami Koujan et.al., Paper: http://arxiv.org/abs/2005.10954
2024-06-16, Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation, Mingwang Xu et.al., Paper: http://arxiv.org/abs/2406.08801
2024-12-05, Hallo3: Highly Dynamic and Realistic Portrait Image Animation with Diffusion Transformer Networks, Jiahao Cui et.al., Paper: http://arxiv.org/abs/2412.00733, Code: https://github.com/fudan-generative-vision/hallo3
2023-09-14, HDTR-Net: A Real-Time High-Definition Teeth Restoration Network for Arbitrary Talking Face Generation Methods, Yongyuan Li et.al., Paper: http://arxiv.org/abs/2309.07495, Code: https://github.com/yylgoodlucky/hdtr
2024-04-07, GvT: A Graph-based Vision Transformer with Talking-Heads Utilizing Sparsity, Trained from Scratch on Small Datasets, Dongjing Shan et.al., Paper: http://arxiv.org/abs/2404.04924
2024-12-13, GoHD: Gaze-oriented and Highly Disentangled Portrait Animation with Rhythmic Poses and Realistic Expression, Ziqi Zhou et.al., Paper: http://arxiv.org/abs/2412.09296, Code: https://github.com/Jia1018/GoHD
2023-10-08, GestSync: Determining who is speaking without a talking head, Sindhu B Hegde et.al., Paper: http://arxiv.org/abs/2310.05304, Code: https://github.com/Sindhu-Hegde/gestsync
2024-10-14, Generative Human Video Compression with Multi-granularity Temporal Trajectory Factorization, Shanzhi Yin et.al., Paper: http://arxiv.org/abs/2410.10171
2018-03-28, Generative Adversarial Talking Head: Bringing Portraits to Life with a Weakly Supervised Neural Network, Hai X. Pham et.al., Paper: http://arxiv.org/abs/1803.07716
2025-01-07, Generating and Detecting Various Types of Fake Image and Audio Content: A Review of Modern Deep Learning Technologies and Tools, Arash Dehghani et.al., Paper: http://arxiv.org/abs/2501.06227
2018-04-23, Generating Talking Face Landmarks from Speech, Sefik Emre Eskimez et.al., Paper: http://arxiv.org/abs/1803.09803
2024-12-26, Generating Editable Head Avatars with 3D Gaussian GANs, Guohao Li et.al., Paper: http://arxiv.org/abs/2412.19149, Code: https://github.com/liguohao96/egg3d
2023-07-04, Generating Animatable 3D Cartoon Faces from Single Portraits, Chuanyu Pan et.al., Paper: http://arxiv.org/abs/2307.01468
2023-01-31, GeneFace: Generalized and High-Fidelity Audio-Driven 3D Talking Face Synthesis, Zhenhui Ye et.al., Paper: http://arxiv.org/abs/2301.13430
2023-05-01, GeneFace++: Generalized and Stable Real-Time Audio-Driven 3D Talking Face Generation, Zhenhui Ye et.al., Paper: http://arxiv.org/abs/2305.00787
2023-10-19, Gemino: Practical and Robust Neural Compression for Video Conferencing, Vibhaalakshmi Sivaraman et.al., Paper: http://arxiv.org/abs/2209.10507
2024-04-28, GaussianTalker: Speaker-specific Talking Head Synthesis via 3D Gaussian Splatting, Hongyun Yu et.al., Paper: http://arxiv.org/abs/2404.14037
2024-04-25, GaussianTalker: Real-Time High-Fidelity Talking Head Synthesis with Audio-Driven 3D Gaussian Splatting, Kyusun Cho et.al., Paper: http://arxiv.org/abs/2404.16012, Code: https://github.com/ku-cvlab/gaussiantalker
2024-09-18, GaussianHeads: End-to-End Learning of Drivable Gaussian Head Avatars from Coarse-to-fine Representations, Kartik Teotia et.al., Paper: http://arxiv.org/abs/2409.11951
2023-12-19, Gaussian3Diff: 3D Gaussian Diffusion for 3D Full Head Synthesis and Editing, Yushi Lan et.al., Paper: http://arxiv.org/abs/2312.03763
2016-10-28, Galaxy gas as obscurer: II. Separating the galaxy-scale and nuclear obscurers of Active Galactic Nuclei, Johannes Buchner et.al., Paper: http://arxiv.org/abs/1610.09380, Code: https://github.com/JohannesBuchner/LightRayRider
2023-12-12, GSmoothFace: Generalized Smooth Talking Face Generation via Fine Grained 3D Face Guidance, Haiming Zhang et.al., Paper: http://arxiv.org/abs/2312.07385
2024-04-29, GSTalker: Real-time Audio-Driven Talking Face Generation via Deformable Gaussian Splatting, Bo Chen et.al., Paper: http://arxiv.org/abs/2404.19040
2024-03-28, GOTCHA: Real-Time Video Deepfake Detection via Challenge-Response, Govind Mittal et.al., Paper: http://arxiv.org/abs/2210.06186, Code: https://github.com/mittalgovind/GOTCHA-Deepfakes
2023-12-12, GMTalker: Gaussian Mixture based Emotional talking video Portraits, Yibo Xia et.al., Paper: http://arxiv.org/abs/2312.07669
2024-08-16, GLDiTalker: Speech-Driven 3D Facial Animation with Graph Latent Diffusion Transformer, Yihong Lin et.al., Paper: http://arxiv.org/abs/2408.01826
2024-12-18, GLCF: A Global-Local Multimodal Coherence Analysis Framework for Talking Face Generation Detection, Xiaocan Chen et.al., Paper: http://arxiv.org/abs/2412.13656
2018-08-28, GANimation: Anatomically-aware Facial Animation from a Single Image, Albert Pumarola et.al., Paper: http://arxiv.org/abs/1807.09251, Code: https://github.com/albertpumarola/GANimation
2024-03-14, GAIA: Zero-shot Talking Avatar Generation, Tianyu He et.al., Paper: http://arxiv.org/abs/2311.15230
2024-03-02, G4G: A Generic Framework for High Fidelity Talking Face Generation with Fine-grained Intra-modal Alignment, Juan Zhang et.al., Paper: http://arxiv.org/abs/2402.18122
2024-08-23, G3FA: Geometry-guided GAN for Face Animation, Alireza Javanmardi et.al., Paper: http://arxiv.org/abs/2408.13049
2023-08-30, From Pixels to Portraits: A Comprehensive Survey of Talking Head Generation Techniques and Applications, Shreyank N Gowda et.al., Paper: http://arxiv.org/abs/2308.16041
2024-01-07, Freetalker: Controllable Speech and Text-Driven Gesture Generation Based on Diffusion Models for Enhanced Speaker Naturalness, Sicheng Yang et.al., Paper: http://arxiv.org/abs/2401.03476
2024-10-09, FreeAvatar: Robust 3D Facial Animation Transfer by Learning an Expression Foundation Model, Feng Qiu et.al., Paper: http://arxiv.org/abs/2409.13180
2022-08-03, Free-HeadGAN: Neural Talking Head Synthesis with Explicit Gaze Control, Michail Christos Doukas et.al., Paper: http://arxiv.org/abs/2208.02210
2024-06-07, Follow-Your-Emoji: Fine-Controllable and Expressive Freestyle Portrait Animation, Yue Ma et.al., Paper: http://arxiv.org/abs/2406.01900
2024-03-12, FlowVQTalker: High-Quality Emotional Talking Face Generation through Normalizing Flow and Quantization, Shuai Tan et.al., Paper: http://arxiv.org/abs/2403.06375
2021-10-12, Fine-grained Identity Preserving Landmark Synthesis for Face Reenactment, Haichao Zhang et.al., Paper: http://arxiv.org/abs/2110.04708
2022-10-06, Finding Directions in GAN's Latent Space for Neural Face Reenactment, Stella Bounareli et.al., Paper: http://arxiv.org/abs/2202.00046, Code: https://github.com/stelabou/stylegan_directions_face_reenactment
2019-10-28, Few-shot Video-to-Video Synthesis, Ting-Chun Wang et.al., Paper: http://arxiv.org/abs/1910.12713
2019-09-25, Few-Shot Adversarial Learning of Realistic Neural Talking Head Models, Egor Zakharov et.al., Paper: http://arxiv.org/abs/1905.08233
2024-09-24, FastTalker: Jointly Generating Speech and Conversational Gestures from Text, Zixin Guo et.al., Paper: http://arxiv.org/abs/2409.16404
2022-07-13, FastLTS: Non-Autoregressive End-to-End Unconstrained Lip-to-Speech Synthesis, Yongqi Wang et.al., Paper: http://arxiv.org/abs/2207.03800
2024-01-19, Fast Registration of Photorealistic Avatars for VR Facial Animation, Chaitanya Patel et.al., Paper: http://arxiv.org/abs/2401.11002
2022-04-25, Fast Facial Landmark Detection and Applications: A Survey, Kostiantyn Khabarlak et.al., Paper: http://arxiv.org/abs/2101.10808
2017-07-26, Fast Deep Matting for Portrait Animation on Mobile Phone, Bingke Zhu et.al., Paper: http://arxiv.org/abs/1707.08289
2022-03-01, FakeAVCeleb: A Novel Audio-Video Multimodal Deepfake Dataset, Hasam Khalid et.al., Paper: http://arxiv.org/abs/2108.05080, Code: https://github.com/dash-lab/fakeavceleb
2022-09-29, Facial Landmark Predictions with Applications to Metaverse, Qiao Han et.al., Paper: http://arxiv.org/abs/2209.14698, Code: https://github.com/sweatybridge/text-to-anime
2020-11-02, Facial Keypoint Sequence Generation from Audio, Prateek Manocha et.al., Paper: http://arxiv.org/abs/2011.01114
2024-05-16, Faces that Speak: Jointly Synthesising Talking Face and Speech from Text, Youngjoon Jang et.al., Paper: http://arxiv.org/abs/2405.10272
2023-03-09, FaceXHuBERT: Text-less Speech-driven E(X)pressive 3D Facial Animation Synthesis Using Self-Supervised Speech Representation Learning, Kazi Injamamul Haque et.al., Paper: http://arxiv.org/abs/2303.05416, Code: https://github.com/galib360/facexhubert
2024-09-23, FaceVid-1K: A Large-Scale High-Quality Multiracial Human Face Video Dataset, Donglin Di et.al., Paper: http://arxiv.org/abs/2410.07151
2024-12-23, FaceLift: Single Image to 3D Head with View Generation and GS-LRM, Weijie Lyu et.al., Paper: http://arxiv.org/abs/2412.17812
2022-03-17, FaceFormer: Speech-Driven 3D Facial Animation with Transformers, Yingruo Fan et.al., Paper: http://arxiv.org/abs/2112.05329, Code: https://github.com/EvelynFan/FaceFormer
2023-09-20, FaceDiffuser: Speech-Driven 3D Facial Animation Synthesis Using Diffusion, Stefan Stan et.al., Paper: http://arxiv.org/abs/2309.11306, Code: https://github.com/uuembodiedsocialai/FaceDiffuser
2024-04-01, FaceChain-ImagineID: Freely Crafting High-Fidelity Diverse Talking Faces from Disentangled Audio, Chao Xu et.al., Paper: http://arxiv.org/abs/2403.01901, Code: https://github.com/modelscope/facechain
2022-06-09, Face-Dubbing++: Lip-Synchronous, Voice Preserving Translation of Videos, Alexander Waibel et.al., Paper: http://arxiv.org/abs/2206.04523
2012-03-30, Face Expression Recognition and Analysis: The State of the Art, Vinay Bettadapura et.al., Paper: http://arxiv.org/abs/1203.6722
2023-04-06, Face Animation with an Attribute-Guided Diffusion Model, Bohan Zeng et.al., Paper: http://arxiv.org/abs/2304.03199, Code: https://github.com/zengbohan0217/fadm
2024-05-21, Face Adapter for Pre-Trained Diffusion Models with Fine-Grained ID and Attribute Control, Yue Han et.al., Paper: http://arxiv.org/abs/2405.12970
2020-05-13, FaR-GAN for One-Shot Face Reenactment, Hanxiang Hao et.al., Paper: http://arxiv.org/abs/2005.06402
2023-07-08, FTFDNet: Learning to Detect Talking Face Video Manipulation with Tri-Modality Interaction, Ganglai Wang et.al., Paper: http://arxiv.org/abs/2307.03990
2023-12-09, FT2TF: First-Person Statement Text-To-Talking Face Generation, Xingjian Diao et.al., Paper: http://arxiv.org/abs/2312.05430
2024-04-15, FSRT: Facial Scene Representation Transformer for Face Reenactment from Factorized Appearance, Head-pose, and Facial Expression Features, Andre Rochow et.al., Paper: http://arxiv.org/abs/2404.09736
2022-02-25, FSGANv2: Improved Subject Agnostic Face Swapping and Reenactment, Yuval Nirkin et.al., Paper: http://arxiv.org/abs/2202.12972
2019-08-16, FSGAN: Subject Agnostic Face Swapping and Reenactment, Yuval Nirkin et.al., Paper: http://arxiv.org/abs/1908.05932, Code: https://github.com/YuvalNirkin/fsgan
2020-05-16, FReeNet: Multi-Identity Face Reenactment, Jiangning Zhang et.al., Paper: http://arxiv.org/abs/1905.11805
2023-03-31, FONT: Flow-guided One-shot Talking Head Generation with Natural Head Motions, Jin Liu et.al., Paper: http://arxiv.org/abs/2303.17789
2022-09-21, FNeVR: Neural Volume Rendering for Face Animation, Bohan Zeng et.al., Paper: http://arxiv.org/abs/2209.10340, Code: https://github.com/zengbohan0217/FNeVR
2019-11-21, FLNet: Landmark Driven Fetching and Learning Network for Faithful Talking Facial Animation Synthesis, Kuangxiao Gu et.al., Paper: http://arxiv.org/abs/1911.09224
2019-04-02, FEAFA: A Well-Annotated Dataset for Facial Expression Analysis and 3D Facial Animation, Yanfu Yan et.al., Paper: http://arxiv.org/abs/1904.01509
2021-11-04, FEAFA+: An Extended Well-Annotated Dataset for Facial Expression Analysis and 3D Facial Animation, Wei Gan et.al., Paper: http://arxiv.org/abs/2111.02751
2024-08-18, FD2Talk: Towards Generalized Talking Head Generation with Facial Decoupled Diffusion Model, Ziyu Yao et.al., Paper: http://arxiv.org/abs/2408.09384
2024-12-22, FADA: Fast Diffusion Avatar Synthesis with Mixed-Supervised Multi-CFG Distillation, Tianyun Zhong et.al., Paper: http://arxiv.org/abs/2412.16915
2023-07-18, FACTS: Facial Animation Creation using the Transfer of Styles, Jack Saunders et.al., Paper: http://arxiv.org/abs/2307.09480
2021-08-18, FACIAL: Synthesizing Dynamic Talking Face with Implicit Attribute Learning, Chenxu Zhang et.al., Paper: http://arxiv.org/abs/2108.07938, Code: https://github.com/zhangchenxu528/FACIAL
2020-11-09, FACEGAN: Facial Attribute Controllable rEenactment GAN, Soumya Tripathy et.al., Paper: http://arxiv.org/abs/2011.04439
2023-12-20, FAAC: Facial Animation Generation with Anchor Frame and Conditional Control for Superior Fidelity and Editability, Linze Li et.al., Paper: http://arxiv.org/abs/2312.03775
2022-08-17, Extreme-scale Talking-Face Video Upsampling with Audio-Visual Priors, Sindhu B Hegde et.al., Paper: http://arxiv.org/abs/2208.08118, Code: https://github.com/Sindhu-Hegde/video-super-resolver
2022-11-30, Extracting Semantic Knowledge from GANs with Unsupervised Learning, Jianjin Xu et.al., Paper: http://arxiv.org/abs/2211.16710
2023-02-14, Expressive Talking Head Video Encoding in StyleGAN2 Latent-Space, Trevine Oorloff et.al., Paper: http://arxiv.org/abs/2203.14512, Code: https://github.com/trevineoorloff/Encode-in-Style
2024-01-04, Expressive Speech-driven Facial Animation with controllable emotions, Yutong Chen et.al., Paper: http://arxiv.org/abs/2301.02008, Code: https://github.com/on1262/facialanimation
2015-11-20, ExpressionBot: An Emotive Lifelike Robotic Face for Face-to-Face Communication, Ali Mollahosseini et.al., Paper: http://arxiv.org/abs/1511.06502
2024-01-18, Exposing Lip-syncing Deepfakes from Mouth Inconsistencies, Soumyya Kanti Datta et.al., Paper: http://arxiv.org/abs/2401.10113
2024-04-01, Exploring Phonetic Context-Aware Lip-Sync For Talking Face Generation, Se Jin Park et.al., Paper: http://arxiv.org/abs/2305.19556
2023-09-11, ExpCLIP: Bridging Text and Facial Expressions via Semantic Alignment, Yicheng Zhong et.al., Paper: http://arxiv.org/abs/2308.14448
2021-04-07, Everything's Talkin': Pareidolia Face Reenactment, Linsen Song et.al., Paper: http://arxiv.org/abs/2104.03061, Code: https://github.com/Linsen13/EverythingTalking
2021-03-03, Estimating Uniqueness of I-Vector Representation of Human Voice, Erkam Sinan Tandogan et.al., Paper: http://arxiv.org/abs/2008.11985
2024-07-01, Enhancing Speech-Driven 3D Facial Animation with Audio-Visual Guidance from Lip Reading Expert, Han EunGi et.al., Paper: http://arxiv.org/abs/2407.01034
2017-12-07, End-to-end Learning for 3D Facial Animation from Raw Waveforms of Speech, Hai X. Pham et.al., Paper: http://arxiv.org/abs/1710.00920
2018-07-19, End-to-End Speech-Driven Facial Animation with Temporal GANs, Konstantinos Vougioukas et.al., Paper: http://arxiv.org/abs/1805.09313
2021-03-19, End-to-End Lip Synchronisation Based on Pattern Classification, You Jin Kim et.al., Paper: http://arxiv.org/abs/2005.08606
2022-03-30, End to End Lip Synchronization with a Temporal AutoEncoder, Yoav Shalev et.al., Paper: http://arxiv.org/abs/2203.16224, Code: https://github.com/itsyoavshalev/end-to-end-lip-synchronization-with-a-temporal-autoencoder
2024-06-21, EmpathyEar: An Open-source Avatar Multimodal Empathetic Chatbot, Hao Fei et.al., Paper: http://arxiv.org/abs/2406.15177, Code: https://github.com/scofield7419/empathyear
2024-11-23, EmotiveTalk: Expressive Talking Head Generation through Audio Information Decoupling and Emotional Video Diffusion, Haotian Wang et.al., Paper: http://arxiv.org/abs/2411.16726
2023-03-26, Emotionally Enhanced Talking Face Generation, Sahil Goyal et.al., Paper: http://arxiv.org/abs/2303.11548, Code: https://github.com/sahilg06/EmoGen
2023-06-06, Emotional Talking Head Generation based on Memory-Sharing and Attention-Augmented Networks, Jianrong Wang et.al., Paper: http://arxiv.org/abs/2306.03594
2023-09-26, Emotional Speech-Driven Animation with Content-Emotion Disentanglement, Radek Daněček et.al., Paper: http://arxiv.org/abs/2306.08990
2024-06-12, Emotional Conversation: Empowering Talking Faces with Cohesive Expression, Gaze and Pose Generation, Jiadong Liang et.al., Paper: http://arxiv.org/abs/2406.07895
2022-05-02, Emotion-Controllable Generalized Talking Face Generation, Sanjana Sinha et.al., Paper: http://arxiv.org/abs/2205.01155
2021-10-26, Emotion recognition in talking-face videos using persistent entropy and neural networks, Eduardo Paluzo-Hidalgo et.al., Paper: http://arxiv.org/abs/2110.13571, Code: https://github.com/cimagroup/audiovisual-emotionrecognitionusingtda
2019-08-11, Emotion Dependent Facial Animation from Affective Speech, Rizwan Sadiq et.al., Paper: http://arxiv.org/abs/1908.03904
2024-03-19, EmoVOCA: Speech-Driven Emotional 3D Talking Heads, Federico Nocentini et.al., Paper: http://arxiv.org/abs/2403.12886
2024-01-16, EmoTalker: Emotionally Editable Talking Face Generation via Diffusion Model, Bingyuan Zhang et.al., Paper: http://arxiv.org/abs/2401.08049
2023-08-25, EmoTalk: Speech-Driven Emotional Disentanglement for 3D Face Animation, Ziqiao Peng et.al., Paper: http://arxiv.org/abs/2303.11089, Code: https://github.com/psyai-net/EmoTalk_release
2024-08-01, EmoTalk3D: High-Fidelity Free-View Synthesis of Emotional 3D Talking Head, Qianyun He et.al., Paper: http://arxiv.org/abs/2408.00297
2024-02-02, EmoSpeaker: One-shot Fine-grained Emotion-Controlled Talking Face Generation, Guanwen Feng et.al., Paper: http://arxiv.org/abs/2402.01422
2024-08-21, EmoFace: Emotion-Content Disentangled Speech-Driven 3D Talking Face with Mesh Attention, Yihong Lin et.al., Paper: http://arxiv.org/abs/2408.11518
2024-07-17, EmoFace: Audio-driven Emotional 3D Face Animation, Chang Liu et.al., Paper: http://arxiv.org/abs/2407.12501, Code: https://github.com/sjtu-lucy/emoface
2024-12-12, EmoDubber: Towards High Quality and Emotion Controllable Movie Dubbing, Gaoxiang Cong et.al., Paper: http://arxiv.org/abs/2412.08988
2019-10-09, EmoCo: Visual Analysis of Emotion Coherence in Presentation Videos, Haipeng Zeng et.al., Paper: http://arxiv.org/abs/1907.12918
2024-04-29, Embedded Representation Learning Network for Animating Styled Video Portrait, Tianyong Wang et.al., Paper: http://arxiv.org/abs/2404.19038
2021-07-07, Egocentric Videoconferencing, Mohamed Elgharib et.al., Paper: http://arxiv.org/abs/2107.03109
2022-03-16, Efficient conditioned face animation using frontally-viewed embedding, Maxime Oquab et.al., Paper: http://arxiv.org/abs/2203.08765
2023-08-24, Efficient Region-Aware Neural Radiance Fields for High-Fidelity Talking Portrait Synthesis, Jiahe Li et.al., Paper: http://arxiv.org/abs/2307.09323, Code: https://github.com/fictionarry/er-nerf
2023-10-12, Efficient Emotional Adaptation for Audio-Driven Talking-Head Generation, Yuan Gan et.al., Paper: http://arxiv.org/abs/2309.04946, Code: https://github.com/yuangan/eat_code
2024-07-12, EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditions, Zhiyuan Chen et.al., Paper: http://arxiv.org/abs/2407.08136
2024-11-25, ESARM: 3D Emotional Speech-to-Animation via Reward Model from Automatically-Ranked Demonstrations, Xulong Zhang et.al., Paper: http://arxiv.org/abs/2411.13089
2024-09-11, EMOdiffhead: Continuously Emotional Control in Talking Head Generation via Diffusion, Jian Zhang et.al., Paper: http://arxiv.org/abs/2409.07255
2024-04-29, EMOPortraits: Emotion-enhanced Multimodal One-shot Head Avatars, Nikita Drobyshev et.al., Paper: http://arxiv.org/abs/2404.19110
2024-02-27, EMO: Emote Portrait Alive -- Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions, Linrui Tian et.al., Paper: http://arxiv.org/abs/2402.17485
2025-01-18, EMO2: End-Effector Guided Audio-Driven Avatar Video Generation, Linrui Tian et.al., Paper: http://arxiv.org/abs/2501.10687
2024-04-11, EFHQ: Multi-purpose ExtremePose-Face-HQ dataset, Trung Tuan Dao et.al., Paper: http://arxiv.org/abs/2312.17205
2024-04-02, EDTalk: Efficient Disentanglement for Emotional Talking Head Synthesis, Shuai Tan et.al., Paper: http://arxiv.org/abs/2404.01647
2018-08-19, Dynamic Temporal Alignment of Speech to Lips, Tavi Halperin et.al., Paper: http://arxiv.org/abs/1808.06250, Code: https://github.com/tavihalperin/AV-sync
2022-04-13, Dynamic Neural Textures: Generating Talking-Face Videos with Continuously Controllable Expressions, Zipeng Ye et.al., Paper: http://arxiv.org/abs/2204.06180
2020-10-05, Dynamic Facial Asset and Rig Generation from a Single Scan, Jiaman Li et.al., Paper: http://arxiv.org/abs/2010.00560
2022-12-23, Dubbing in Practice: A Large Scale Study of Human Localization With Insights for Automatic Dubbing, William Brannon et.al., Paper: http://arxiv.org/abs/2212.12137
2024-01-11, Dubbing for Everyone: Data-Efficient Visual Dubbing using Neural Rendering Priors, Jack Saunders et.al., Paper: http://arxiv.org/abs/2401.06126
2024-06-13, DubWise: Video-Guided Speech Duration Control in Multimodal LLM-based Text-to-Speech for Dubbing, Neha Sahipjohn et.al., Paper: http://arxiv.org/abs/2406.08802
2023-11-13, DualTalker: A Cross-Modal Dual Learning Approach for Speech-Driven 3D Facial Animation, Guinan Su et.al., Paper: http://arxiv.org/abs/2311.04766
2020-09-12, DualLip: A System for Joint Lip Reading and Generation, Weicong Chen et.al., Paper: http://arxiv.org/abs/2009.05784
2023-12-15, DreamTalk: When Expressive Talking Head Generation Meets Diffusion Probabilistic Models, Yifeng Ma et.al., Paper: http://arxiv.org/abs/2312.09767
2024-09-16, DreamHead: Learning Spatial-Temporal Correspondence via Hierarchical Diffusion for Audio-driven Talking Head Synthesis, Fa-Ting Hong et.al., Paper: http://arxiv.org/abs/2409.10281
2023-04-01, DreamFace: Progressive Generation of Animatable 3D Faces under Text Guidance, Longwen Zhang et.al., Paper: http://arxiv.org/abs/2304.03117
2024-09-27, Diverse Code Query Learning for Speech-Driven Facial Animation, Chunzhi Gu et.al., Paper: http://arxiv.org/abs/2409.19143
2024-11-29, Ditto: Motion-Space Diffusion for Controllable Realtime Talking Head Synthesis, Tianqi Li et.al., Paper: http://arxiv.org/abs/2411.19509
2023-03-26, Distributed Solution of the Inverse Rig Problem in Blendshape Facial Animation, Stevo Racković et.al., Paper: http://arxiv.org/abs/2303.06370
2019-12-20, Disentangling Style and Content in Anime Illustrations, Sitao Xiang et.al., Paper: http://arxiv.org/abs/1905.10742
2020-05-04, Disentangled Speech Embeddings using Cross-modal Self-supervision, Arsha Nagrani et.al., Paper: http://arxiv.org/abs/2002.08742
2023-03-14, DisCoHead: Audio-and-Video-Driven Talking Head Generation by Disentangled Control of Head Pose and Facial Expressions, Geumbyeol Hwang et.al., Paper: http://arxiv.org/abs/2303.07697, Code: https://github.com/deepbrainai-research/koeba
2023-12-02, DiffusionTalker: Personalization and Acceleration for Speech-Driven 3D Face Diffuser, Peng Chen et.al., Paper: http://arxiv.org/abs/2311.16565
2024-03-25, DiffusionAct: Controllable Diffusion Autoencoder for One-shot Face Reenactment, Stella Bounareli et.al., Paper: http://arxiv.org/abs/2403.17217
2023-07-29, Diffused Heads: Diffusion Models Beat GANs on Talking-Face Generation, Michał Stypułkowski et.al., Paper: http://arxiv.org/abs/2301.03396
2023-09-14, DiffTalker: Co-driven audio-image diffusion for talking faces via intermediate landmarks, Zipeng Qi et.al., Paper: http://arxiv.org/abs/2309.07509
2023-04-20, DiffTalk: Crafting Diffusion Models for Generalized Audio-Driven Portraits Animation, Shuai Shen et.al., Paper: http://arxiv.org/abs/2301.03786, Code: https://github.com/sstzal/DiffTalk
2024-09-11, DiffTED: One-shot Audio-driven TED Talk Video Generation with Diffusion-based Co-speech Gestures, Steven Hogue et.al., Paper: http://arxiv.org/abs/2409.07649
2024-02-08, DiffSpeaker: Speech-Driven 3D Facial Animation with Diffusion Transformer, Zhiyuan Ma et.al., Paper: http://arxiv.org/abs/2402.05712, Code: https://github.com/theericma/diffspeaker
2023-09-30, DiffPoseTalk: Speech-Driven Stylistic 3D Facial Animation and Head Pose Generation via Diffusion Models, Zhiyao Sun et.al., Paper: http://arxiv.org/abs/2310.00434
2024-01-12, DiffDub: Person-generic Visual Dubbing Using Inpainting Renderer with Diffusion Auto-encoder, Tao Liu et.al., Paper: http://arxiv.org/abs/2311.01811
2023-08-18, Diff2Lip: Audio Conditioned Diffusion Models for Lip-Synchronization, Soumik Mukhopadhyay et.al., Paper: http://arxiv.org/abs/2308.09716, Code: https://github.com/soumik-kanad/diff2lip
2023-08-12, DialogueNeRF: Towards Realistic Avatar Face-to-Face Conversation Video Generation, Yichao Yan et.al., Paper: http://arxiv.org/abs/2203.07931
2023-12-11, DiT-Head: High-Resolution Talking Head Synthesis using Diffusion Transformers, Aaron Mir et.al., Paper: http://arxiv.org/abs/2312.06400
2021-09-17, Detection of GAN-synthesized street videos, Omran Alamayreh et.al., Paper: http://arxiv.org/abs/2109.04991
2019-10-16, Designing Style Matching Conversational Agents, Deepali Aneja et.al., Paper: http://arxiv.org/abs/1910.07514
2022-03-15, Depth-Aware Generative Adversarial Network for Talking Head Video Generation, Fa-Ting Hong et.al., Paper: http://arxiv.org/abs/2203.06605, Code: https://github.com/harlanhong/cvpr2022-dagan
2020-07-20, Deformable Style Transfer, Sunnie S. Y. Kim et.al., Paper: http://arxiv.org/abs/2003.11038, Code: https://github.com/sunniesuhyoung/DST
2024-07-31, Deformable 3D Shape Diffusion Model, Dengsheng Chen et.al., Paper: http://arxiv.org/abs/2407.21428
2024-04-09, Deepfake Generation and Detection: A Benchmark and Survey, Gan Pei et.al., Paper: http://arxiv.org/abs/2403.17881, Code: https://github.com/flyingby/awesome-deepfake-generation-and-detection
2024-08-09, DeepSpeak Dataset v1.0, Sarah Barrington et.al., Paper: http://arxiv.org/abs/2408.05366
2018-12-20, DeepFakes: a New Threat to Face Recognition? Assessment and Detection, Pavel Korshunov et.al., Paper: http://arxiv.org/abs/1812.08685
2021-08-18, DeepFake MNIST+: A DeepFake Facial Animation Dataset, Jiajun Huang et.al., Paper: http://arxiv.org/abs/2108.07949, Code: https://github.com/huangjiadidi/DeepFakeMnist
2023-02-27, Deep Visual Forced Alignment: Learning to Align Transcription with Talking Face Video, Minsu Kim et.al., Paper: http://arxiv.org/abs/2303.08670
2018-05-29, Deep Video Portraits, Hyeongwoo Kim et.al., Paper: http://arxiv.org/abs/1805.11714
2023-08-21, Deep Person Generation: A Survey from the Perspective of Face, Pose and Cloth Synthesis, Tong Sha et.al., Paper: http://arxiv.org/abs/2109.02081
2020-08-02, Deep Multi-modality Soft-decoding of Very Low Bit-rate Face Videos, Yanhui Guo et.al., Paper: http://arxiv.org/abs/2008.01652
2018-12-22, Deep Audio-Visual Speech Recognition, Triantafyllos Afouras et.al., Paper: http://arxiv.org/abs/1809.02108
2019-07-24, Data-Driven Physical Face Inversion, Yeara Kozlov et.al., Paper: http://arxiv.org/abs/1907.10402
2023-01-23, Data standardization for robust lip sync, Chun Wang et.al., Paper: http://arxiv.org/abs/2202.06198
2020-05-11, Dancing to the Partisan Beat: A First Analysis of Political Communication on TikTok, Juan Carlos Medina Serrano et.al., Paper: http://arxiv.org/abs/2004.05478, Code: https://github.com/JuanCarlosCSE/TikTok
2023-12-10, DaGAN++: Depth-Aware Generative Adversarial Network for Talking Head Video Generation, Fa-Ting Hong et.al., Paper: http://arxiv.org/abs/2305.06225, Code: https://github.com/harlanhong/cvpr2022-dagan
2023-09-14, DT-NeRF: Decomposed Triplane-Hash Neural Radiance Fields for High-Fidelity Talking Portrait Synthesis, Yaoyu Su et.al., Paper: http://arxiv.org/abs/2309.07752
2023-12-21, DREAM-Talk: Diffusion-based Realistic Emotional Audio-driven Method for Single Image Talking Face Generation, Chenxu Zhang et.al., Paper: http://arxiv.org/abs/2312.13578
2023-03-01, DPE: Disentanglement of Pose and Expression for General Video Portrait Editing, Youxin Pang et.al., Paper: http://arxiv.org/abs/2301.06281, Code: https://github.com/Carlyx/DPE
2024-06-14, DNPM: A Neural Parametric Model for the Synthesis of Facial Geometric Details, Haitao Cao et.al., Paper: http://arxiv.org/abs/2405.19688
2023-03-07, DINet: Deformation Inpainting Network for Realistic Face Visually Dubbing on High Resolution Video, Zhimeng Zhang et.al., Paper: http://arxiv.org/abs/2303.03988, Code: https://github.com/MRzzm/DINet
2022-01-03, DFA-NeRF: Personalized Talking Head Generation via Disentangled Face Attributes Neural Rendering, Shunyu Yao et.al., Paper: http://arxiv.org/abs/2201.00791
2024-06-19, DF40: Toward Next-Generation Deepfake Detection, Zhiyuan Yan et.al., Paper: http://arxiv.org/abs/2406.13495
2023-09-12, DF-TransFusion: Multimodal Deepfake Detection via Lip-Audio Cross-Attention and Facial Self-Attention, Aaditya Kharel et.al., Paper: http://arxiv.org/abs/2309.06511
2023-08-23, DF-3DFace: One-to-Many Speech Synchronized 3D Face Animation with Diffusion, Se Jin Park et.al., Paper: http://arxiv.org/abs/2310.05934
2024-12-28, DEGSTalk: Decomposed Per-Embedding Gaussian Fields for Hair-Preserving Talking Face Synthesis, Kaijun Deng et.al., Paper: http://arxiv.org/abs/2412.20148, Code: https://github.com/cvi-szu/degstalk
2024-08-20, DEGAS: Detailed Expressions on Full-Body Gaussian Avatars, Zhijing Shao et.al., Paper: http://arxiv.org/abs/2408.10588
2024-08-12, DEEPTalk: Dynamic Emotion Embedding for Probabilistic Speech-Driven 3D Face Animation, Jisoo Kim et.al., Paper: http://arxiv.org/abs/2408.06010
2024-10-18, DAWN: Dynamic Frame Avatar with Non-autoregressive Diffusion Framework for Talking Head Video Generation, Hanbo Cheng et.al., Paper: http://arxiv.org/abs/2410.13726, Code: https://github.com/hanbo-cheng/dawn-pytorch
2024-03-01, DAE-Talker: High Fidelity Speech-Driven Talking Face Generation with Diffusion Autoencoder, Chenpeng Du et.al., Paper: http://arxiv.org/abs/2303.17550
2023-03-05, Cyber Vaccine for Deepfake Immunity, Ching-Chun Chang et.al., Paper: http://arxiv.org/abs/2303.02659
2022-06-29, Cut Inner Layers: A Structured Pruning Strategy for Efficient U-Net GANs, Bo-Kyeong Kim et.al., Paper: http://arxiv.org/abs/2206.14658
2023-10-17, CorrTalk: Correlation Between Hierarchical Speech and Facial Activity Variances for 3D Animation, Zhaojie Chu et.al., Paper: http://arxiv.org/abs/2310.11295
2024-06-05, Controllable Talking Face Generation by Implicit Facial Keypoints Editing, Dong Zhao et.al., Paper: http://arxiv.org/abs/2406.02880
2023-04-27, Controllable One-Shot Face Video Synthesis With Semantic Aware Prior, Kangning Liu et.al., Paper: http://arxiv.org/abs/2304.14471
2023-11-28, Continuously Controllable Facial Expression Editing in Talking Face Videos, Zhiyao Sun et.al., Paper: http://arxiv.org/abs/2209.08289
2024-02-28, Context-aware Talking Face Video Generation, Meidai Xuanyuan et.al., Paper: http://arxiv.org/abs/2402.18092
2023-09-20, Context-Aware Talking-Head Video Editing, Songlin Yang et.al., Paper: http://arxiv.org/abs/2308.00462
2024-08-14, Content and Style Aware Audio-Driven Facial Animation, Qingju Liu et.al., Paper: http://arxiv.org/abs/2408.07005
2024-11-23, ConsistentAvatar: Learning to Diffuse Fully Consistent Talking Head Avatar with Temporal Guidance, Haijie Yang et.al., Paper: http://arxiv.org/abs/2411.15436
2022-10-07, Compressing Video Calls using Synthetic Talking Heads, Madhav Agarwal et.al., Paper: http://arxiv.org/abs/2210.03692
2024-07-04, Compressed Skinning for Facial Blendshapes, Ladislav Kavan et.al., Paper: http://arxiv.org/abs/2406.11597
2024-11-20, Comparative Analysis of Audio Feature Extraction for Real-Time Talking Portrait Synthesis, Pegah Salehi et.al., Paper: http://arxiv.org/abs/2411.13209, Code: https://github.com/pegahs1993/whisper-afe-talkingheadsgen
2023-04-03, CodeTalker: Speech-Driven 3D Facial Animation with Discrete Motion Prior, Jinbo Xing et.al., Paper: http://arxiv.org/abs/2301.02379, Code: https://github.com/Doubiiu/CodeTalker
2023-10-12, CleftGAN: Adapting A Style-Based Generative Adversarial Network To Create Images Depicting Cleft Lip Deformity, Abdullah Hayajneh et.al., Paper: http://arxiv.org/abs/2310.07969, Code: https://github.com/abdullah-tamu/CleftGAN
2023-11-12, ChatAnything: Facetime Chat with LLM-Enhanced Personas, Yilin Zhao et.al., Paper: http://arxiv.org/abs/2311.06772
2024-10-14, Character-aware audio-visual subtitling in context, Jaesung Huh et.al., Paper: http://arxiv.org/abs/2410.11068
1998-07-31, Character design for soccer commentary, Kim Binsted et.al., Paper: http://arxiv.org/abs/cmp-lg/9807012
2019-05-08, Capture, Learning, and Synthesis of 3D Speaking Styles, Daniel Cudeiro et.al., Paper: http://arxiv.org/abs/1905.03079, Code: https://github.com/TimoBolkart/voca
2024-04-29, CSTalk: Correlation Supervised Speech-driven 3D Emotional Facial Animation Generation, Xiangyu Liang et.al., Paper: http://arxiv.org/abs/2404.18604
2023-05-23, CPNet: Exploiting CLIP-based Attention Condenser and Probability Map Guidance for High-fidelity Talking Face Generation, Jingning Xu et.al., Paper: http://arxiv.org/abs/2305.13962
2023-11-15, CP-EB: Talking Face Generation with Controllable Pose and Eye Blinking Embedding, Jianzong Wang et.al., Paper: http://arxiv.org/abs/2311.08673
2024-02-21, Bring Your Own Character: A Holistic Solution for Automatic Facial Animation Generation of Customized Characters, Zechen Bai et.al., Paper: http://arxiv.org/abs/2402.13724, Code: https://github.com/showlab/byoc
2023-10-31, Breathing Life into Faces: Speech-driven 3D Facial Animation with Natural Head Pose and Detailed Shape, Wei Zhao et.al., Paper: http://arxiv.org/abs/2310.20240
2021-11-02, BiosecurID: a multimodal biometric database, Julian Fierrez et.al., Paper: http://arxiv.org/abs/2111.03472
2021-07-27, Beyond Voice Identity Conversion: Manipulating Voice Attributes by Adversarial Learning of Structured Disentangled Representations, Laurent Benaroya et.al., Paper: http://arxiv.org/abs/2107.12346
2024-10-14, Beyond Fixed Topologies: Unregistered Training and Comprehensive Evaluation Metrics for 3D Talking Heads, Federico Nocentini et.al., Paper: http://arxiv.org/abs/2410.11041
2023-11-28, BakedAvatar: Baking Neural Fields for Real-Time Head Avatar Synthesis, Hao-Bin Duan et.al., Paper: http://arxiv.org/abs/2311.05521, Code: https://github.com/buaavrcg/BakedAvatar
2023-09-12, Avatar Fingerprinting for Authorized Use of Synthetic Talking-Head Videos, Ekta Prashnani et.al., Paper: http://arxiv.org/abs/2305.03713
2023-04-17, Autoregressive GAN for Semantic Unconditional Head Motion Generation, Louis Airale et.al., Paper: http://arxiv.org/abs/2211.00987, Code: https://github.com/louisbearing/unconditionalheadmotion
2016-02-08, Automatic Face Reenactment, Pablo Garrido et.al., Paper: http://arxiv.org/abs/1602.02651
2022-09-19, AutoLV: Automatic Lecture Video Generator, Wenbin Wang et.al., Paper: http://arxiv.org/abs/2209.08795
2024-08-21, AutoDirector: Online Auto-scheduling Agents for Multi-sensory Composition, Minheng Ni et.al., Paper: http://arxiv.org/abs/2408.11564
2021-08-30, Audiovisual Speech Synthesis using Tacotron2, Ahmed Hussen Abdelaziz et.al., Paper: http://arxiv.org/abs/2008.00620
2021-02-18, AudioVisual Speech Synthesis: A brief literature review, Efthymios Georgiou et.al., Paper: http://arxiv.org/abs/2103.03927
2023-04-25, AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head, Rongjie Huang et.al., Paper: http://arxiv.org/abs/2304.12995, Code: https://github.com/aigc-audio/audiogpt
2024-05-30, Audio2Rig: Artist-oriented deep learning tool for facial animation, Bastien Arcelin et.al., Paper: http://arxiv.org/abs/2405.20412
2021-07-20, Audio2Head: Audio-driven One-shot Talking-head Generation with Natural Head Motion, Suzhen Wang et.al., Paper: http://arxiv.org/abs/2107.09293, Code: https://github.com/wangsuzhen/Audio2Head
2019-05-27, Audio2Face: Generating Speech/Face Animation from Single Audio with Attention-Based Bidirectional LSTM Networks, Guanzhong Tian et.al., Paper: http://arxiv.org/abs/1905.11142
2020-03-05, Audio-driven Talking Face Video Generation with Learning-based Personalized Head Pose, Ran Yi et.al., Paper: http://arxiv.org/abs/2002.10137, Code: https://github.com/yiranran/Audio-driven-TalkingFace-HeadPose
2023-12-11, Audio-driven Talking Face Generation by Overcoming Unintended Information Flow, Dogucan Yaman et.al., Paper: http://arxiv.org/abs/2307.09368
2024-07-08, Audio-driven High-resolution Seamless Talking Head Video Editing via StyleGAN, Jiacheng Su et.al., Paper: http://arxiv.org/abs/2407.05577
2024-05-08, Audio-Visual Target Speaker Extraction with Reverse Selective Auditory Attention, Ruijie Tao et.al., Paper: http://arxiv.org/abs/2404.18501
2024-05-07, Audio-Visual Speech Representation Expert for Enhanced Talking Face Video Generation and Evaluation, Dogucan Yaman et.al., Paper: http://arxiv.org/abs/2405.04327
2023-05-18, Audio-Visual Person-of-Interest DeepFake Detection, Davide Cozzolino et.al., Paper: http://arxiv.org/abs/2204.03083, Code: https://github.com/grip-unina/poi-forensics
2022-10-06, Audio-Visual Face Reenactment, Madhav Agarwal et.al., Paper: http://arxiv.org/abs/2210.02755, Code: https://github.com/mdv3101/AVFR-Gan
2023-09-15, Audio-Visual Active Speaker Extraction for Sparsely Overlapped Multi-talker Speech, Junjie Li et.al., Paper: http://arxiv.org/abs/2309.08408, Code: https://github.com/mrjunjieli/activeextract
2022-01-16, Audio-Driven Talking Face Video Generation with Dynamic Convolution Kernels, Zipeng Ye et.al., Paper: http://arxiv.org/abs/2201.05986
2023-04-18, Audio-Driven Talking Face Generation with Diverse yet Realistic Facial Animations, Rongliang Wu et.al., Paper: http://arxiv.org/abs/2304.08945
2021-05-20, Audio-Driven Emotional Video Portraits, Xinya Ji et.al., Paper: http://arxiv.org/abs/2104.07452
2023-06-20, Audio-Driven 3D Facial Animation from In-the-Wild Videos, Liying Lu et.al., Paper: http://arxiv.org/abs/2306.11541
2020-08-11, Audio- and Gaze-driven Facial Animation of Codec Avatars, Alexander Richard et.al., Paper: http://arxiv.org/abs/2008.05023
2023-12-15, Attention-Based VR Facial Animation with Visual Mouth Camera Guidance for Immersive Telepresence Avatars, Andre Rochow et.al., Paper: http://arxiv.org/abs/2312.09750
2022-03-08, Attention-Based Lip Audio-Visual Synthesis for Talking Face Generation in the Wild, Ganglai Wang et.al., Paper: http://arxiv.org/abs/2203.03984
2020-05-13, Arbitrary Talking Face Generation via Attentional Audio-Visual Coherence Learning, Hao Zhu et.al., Paper: http://arxiv.org/abs/1812.06589
2021-08-11, AnyoneNet: Synchronized Speech and Talking Head Generation for Arbitrary Person, Xinsheng Wang et.al., Paper: http://arxiv.org/abs/2108.04325
2018-05-21, Anime Style Space Exploration Using Metric Learning and Generative Adversarial Networks, Sitao Xiang et.al., Paper: http://arxiv.org/abs/1805.07997
2019-03-13, Animating an Autonomous 3D Talking Avatar, Dominik Borer et.al., Paper: http://arxiv.org/abs/1903.05448
2019-10-02, Animating Face using Disentangled Audio Representations, Gaurav Mittal et.al., Paper: http://arxiv.org/abs/1910.00726
2024-03-25, AnimateMe: 4D Facial Expressions via Diffusion Models, Dimitrios Gerogiannis et.al., Paper: http://arxiv.org/abs/2403.17213
2024-05-06, AniTalker: Animate Vivid and Diverse Talking Faces through Identity-Decoupled Facial Motion Encoding, Tao Liu et.al., Paper: http://arxiv.org/abs/2405.03121, Code: https://github.com/x-lance/anitalker
2024-03-26, AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation, Huawei Wei et.al., Paper: http://arxiv.org/abs/2403.17694, Code: https://github.com/scutzzj/aniportrait
2023-06-13, AniFaceDrawing: Anime Portrait Exploration during Your Sketching, Zhengyu Huang et.al., Paper: http://arxiv.org/abs/2306.07476
2024-06-19, AniFaceDiff: High-Fidelity Face Reenactment via Facial Parametric Conditioned Diffusion Models, Ken Chen et.al., Paper: http://arxiv.org/abs/2406.13272
2024-07-21, Anchored Diffusion for Video Face Reenactment, Idan Kligvasser et.al., Paper: http://arxiv.org/abs/2407.15153
2020-09-20, An Improved Approach of Intention Discovery with Machine Learning for POMDP-based Dialogue Management, Ruturaj Raval et.al., Paper: http://arxiv.org/abs/2009.09354
2024-01-27, An Implicit Physical Face Model Driven by Expression and Style, Lingchen Yang et.al., Paper: http://arxiv.org/abs/2401.15414
2022-03-10, An Audio-Visual Attention Based Multimodal Network for Fake Talking Face Videos Detection, Ganglai Wang et.al., Paper: http://arxiv.org/abs/2203.05178
2023-05-18, An Android Robot Head as Embodied Conversational Agent, Marcel Heisler et.al., Paper: http://arxiv.org/abs/2305.10945
2022-12-28, All's well that FID's well? Result quality and metric scores in GAN models for lip-sychronization tasks, Carina Geldhauser et.al., Paper: http://arxiv.org/abs/2212.13810
2024-03-23, Adaptive Super Resolution For One-Shot Talking-Head Generation, Luchuan Song et.al., Paper: http://arxiv.org/abs/2403.15944, Code: https://github.com/songluchuan/adasr-talkinghead
2020-11-30, Adaptive Compact Attention For Few-shot Video-to-video Translation, Risheng Huang et.al., Paper: http://arxiv.org/abs/2011.14695
2024-01-08, AdaMesh: Personalized Facial Expressions and Head Poses for Adaptive Speech-Driven 3D Facial Animation, Liyang Chen et.al., Paper: http://arxiv.org/abs/2310.07236
2023-08-02, Ada-TTA: Towards Adaptive High-Quality Text-to-Talking Avatar Synthesis, Zhenhui Ye et.al., Paper: http://arxiv.org/abs/2306.03504
2020-03-30, ActGAN: Flexible and Efficient One-shot Face Reenactment, Ivan Kosarevych et.al., Paper: http://arxiv.org/abs/2003.13840
2021-09-20, Accurate, Interpretable, and Fast Animation: An Iterative, Sparse, and Nonconvex Approach, Stevo Racković et.al., Paper: http://arxiv.org/abs/2109.08356
2023-03-27, Accurate and Interpretable Solution of the Inverse Rig for Realistic Blendshape Models with Quadratic Corrective Terms, Stevo Racković et.al., Paper: http://arxiv.org/abs/2302.04843
2024-02-25, AVI-Talking: Learning Audio-Visual Instructions for Expressive 3D Talking Face Generation, Yasheng Sun et.al., Paper: http://arxiv.org/abs/2402.16124
2020-10-25, APB2FaceV2: Real-Time Audio-Guided Multi-Face Reenactment, Jiangning Zhang et.al., Paper: http://arxiv.org/abs/2010.13017, Code: https://github.com/zhangzjn/APB2FaceV2
2020-04-30, APB2Face: Audio-guided face reenactment with auxiliary pose and blink signals, Jiangning Zhang et.al., Paper: http://arxiv.org/abs/2004.14569
2023-12-18, AE-NeRF: Audio Enhanced Neural Radiance Field for Few Shot Talking Head Synthesis, Dongze Li et.al., Paper: http://arxiv.org/abs/2312.10921
2021-08-19, AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis, Yudong Guo et.al., Paper: http://arxiv.org/abs/2103.11078, Code: https://github.com/YudongGuo/AD-NeRF
2019-07-23, A system for efficient 3D printed stop-motion face animation, Rinat Abdrashitov et.al., Paper: http://arxiv.org/abs/1907.10163
2023-04-28, A Unified Compression Framework for Efficient Speech-Driven Talking-Face Generation, Bo-Kyeong Kim et.al., Paper: http://arxiv.org/abs/2304.00471
2023-08-17, A Survey on Deep Multi-modal Learning for Body Language Recognition and Generation, Li Liu et.al., Paper: http://arxiv.org/abs/2308.08849, Code: https://github.com/wentaol86/awesome-body-language
2020-07-18, A Robust Interactive Facial Animation Editing System, Eloïse Berson et.al., Paper: http://arxiv.org/abs/2007.09367
2022-05-02, A Novel Speech-Driven Lip-Sync Model with CNN and LSTM, Xiaohong Li et.al., Paper: http://arxiv.org/abs/2205.00916
2021-05-05, A Neural Lip-Sync Framework for Synthesizing Photorealistic Virtual News Anchors, Ruobing Zheng et.al., Paper: http://arxiv.org/abs/2002.08700
2023-03-27, A Majorization-Minimization Based Method for Nonconvex Inverse Rig Problems in Facial Animation: Algorithm Derivation, Stevo Racković et.al., Paper: http://arxiv.org/abs/2205.04289
2020-08-23, A Lip Sync Expert Is All You Need for Speech to Lip Generation In The Wild, K R Prajwal et.al., Paper: http://arxiv.org/abs/2008.10010, Code: https://github.com/Rudrabha/Wav2Lip
2025-01-21, A Lightweight and Interpretable Deepfakes Detection Framework, Muhammad Umar Farooq et.al., Paper: http://arxiv.org/abs/2501.11927
2022-10-07, A Keypoint Based Enhancement Method for Audio Driven Free View Talking Head Synthesis, Yichen Han et.al., Paper: http://arxiv.org/abs/2210.03335
2022-07-27, A Hybrid Deep Animation Codec for Low-bitrate Video Conferencing, Goluck Konuko et.al., Paper: http://arxiv.org/abs/2207.13530
2019-10-15, A High-Fidelity Open Embodied Avatar with Lip Syncing and Expression Capabilities, Deepali Aneja et.al., Paper: http://arxiv.org/abs/1909.08766, Code: https://github.com/danmcduff/AvatarSim
1998-12-05, A High Quality Text-To-Speech System Composed of Multiple Neural Networks, Orhan Karaali et.al., Paper: http://arxiv.org/abs/cs/9812006
2022-08-01, A Feasibility Study on Image Inpainting for Non-cleft Lip Generation from Patients with Cleft Lip, Shuang Chen et.al., Paper: http://arxiv.org/abs/2208.01149, Code: https://github.com/chrischen1023/nclg-mt
2024-06-18, A Comprehensive Taxonomy and Analysis of Talking Head Synthesis: Techniques for Portrait Generation, Driving Mechanisms, and Editing, Ming Meng et.al., Paper: http://arxiv.org/abs/2406.10553
2024-07-24, A Comprehensive Review and Taxonomy of Audio-Visual Synchronization Techniques for Realistic Speech Animation, Jose Geraldo Fernandes et.al., Paper: http://arxiv.org/abs/2407.17430
2023-07-04, A Comprehensive Multi-scale Approach for Speech and Dynamics Synchrony in Talking Head Generation, Louis Airale et.al., Paper: http://arxiv.org/abs/2307.03270, Code: https://github.com/louisbearing/hmo-audio
2024-03-11, A Comparative Study of Perceptual Quality Metrics for Audio-driven Talking Head Videos, Weixia Zhang et.al., Paper: http://arxiv.org/abs/2403.06421, Code: https://github.com/zwx8981/adth-qa
2023-04-06, 4D Agnostic Real-Time Facial Animation Pipeline for Desktop Scenarios, Wei Chen et.al., Paper: http://arxiv.org/abs/2304.02814
2023-12-01, 3DiFACE: Diffusion-based Speech-driven 3D Facial Animation and Editing, Balamurugan Thambiraja et.al., Paper: http://arxiv.org/abs/2312.00870
2024-09-17, 3DFacePolicy: Speech-Driven 3D Facial Animation with Diffusion Policy, Xuanmeng Sha et.al., Paper: http://arxiv.org/abs/2409.10848
2021-04-25, 3D-TalkEmo: Learning to Synthesize 3D Emotional Talking Head, Qianyun Wang et.al., Paper: http://arxiv.org/abs/2104.12051
2023-11-05, 3D-Aware Talking-Head Video Motion Transfer, Haomiao Ni et.al., Paper: http://arxiv.org/abs/2311.02549
2019-08-29, 3D Face Pose and Animation Tracking via Eigen-Decomposition based Bayesian Approach, Ngoc-Trung Tran et.al., Paper: http://arxiv.org/abs/1908.11039
2020-08-29, "It took me almost 30 minutes to practice this". Performance and Production Practices in Dance Challenge Videos on TikTok, Daniel Klug et.al., Paper: http://arxiv.org/abs/2008.13040

(back to top)

Image Animation

2025-01-20, X-Dyna: Expressive Dynamic Human Image Animation, Di Chang et.al., Paper: http://arxiv.org/abs/2501.10021, Code: https://github.com/bytedance/x-dyna
2024-05-28, VividPose: Advancing Stable Video Diffusion for Realistic Human Image Animation, Qilin Wang et.al., Paper: http://arxiv.org/abs/2405.18156
2023-04-12, VidStyleODE: Disentangled Video Editing via StyleGAN and NeuralODEs, Moayed Haji Ali et.al., Paper: http://arxiv.org/abs/2304.06020
2015-03-16, Use of Effective Audio in E-learning Courseware, Kisor Ray et.al., Paper: http://arxiv.org/abs/1503.04837
2024-06-03, UniAnimate: Taming Unified Video Diffusion Models for Consistent Human Image Animation, Xiang Wang et.al., Paper: http://arxiv.org/abs/2406.01188
2020-12-01, Ultra-low bitrate video conferencing using deep image animation, Goluck Konuko et.al., Paper: http://arxiv.org/abs/2012.00346
2010-01-04, Tutoring System for Dance Learning, Rajkumar Kannan et.al., Paper: http://arxiv.org/abs/1001.0440
2024-03-05, Tuning-Free Noise Rectification for High Fidelity Image-to-Video Generation, Weijie Li et.al., Paper: http://arxiv.org/abs/2403.02827
2022-03-29, Thin-Plate Spline Motion Model for Image Animation, Jian Zhao et.al., Paper: http://arxiv.org/abs/2203.14367, Code: https://github.com/yoyo-nb/thin-plate-spline-motion-model
2023-09-26, Text-Guided Synthesis of Eulerian Cinemagraphs, Aniruddha Mahapatra et.al., Paper: http://arxiv.org/abs/2307.03190, Code: https://github.com/text2cinemagraph/text2cinemagraph
2024-10-31, TPC: Test-time Procrustes Calibration for Diffusion-based Human Image Animation, Sunjae Yoon et.al., Paper: http://arxiv.org/abs/2410.24037
2024-07-12, TCAN: Animating Human Images with Temporally Consistent Pose Guidance using Diffusion Models, Jeongho Kim et.al., Paper: http://arxiv.org/abs/2407.09012
2024-11-27, StableAnimator: High-Quality Identity-Preserving Human Image Animation, Shuyuan Tu et.al., Paper: http://arxiv.org/abs/2411.17697, Code: https://github.com/Francis-Rings/StableAnimator
2021-09-03, Sparse to Dense Motion Transfer for Face Image Animation, Ruiqi Zhao et.al., Paper: http://arxiv.org/abs/2109.00471
2022-07-19, Single Stage Virtual Try-on via Deformable Attention Flows, Shuai Bai et.al., Paper: http://arxiv.org/abs/2207.09161, Code: https://github.com/OFA-Sys/DAFlow
2021-04-07, Single Source One Shot Reenactment using Weighted motion From Paired Feature Points, Soumya Tripathy et.al., Paper: http://arxiv.org/abs/2104.03117
2018-01-31, RAPTOR I: Time-dependent radiative transfer in arbitrary spacetimes, Thomas Bronzwaer et.al., Paper: http://arxiv.org/abs/1801.10452
2021-03-22, PriorityCut: Occlusion-guided Regularization for Warp-based Image Animation, Wai Ting Cheung et.al., Paper: http://arxiv.org/abs/2103.11600
2023-07-09, Predictive Coding For Animation-Based Video Compression, Goluck Konuko et.al., Paper: http://arxiv.org/abs/2307.04187
2025-01-09, Perception-as-Control: Fine-grained Controllable Image Animation with 3D-aware Motion Representation, Yingjie Chen et.al., Paper: http://arxiv.org/abs/2501.05020
2024-03-25, PIA: Your Personalized Image Animator via Plug-and-Play Modules in Text-to-Image Models, Yiming Zhang et.al., Paper: http://arxiv.org/abs/2312.13964, Code: https://github.com/open-mmlab/PIA
2022-04-05, Neural Fields in Visual Computing and Beyond, Yiheng Xie et.al., Paper: http://arxiv.org/abs/2111.11426
2022-11-30, NeRFInvertor: High Fidelity NeRF-GAN Inversion for Single-shot Real Image Animation, Yu Yin et.al., Paper: http://arxiv.org/abs/2211.17235
2015-02-04, Multimedia-Video for Learning, Kah Hean Chua et.al., Paper: http://arxiv.org/abs/1502.01090
2021-12-19, Move As You Like: Image Animation in E-Commerce Scenario, Borun Xu et.al., Paper: http://arxiv.org/abs/2112.13647
2023-11-30, Motion-Conditioned Image Animation for Video Editing, Wilson Yan et.al., Paper: http://arxiv.org/abs/2311.18827
2022-09-28, Motion Transformer for Unsupervised Image Animation, Jiale Tao et.al., Paper: http://arxiv.org/abs/2209.14024, Code: https://github.com/jialetao/motrans
2024-12-20, MotiF: Making Text Count in Image Animation with Motion Focal Loss, Shijie Wang et.al., Paper: http://arxiv.org/abs/2412.16153
2024-01-03, Moonshot: Towards Controllable Video Generation and Editing with Multimodal Conditions, David Junhao Zhang et.al., Paper: http://arxiv.org/abs/2401.01827, Code: https://github.com/salesforce/lavis
2013-01-25, Measurements of Martian Dust Devil Winds with HiRISE, David S. Choi et.al., Paper: http://arxiv.org/abs/1301.6130
2023-11-27, MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model, Zhongcong Xu et.al., Paper: http://arxiv.org/abs/2311.16498
2024-07-11, MOFA-Video: Controllable Image Animation via Generative Motion Field Adaptions in Frozen Image-to-Video Diffusion Model, Muyao Niu et.al., Paper: http://arxiv.org/abs/2405.20222, Code: https://github.com/myniuuu/mofa-video
2023-12-05, LivePhoto: Real Image Animation with Text-guided Motion Control, Xi Chen et.al., Paper: http://arxiv.org/abs/2312.02928
2024-11-24, LetsTalk: Latent Diffusion Transformer for Talking Video Synthesis, Haojie Zhang et.al., Paper: http://arxiv.org/abs/2411.16748
2022-03-17, Latent Image Animator: Learning to Animate Images via Latent Space Navigation, Yaohui Wang et.al., Paper: http://arxiv.org/abs/2203.09043
2023-10-11, LEO: Generative Latent Image Animator for Human Video Synthesis, Yaohui Wang et.al., Paper: http://arxiv.org/abs/2305.03989, Code: https://github.com/wyhsirius/LEO
2023-10-16, LAMP: Learn A Motion Pattern for Few-Shot-Based Video Generation, Ruiqi Wu et.al., Paper: http://arxiv.org/abs/2310.10769, Code: https://github.com/RQ-Wu/LAMP
2024-11-28, JoyVASA: Portrait and Animal Image Animation with Diffusion-Based Audio-Driven Facial Dynamics and Head Motion Generation, Xuyang Cao et.al., Paper: http://arxiv.org/abs/2411.09209, Code: https://github.com/jdh-algo/JoyVASA
2022-07-08, Jointly Harnessing Prior Structures and Temporal Consistency for Sign Language Video Generation, Yucheng Suo et.al., Paper: http://arxiv.org/abs/2207.03714
2025-01-15, Joint Learning of Depth and Appearance for Portrait Image Animation, Xinya Ji et.al., Paper: http://arxiv.org/abs/2501.08649
2021-10-26, Incremental Learning for Animal Pose Estimation using RBF k-DPP, Gaurav Kumar Nayak et.al., Paper: http://arxiv.org/abs/2110.13598
2022-10-04, Implicit Warping for Animation with Image Sets, Arun Mallya et.al., Paper: http://arxiv.org/abs/2210.01794
2022-03-29, Image Animation with Perturbed Masks, Yoav Shalev et.al., Paper: http://arxiv.org/abs/2011.06922, Code: https://github.com/itsyoavshalev/Image-Animation-with-Perturbed-Masks
2021-12-21, Image Animation with Keypoint Mask, Or Toledano et.al., Paper: http://arxiv.org/abs/2112.10457, Code: https://github.com/or-toledano/animation-with-keypoint-mask
2024-09-30, Illustrious: an Open Advanced Illustration Model, Sang Hyun Park et.al., Paper: http://arxiv.org/abs/2409.19946
2024-11-21, HumanVid: Demystifying Training Data for Camera-controllable Human Image Animation, Zhenzhi Wang et.al., Paper: http://arxiv.org/abs/2407.17438, Code: https://github.com/zhenzhiwang/humanvid
2024-09-29, High Quality Human Image Animation using Regional Supervision and Motion Blur Condition, Zhongcong Xu et.al., Paper: http://arxiv.org/abs/2409.19580
2024-06-16, Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation, Mingwang Xu et.al., Paper: http://arxiv.org/abs/2406.08801
2025-01-04, Hallo3: Highly Dynamic and Realistic Portrait Image Animation with Diffusion Transformer Networks, Jiahao Cui et.al., Paper: http://arxiv.org/abs/2412.00733, Code: https://github.com/fudan-generative-vision/hallo3
2024-10-14, Hallo2: Long-Duration and High-Resolution Audio-Driven Portrait Image Animation, Jiahao Cui et.al., Paper: http://arxiv.org/abs/2410.07718, Code: https://github.com/fudan-generative-vision/hallo2
2016-06-23, Gender and Interest Targeting for Sponsored Post Advertising at Tumblr, Mihajlo Grbovic et.al., Paper: http://arxiv.org/abs/1606.07189
2024-10-20, FrameBridge: Improving Image-to-Video Generation with Bridge Models, Yuji Wang et.al., Paper: http://arxiv.org/abs/2410.15371
2024-06-13, Follow-Your-Pose v2: Multiple-Condition Guided Character Image Animation for Stable Pose Control, Jingyun Xue et.al., Paper: http://arxiv.org/abs/2406.03035
2024-03-13, Follow-Your-Click: Open-domain Regional Image Animation via Short Prompts, Yue Ma et.al., Paper: http://arxiv.org/abs/2403.08268, Code: https://github.com/mayuelala/followyourclick
2020-10-01, First Order Motion Model for Image Animation, Aliaksandr Siarohin et.al., Paper: http://arxiv.org/abs/2003.00196, Code: https://github.com/AliaksandrSiarohin/first-order-model
2024-12-04, FLOAT: Generative Motion Latent Flow Matching for Audio-driven Talking Portrait, Taekyung Ki et.al., Paper: http://arxiv.org/abs/2412.01064
2024-05-29, Evaluating the effectiveness of sonification in science education using Edukoi, Lucrezia Guiotto Nai Fovino et.al., Paper: http://arxiv.org/abs/2405.18908
2024-07-12, EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditions, Zhiyuan Chen et.al., Paper: http://arxiv.org/abs/2407.08136
2023-11-27, DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors, Jinbo Xing et.al., Paper: http://arxiv.org/abs/2310.12190, Code: https://github.com/Doubiiu/DynamiCrafter
2023-02-02, Dreamix: Video Diffusion Models are General Video Editors, Eyal Molad et.al., Paper: http://arxiv.org/abs/2302.01329
2024-11-30, DreamDance: Animating Human Images by Enriching 3D Geometry Cues from 2D Poses, Yatian Pang et.al., Paper: http://arxiv.org/abs/2412.00397
2024-09-22, Dormant: Defending against Pose-driven Human Image Animation, Jiachen Zhou et.al., Paper: http://arxiv.org/abs/2409.14424, Code: https://github.com/Manu21JC/Dormant
2024-12-13, DisPose: Disentangling Pose Guidance for Controllable Human Image Animation, Hongxiang Li et.al., Paper: http://arxiv.org/abs/2412.09349, Code: https://github.com/lihxxx/dispose
2023-11-19, Differential Motion Evolution for Fine-Grained Motion Deformation in Unsupervised Image Animation, Peirong Liu et.al., Paper: http://arxiv.org/abs/2110.04658
2021-08-18, DeepFake MNIST+: A DeepFake Facial Animation Dataset, Jiajun Huang et.al., Paper: http://arxiv.org/abs/2108.07949, Code: https://github.com/huangjiadidi/DeepFakeMnist
2020-08-27, Deep Spatial Transformation for Pose-Guided Person Image Generation and Animation, Yurui Ren et.al., Paper: http://arxiv.org/abs/2008.12606, Code: https://github.com/RenYurui/Global-Flow-Local-Attention
2024-05-28, Controllable Longer Image Animation with Diffusion Models, Qiang Wang et.al., Paper: http://arxiv.org/abs/2405.17306
2023-01-14, Continuous odor profile monitoring to study olfactory navigation in small animals, Kevin S. Chen et.al., Paper: http://arxiv.org/abs/2301.05905
2024-01-17, Continuous Piecewise-Affine Based Motion Model for Image Animation, Hexiang Wang et.al., Paper: http://arxiv.org/abs/2401.09146, Code: https://github.com/devilpg/aaai2024-cpabmm
2024-07-23, Cinemo: Consistent and Controllable Image Animation with Motion Diffusion Models, Xin Ma et.al., Paper: http://arxiv.org/abs/2407.15642, Code: https://github.com/maxin-cn/Cinemo
2024-06-01, Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance, Shenhao Zhu et.al., Paper: http://arxiv.org/abs/2403.14781, Code: https://github.com/fudan-generative-vision/champ
2022-06-11, Bayesian Statistics Guided Label Refurbishment Mechanism: Mitigating Label Noise in Medical Image Classification, Mengdi Gao et.al., Paper: http://arxiv.org/abs/2106.12284, Code: https://github.com/neugmd/blrm
2023-09-25, Automatic Animation of Hair Blowing in Still Portrait Photos, Wenpeng Xiao et.al., Paper: http://arxiv.org/abs/2309.14207
2024-03-08, Audio-Synchronized Visual Animation, Lin Zhang et.al., Paper: http://arxiv.org/abs/2403.05659
2019-08-30, Animating Arbitrary Objects via Deep Motion Transfer, Aliaksandr Siarohin et.al., Paper: http://arxiv.org/abs/1812.08861, Code: https://github.com/AliaksandrSiarohin/monkey-net
2023-12-06, AnimateZero: Video Diffusion Models are Zero-Shot Image Animators, Jiwen Yu et.al., Paper: http://arxiv.org/abs/2312.03793, Code: https://github.com/vvictoryuki/animatezero
2023-12-04, AnimateAnything: Fine-Grained Open Domain Image Animation with Motion Guidance, Zuozhuo Dai et.al., Paper: http://arxiv.org/abs/2311.12886, Code: https://github.com/alibaba/animate-anything
2024-12-11, Animate-X: Universal Character Image Animation with Enhanced Motion Representation, Shuai Tan et.al., Paper: http://arxiv.org/abs/2410.10306
2021-06-23, Analisis Kualitas Layanan Website E-Commerce Bukalapak Terhadap Kepuasan Pengguna Mahasiswa Universitas Bina Darma Menggunakan Metode Webqual 4.0, Adellia et.al., Paper: http://arxiv.org/abs/2106.15342
2021-12-17, AI-Empowered Persuasive Video Generation: A Survey, Chang Liu et.al., Paper: http://arxiv.org/abs/2112.09401
2018-06-24, A Design of FPGA Based Small Animal PET Real Time Digital Signal Processing and Correction Logic, Jiaming Lu et.al., Paper: http://arxiv.org/abs/1806.09117
2018-10-09, 3D model silhouette-based tracking in depth images for puppet suit dynamic video-mapping, Guillaume Caron et.al., Paper: http://arxiv.org/abs/1810.03956
2022-03-25, 3D GAN Inversion for Controllable Portrait Image Animation, Connor Z. Lin et.al., Paper: http://arxiv.org/abs/2203.13441
2023-03-10, 3D Cinemagraphy from a Single Image, Xingyi Li et.al., Paper: http://arxiv.org/abs/2303.05724

(back to top)

Notes:

We have changed the sorting rule of the above table: papers are now ordered by the time of their latest update rather than by their initial publication date, so a recently revised paper appears earlier in the list.
However, the Recent Trends summary is still based on ten papers sorted by initial publication date.
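
The two orderings described above can be sketched as follows. This is a minimal illustration, not the repository's actual script: the `Paper` record, the function names, and the example entries are all hypothetical, and it assumes that "ten papers sorted by the initial publication date" means the ten most recently published ones.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Paper:
    """Hypothetical record; field names are assumptions for illustration."""
    title: str
    published: date  # initial publication date
    updated: date    # date of the latest revision


def sort_for_listing(papers):
    """Order papers for the tables: most recently updated first."""
    return sorted(papers, key=lambda p: p.updated, reverse=True)


def select_for_trends(papers, n=10):
    """Pick the n most recently published papers (assumed reading of the note)."""
    return sorted(papers, key=lambda p: p.published, reverse=True)[:n]


# Placeholder entries, not real papers from the list.
EXAMPLE = [
    Paper("Paper A", date(2023, 1, 10), date(2024, 6, 1)),
    Paper("Paper B", date(2024, 2, 5), date(2024, 2, 5)),
    Paper("Paper C", date(2022, 7, 20), date(2022, 9, 1)),
]

if __name__ == "__main__":
    for paper in sort_for_listing(EXAMPLE):
        print(paper.updated.isoformat(), paper.title)
```

In this sketch, "Paper A" leads the listing because it was revised most recently, even though "Paper B" was published later; the trends selection would rank "Paper B" first.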

Features added:

Support for a more reliable text parser. Link
Support for rich Markdown format (better at parsing experimental tables). Link
Support for analyzing more than 10 papers in a single conversation, which would otherwise exceed the attachment size limit.
