Physical laws underpin all existence, and harnessing them for generative modeling opens boundless possibilities for advancing science and shaping the future!

BestJunYu/Awesome-Physics-aware-Generation

Awesome-Physics-aware-Generation

😄😄 Under Construction 😄😄


Physics-Aware Generation

  1. Visual Grounding of Learned Physical Models. Yunzhu Li, Toru Lin, Kexin Yi, Daniel M. Bear, Daniel L. K. Yamins, Jiajun Wu, Joshua B. Tenenbaum, Antonio Torralba, International Conference on Machine Learning (ICML), 2020. GitHub

  2. GASP: Gaussian Splatting for Physic-Based Simulations. Piotr Borycki, Weronika Smolak, Joanna Waczyńska, Marcin Mazur, Sławomir Tadeja, Przemysław Spurek, arXiv, 2024. Paper

  3. Unsupervised Learning for Physical Interaction through Video Prediction. Chelsea Finn, Ian Goodfellow, Sergey Levine, Advances in Neural Information Processing Systems (NeurIPS), 2016. Paper

  4. VR-GS: A Physical Dynamics-Aware Interactive Gaussian Splatting System in Virtual Reality. Ying Jiang, Chang Yu, Tianyi Xie, Xuan Li, Yutao Feng, Huamin Wang, Minchen Li, Henry Lau, Feng Gao, Yin Yang, Chenfanfu Jiang, ACM SIGGRAPH 2024 Conference Papers, 2024. Paper

  5. DoughNet: A Visual Predictive Model for Topological Manipulation of Deformable Objects. Dominik Bauer, Zhenjia Xu, Shuran Song, European Conference on Computer Vision (ECCV), 2024. GitHub

  6. Physically Compatible 3D Object Modeling from a Single Image. Minghao Guo, Bohan Wang, Pingchuan Ma, Tianyuan Zhang, Crystal Elaine Owens, Chuang Gan, Joshua B. Tenenbaum, Kaiming He, Wojciech Matusik, arXiv, 2024. Paper

  7. Feature Splatting: Language-Driven Physics-Based Scene Synthesis and Editing. Ri-Zhao Qiu, Ge Yang, Weijia Zeng, Xiaolong Wang, arXiv, 2024. Paper

  8. Dream to Manipulate: Compositional World Models Empowering Robot Imitation Learning with Imagination. Leonardo Barcellona, Andrii Zadaianchuk, Davide Allegro, Samuele Papa, Stefano Ghidoni, Efstratios Gavves, arXiv, 2024. Paper

  9. Align Your Gaussians: Text-to-4D with Dynamic 3D Gaussians and Composed Diffusion Models. Huan Ling, Seung Wook Kim, Antonio Torralba, Sanja Fidler, Karsten Kreis, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024. Paper

  10. LoopGaussian: Creating 3D Cinemagraph with Multi-view Images via Eulerian Motion Field. Jiyang Li, Lechao Cheng, Zhangye Wang, Tingting Mu, Jingxuan He, arXiv, 2024. Paper

  11. Disco4D: Disentangled 4D Human Generation and Animation from a Single Image. Hui En Pang, Shuai Liu, Zhongang Cai, Lei Yang, Tianwei Zhang, Ziwei Liu, arXiv, 2024. Paper

  12. Compositional 3D-aware Video Generation with LLM Director. Hanxin Zhu, Tianyu He, Anni Tang, Junliang Guo, Zhibo Chen, Jiang Bian, arXiv, 2024. Paper

  13. Improving Physics-Augmented Continuum Neural Radiance Field-Based Geometry-Agnostic System Identification with Lagrangian Particle Optimization. Takuhiro Kaneko, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024. Paper

  14. Generative Image Dynamics. Zhengqi Li, Richard Tucker, Noah Snavely, Aleksander Holynski, arXiv, 2023. Paper

  15. LivePhoto: Real Image Animation with Text-guided Motion Control. Xi Chen, Zhiheng Liu, Mengting Chen, Yutong Feng, Yu Liu, Yujun Shen, Hengshuang Zhao, arXiv, 2023. Paper

  16. VideoComposer: Compositional Video Synthesis with Motion Controllability. Xiang Wang, Hangjie Yuan, Shiwei Zhang, Dayou Chen, Jiuniu Wang, Yingya Zhang, Yujun Shen, Deli Zhao, Jingren Zhou, arXiv, 2023. Paper

  17. PAC-NeRF: Physics Augmented Continuum Neural Radiance Fields for Geometry-Agnostic System Identification. Xuan Li, Yi-Ling Qiao, Peter Yichen Chen, Krishna Murthy Jatavallabhula, Ming Lin, Chenfanfu Jiang, Chuang Gan, International Conference on Learning Representations (ICLR), 2023. Paper

  18. SEINE: Short-to-Long Video Diffusion Model for Generative Transition and Prediction. Xinyuan Chen, Yaohui Wang, Lingjun Zhang, Shaobin Zhuang, Xin Ma, Jiashuo Yu, Yali Wang, Dahua Lin, Yu Qiao, Ziwei Liu, arXiv, 2023. GitHub

  19. PhysGaussian: Physics-Integrated 3D Gaussians for Generative Dynamics. Tianyi Xie, Zeshun Zong, Yuxing Qiu, Xuan Li, Yutao Feng, Yin Yang, Chenfanfu Jiang, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024. GitHub

  20. InterDyn: Controllable Interactive Dynamics with Video Diffusion Models. Rick Akkerman, Haiwen Feng, Michael J. Black, Dimitrios Tzionas, Victoria Fernández Abrevaya, arXiv, 2024. Paper

  21. Motion Guidance: Diffusion-Based Image Editing with Differentiable Motion Estimators. Daniel Geng, Andrew Owens, International Conference on Learning Representations (ICLR), 2024. GitHub

  22. VideoPoet: A Large Language Model for Zero-Shot Video Generation. Dan Kondratyuk, Lijun Yu, Xiuye Gu, José Lezama, Jonathan Huang, Grant Schindler, Rachel Hornung, Vighnesh Birodkar, Jimmy Yan, Ming-Chang Chiu, Krishna Somandepalli, Hassan Akbari, Yair Alon, Yong Cheng, Josh Dillon, Agrim Gupta, Meera Hahn, Anja Hauth, David Hendon, Alonso Martinez, David Minnen, Mikhail Sirotenko, Kihyuk Sohn, Xuan Yang, Hartwig Adam, Ming-Hsuan Yang, Irfan Essa, Huisheng Wang, David A. Ross, Bryan Seybold, Lu Jiang, International Conference on Machine Learning (ICML), 2024. Paper

  23. Implicit Warping for Animation with Image Sets. Arun Mallya, Ting-Chun Wang, Ming-Yu Liu, Advances in Neural Information Processing Systems (NeurIPS), 2022. Paper

  24. Understanding Object Dynamics for Interactive Image-to-Video Synthesis. Andreas Blattmann, Timo Milbich, Michael Dorkenwald, Björn Ommer, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021. Paper

  25. Thin-Plate Spline Motion Model for Image Animation. Jian Zhao, Hui Zhang, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022. Paper

  26. Controllable Animation of Fluid Elements in Still Images. Aniruddha Mahapatra, Kuldeep Kulkarni, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022. Paper

  27. Animating Pictures with Eulerian Motion Fields. Aleksander Holynski, Brian Curless, Steven M. Seitz, Richard Szeliski, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021. Paper

  28. 3D-VLA: A 3D Vision-Language-Action Generative World Model. Haoyu Zhen, Xiaowen Qiu, Peihao Chen, Jincheng Yang, Xin Yan, Yilun Du, Yining Hong, Chuang Gan, arXiv, 2024. GitHub

  29. AC3D: Analyzing and Improving 3D Camera Control in Video Diffusion Transformers. Sherwin Bahmani, Ivan Skorokhodov, Guocheng Qian, Aliaksandr Siarohin, Willi Menapace, Andrea Tagliasacchi, David B. Lindell, Sergey Tulyakov, arXiv, 2024. Homepage

  30. Atlas3D: Physically Constrained Self-Supporting Text-to-3D for Simulation and Fabrication. Yunuo Chen, Tianyi Xie, Zeshun Zong, Xuan Li, Feng Gao, Yin Yang, Ying Nian Wu, Chenfanfu Jiang, arXiv, 2024. Paper

  31. Automated 3D Physical Simulation of Open-world Scene with Gaussian Splatting. Haoyu Zhao, Hao Wang, Xingyue Zhao, Hongqiu Wang, Zhiyu Wu, Chengjiang Long, Hua Zou, arXiv, 2024. Paper

  32. AutoVFX: Physically Realistic Video Editing from Natural Language Instructions. Hao-Yu Hsu, Zhi-Hao Lin, Albert Zhai, Hongchi Xia, Shenlong Wang, arXiv, 2024. Homepage

  33. Controllable Video Generation Through Global and Local Motion Dynamics. Aram Davtyan, Paolo Favaro, European Conference on Computer Vision (ECCV), 2022. Paper

  34. DiffuseBot: Breeding Soft Robots With Physics-Augmented Generative Diffusion Models. Tsun-Hsuan Wang, Juntian Zheng, Pingchuan Ma, Yilun Du, Byungchul Kim, Andrew Spielberg, Joshua Tenenbaum, Chuang Gan, Daniela Rus, Advances in Neural Information Processing Systems (NeurIPS), 2023. Homepage

  35. DreamPhysics: Learning Physics-Based 3D Dynamics with Video Diffusion Priors. Tianyu Huang, Haoze Zhang, Yihan Zeng, Zhilu Zhang, Hui Li, Wangmeng Zuo, Rynson W. H. Lau, arXiv, 2024. GitHub

  36. Generating 3D-Consistent Videos from Unposed Internet Photos. Gene Chou, Kai Zhang, Sai Bi, Hao Tan, Zexiang Xu, Fujun Luan, Bharath Hariharan, Noah Snavely, arXiv, 2024. Paper

  37. Generative Omnimatte: Learning to Decompose Video into Layers. Yao-Chih Lee, Erika Lu, Sarah Rumbley, Michal Geyer, Jia-Bin Huang, Tali Dekel, Forrester Cole, arXiv, 2024. Homepage

  38. GPT4Motion: Scripting Physical Motions in Text-to-Video Generation via Blender-Oriented GPT Planning. Jiaxi Lv, Yi Huang, Mingfu Yan, Jiancheng Huang, Jianzhuang Liu, Yifan Liu, Yafei Wen, Xiaoxin Chen, Shifeng Chen, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024. GitHub

  39. GR-2: A Generative Video-Language-Action Model with Web-Scale Knowledge for Robot Manipulation. Chi-Lam Cheang, Guangzeng Chen, Ya Jing, Tao Kong, Hang Li, Yifeng Li, Yuxiao Liu, Hongtao Wu, Jiafeng Xu, Yichu Yang, Hanbo Zhang, Minzhao Zhu, arXiv, 2024. Homepage

  40. Harmon: Whole-Body Motion Generation of Humanoid Robots from Language Descriptions. Zhenyu Jiang, Yuqi Xie, Jinhan Li, Ye Yuan, Yifeng Zhu, Yuke Zhu, arXiv, 2024. Paper

  41. HOI-Swap: Swapping Objects in Videos with Hand-Object Interaction Awareness. Zihui Xue, Mi Luo, Changan Chen, Kristen Grauman, arXiv, 2024. GitHub

  42. Improving Dynamic Object Interactions in Text-to-Video Generation with AI Feedback. Hiroki Furuta, Heiga Zen, Dale Schuurmans, Aleksandra Faust, Yutaka Matsuo, Percy Liang, Sherry Yang, arXiv, 2024. Paper

  43. Layered Controllable Video Generation. Jiahui Huang, Yuhe Jin, Kwang Moo Yi, Leonid Sigal, European Conference on Computer Vision (ECCV), 2022. Homepage

  44. Learn the Force We Can: Enabling Sparse Motion Control in Multi-Object Video Generation. Aram Davtyan, Paolo Favaro, Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2024. GitHub

  45. LIVE-GS: LLM Powers Interactive VR by Enhancing Gaussian Splatting. Haotian Mao, Zhuoxiong Xu, Siyue Wei, Yule Quan, Nianchen Deng, Xubo Yang, arXiv, 2024. Paper

  46. LLMPhy: Complex Physical Reasoning Using Large Language Models and World Models. Anoop Cherian, Radu Corcodel, Siddarth Jain, Diego Romeres, arXiv, 2024. Paper

  47. MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model. Zhongcong Xu, Jianfeng Zhang, Jun Hao Liew, Hanshu Yan, Jia-Wei Liu, Chenxu Zhang, Jiashi Feng, Mike Zheng Shou, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024. Homepage

  48. Motion Prompting: Controlling Video Generation with Motion Trajectories. Daniel Geng, Charles Herrmann, Junhwa Hur, Forrester Cole, Serena Zhang, Tobias Pfaff, Tatiana Lopez-Guevara, Carl Doersch, Yusuf Aytar, Michael Rubinstein, Chen Sun, Oliver Wang, Andrew Owens, Deqing Sun, arXiv, 2024. Homepage

  49. Motion-Conditioned Diffusion Model for Controllable Video Synthesis. Tsai-Shien Chen, Chieh Hubert Lin, Hung-Yu Tseng, Tsung-Yi Lin, Ming-Hsuan Yang, arXiv, 2023. Paper

  50. MotionCraft: Physics-based Zero-Shot Video Generation. Luca Savant Aira, Antonio Montanaro, Emanuele Aiello, Diego Valsesia, Enrico Magli, arXiv, 2024. GitHub

  51. PastNet: Introducing Physical Inductive Biases for Spatio-temporal Video Prediction. Hao Wu, Wei Xiong, Fan Xu, Xiao Luo, Chong Chen, Xian-Sheng Hua, Haixin Wang, Proceedings of the 32nd ACM International Conference on Multimedia, 2024. Paper

  52. Phy124: Fast Physics-Driven 4D Content Generation from a Single Image. Jiajing Lin, Zhenzhong Wang, Yongjie Hou, Yuzhou Tang, Min Jiang, arXiv, 2024. Paper

  53. PhyCAGE: Physically Plausible Compositional 3D Asset Generation from a Single Image. Han Yan, Mingrui Zhang, Yang Li, Chao Ma, Pan Ji, arXiv, 2024. Paper

  54. PhyRecon: Physically Plausible Neural Scene Reconstruction. Junfeng Ni, Yixin Chen, Bohan Jing, Nan Jiang, Bin Wang, Bo Dai, Puhao Li, Yixin Zhu, Song-Chun Zhu, Siyuan Huang, arXiv, 2024. GitHub

  55. Phys4DGen: A Physics-Driven Framework for Controllable and Efficient 4D Content Generation from a Single Image. Jiajing Lin, Zhenzhong Wang, Shu Jiang, Yongjie Hou, Min Jiang, arXiv, 2024. Homepage

  56. PhysDiff: Physics-Guided Human Motion Diffusion Model. Ye Yuan, Jiaming Song, Umar Iqbal, Arash Vahdat, Jan Kautz, Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023. Paper

  57. PhysDreamer: Physics-Based Interaction with 3D Objects via Video Generation. Tianyuan Zhang, Hong-Xing Yu, Rundi Wu, Brandon Y. Feng, Changxi Zheng, Noah Snavely, Jiajun Wu, William T. Freeman, European Conference on Computer Vision (ECCV), 2024. GitHub

  58. PhysGen: Rigid-Body Physics-Grounded Image-to-Video Generation. Shaowei Liu, Zhongzheng Ren, Saurabh Gupta, Shenlong Wang, European Conference on Computer Vision (ECCV), 2024. GitHub

  59. Physics Informed Neural Fields for Smoke Reconstruction with Sparse Data. Mengyu Chu, Lingjie Liu, Quan Zheng, Erik Franz, Hans-Peter Seidel, Christian Theobalt, Rhaleb Zayer, ACM Transactions on Graphics, 2022. Paper

  60. Physics-based Human Motion Estimation and Synthesis from Videos. Kevin Xie, Tingwu Wang, Umar Iqbal, Yunrong Guo, Sanja Fidler, Florian Shkurti, Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2021. Paper

  61. Physics-Driven Diffusion Models for Impact Sound Synthesis from Videos. Kun Su, Kaizhi Qian, Eli Shlizerman, Antonio Torralba, Chuang Gan, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023. Homepage

  62. Physics-Guided Human Motion Capture with Pose Probability Modeling. Jingyi Ju, Buzhen Huang, Chen Zhu, Zhihao Li, Yangang Wang, arXiv, 2023. GitHub

  63. PhysMotion: Physics-Grounded Dynamics From a Single Image. Xiyang Tan, Ying Jiang, Xuan Li, Zeshun Zong, Tianyi Xie, Yin Yang, Chenfanfu Jiang, arXiv, 2024. Homepage

  64. Physics3D: Learning Physical Properties of 3D Gaussians via Video Diffusion. Fangfu Liu, Hanyang Wang, Shunyu Yao, Shengjun Zhang, Jie Zhou, Yueqi Duan, arXiv, 2024. Homepage

  65. PID: Physics-Informed Diffusion Model for Infrared Image Generation. Fangyuan Mao, Jilin Mei, Shun Lu, Fuyang Liu, Liang Chen, Fangzhou Zhao, Yu Hu, arXiv, 2024. GitHub

  66. Procedural Generation of Videos to Train Deep Action Recognition Networks. César Roberto de Souza, Adrien Gaidon, Yohann Cabon, Antonio Manuel López Peña, arXiv, 2016. Paper

  67. Radiative Gaussian Splatting for Efficient X-ray Novel View Synthesis. Yuanhao Cai, Yixun Liang, Jiahao Wang, Angtian Wang, Yulun Zhang, Xiaokang Yang, Zongwei Zhou, Alan Yuille, arXiv, 2024. GitHub

  68. RoboGen: Towards Unleashing Infinite Data for Automated Robot Learning via Generative Simulation. Yufei Wang, Zhou Xian, Feng Chen, Tsun-Hsuan Wang, Yian Wang, Katerina Fragkiadaki, Zackory Erickson, David Held, Chuang Gan, arXiv, 2023. GitHub

  69. Scaling In-the-Wild Training for Diffusion-based Illumination Harmonization and Editing by Imposing Consistent Light Transport. Unknown, International Conference on Learning Representations (ICLR), 2025. GitHub

  70. SparseCtrl: Adding Sparse Controls to Text-to-Video Diffusion Models. Yuwei Guo, Ceyuan Yang, Anyi Rao, Maneesh Agrawala, Dahua Lin, Bo Dai, arXiv, 2023. Paper

  71. StableV2V: Stablizing Shape Consistency in Video-to-Video Editing. Chang Liu, Rui Li, Kaidong Zhang, Yunwei Lan, Dong Liu, arXiv, 2024. Paper

  72. StoryAgent: Customized Storytelling Video Generation via Multi-Agent Collaboration. Panwen Hu, Jin Jiang, Jianqi Chen, Mingfei Han, Shengcai Liao, Xiaojun Chang, Xiaodan Liang, arXiv, 2024. Paper

  73. Synthetic Vision: Training Vision-Language Models to Understand Physics. Vahid Balazadeh, Mohammadmehdi Ataei, Hyunmin Cheong, Amir Hosein Khasahmadi, Rahul G. Krishnan, arXiv, 2024. Paper

  74. Towards World Simulator: Crafting Physical Commonsense-Based Benchmark for Video Generation. Fanqing Meng, Jiaqi Liao, Xinyu Tan, Wenqi Shao, Quanfeng Lu, Kaipeng Zhang, Yu Cheng, Dianqi Li, Yu Qiao, Ping Luo, arXiv, 2024. GitHub

  75. Trajectory Optimization for Physics-Based Reconstruction of 3D Human Pose from Monocular Video. Erik Gärtner, Mykhaylo Andriluka, Hongyi Xu, Cristian Sminchisescu, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022. Homepage

  76. VD3D: Taming Large Video Diffusion Transformers for 3D Camera Control. Sherwin Bahmani, Ivan Skorokhodov, Aliaksandr Siarohin, Willi Menapace, Guocheng Qian, Michael Vasilkovsky, Hsin-Ying Lee, Chaoyang Wang, Jiaxu Zou, Andrea Tagliasacchi, David B. Lindell, Sergey Tulyakov, arXiv, 2024. Homepage

  77. Video Creation by Demonstration. Yihong Sun, Hao Zhou, Liangzhe Yuan, Jennifer J. Sun, Yandong Li, Xuhui Jia, Hartwig Adam, Bharath Hariharan, Long Zhao, Ting Liu, arXiv, 2024. GitHub

  78. Video2Game: Real-time, Interactive, Realistic and Browser-Compatible Environment from a Single Video. Hongchi Xia, Zhi-Hao Lin, Wei-Chiu Ma, Shenlong Wang, arXiv, 2024. Homepage

  79. VividDream: Generating 3D Scene with Ambient Dynamics. Yao-Chih Lee, Yi-Ting Chen, Andrew Wang, Ting-Hsuan Liao, Brandon Y. Feng, Jia-Bin Huang, arXiv, 2024. Paper

  80. Physically-aware Generative Network for 3D Shape Modeling. Mariem Mezghanni, Malika Boulkenafed, André Lieutier, Maks Ovsjanikov, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021. Paper

  81. Physically-guided Disentangled Implicit Rendering for 3D Face Modeling. Zhenyu Zhang, Yanhao Ge, Ying Tai, Weijian Cao, Renwang Chen, Kunlin Liu, Hao Tang, Xiaoming Huang, Chengjie Wang, Zhifeng Xie, Dongjin Huang, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022. Paper

  82. Few-Shot Physically-Aware Articulated Mesh Generation via Hierarchical Deformation. Xueyi Liu, Bin Wang, He Wang, Yi Li, Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023. GitHub

  83. CoCoGen: Physically-Consistent and Conditioned Score-based Generative Models for Forward and Inverse Problems. Christian Jacobsen, Yilin Zhuang, Karthik Duraisamy, arXiv, 2023. Paper

  84. LLplace: The 3D Indoor Scene Layout Generation and Editing via Large Language Model. Yixuan Yang, Junru Lu, Zixiang Zhao, Zhen Luo, James J. Q. Yu, Victor Sanchez, Feng Zheng, arXiv, 2024. Paper

  85. Scene Co-pilot: Procedural Text to Video Generation with Human in the Loop. Zhaofang Qian, Abolfazl Sharifi, Tucker Carroll, Ser-Nam Lim, arXiv, 2024. Homepage

  86. M2Diffuser: Diffusion-based Trajectory Optimization for Mobile Manipulation in 3D Scenes. Sixu Yan, Zeyu Zhang, Muzhi Han, Zaijin Wang, Qi Xie, Zhitian Li, Zhehan Li, Hangxin Liu, Xinggang Wang, Song-Chun Zhu, arXiv, 2024. Paper

  87. Motion Dreamer: Realizing Physically Coherent Video Generation through Scene-Aware Motion Reasoning. Tianshuo Xu, Zhifei Chen, Leyi Wu, Hao Lu, Yuying Chen, Lihui Jiang, Bingbing Liu, Yingcong Chen, arXiv, 2024. Paper

  88. PhysPart: Physically Plausible Part Completion for Interactable Objects. Rundong Luo, Haoran Geng, Congyue Deng, Puhao Li, Zan Wang, Baoxiong Jia, Leonidas Guibas, Siyuan Huang, arXiv, 2024. Paper

  89. SINGAPO: Single Image Controlled Generation of Articulated Parts in Objects. Jiayi Liu, Denys Iliash, Angel X. Chang, Manolis Savva, Ali Mahdavi-Amiri, arXiv, 2024. Paper

  90. PhyScene: Physically Interactable 3D Scene Synthesis for Embodied AI. Yandan Yang, Baoxiong Jia, Peiyuan Zhi, Siyuan Huang, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024. Paper

  91. Paint-it: Text-to-Texture Synthesis via Deep Convolutional Texture Map Optimization and Physically-Based Rendering. Kim Youwang, Tae-Hyun Oh, Gerard Pons-Moll, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024. GitHub

  92. PhysReaction: Physically Plausible Real-Time Humanoid Reaction Synthesis via Forward Dynamics Guided 4D Imitation. Yunze Liu, Changxi Chen, Chenjing Ding, Li Yi, arXiv, 2024. Paper

  93. Towards Physically Stable Motion Generation: A New Paradigm of Human Pose Representation. Qiongjie Cui, Zhenyu Lou, Zhenbo Song, Xiangbo Shu, IEEE Transactions on Circuits and Systems for Video Technology, 2024. Paper

  94. Generating Physically Realistic and Directable Human Motions from Multi-Modal Inputs. Aayam Shrestha, Pan Liu, German Ros, Kai Yuan, Alan Fern, European Conference on Computer Vision (ECCV), 2024. Homepage

  95. MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators. Shenghai Yuan, Jinfa Huang, Yujun Shi, Yongqi Xu, Ruijie Zhu, Bin Lin, Xinhua Cheng, Li Yuan, Jiebo Luo, arXiv, 2024. Homepage Paper Code

Physics Engine/Simulation Platforms

  1. Genesis: A Generative and Universal Physics Engine for Robotics and Beyond. Genesis Authors, arXiv, 2024. Homepage

  2. Pymunk. Pymunk Authors. Website

  3. Taichi: A language for high-performance computation on spatially sparse data structures. Yuanming Hu, Tzu-Mao Li, Luke Anderson, Jonathan Ragan-Kelley, Frédo Durand, ACM Transactions on Graphics, 2019. Paper

  4. DiffTaichi: Differentiable Programming for Physical Simulation. Yuanming Hu, Luke Anderson, Tzu-Mao Li, Qi Sun, Nathan Carr, Jonathan Ragan-Kelley, Frédo Durand, arXiv, 2019. Paper

  5. MuJoCo: A physics engine for model-based control. Emanuel Todorov, Tom Erez, Yuval Tassa, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2012. Paper

  6. FluidLab: A Differentiable Environment for Benchmarking Complex Fluid Manipulation. Zhou Xian, Bo Zhu, Zhenjia Xu, Hsiao-Yu Tung, Antonio Torralba, Katerina Fragkiadaki, Chuang Gan, arXiv, 2023. Paper

  7. SAPIEN: A SimulAted Part-based Interactive ENvironment. Fanbo Xiang, Yuzhe Qin, Kaichun Mo, Yikuan Xia, Hao Zhu, Fangchen Liu, Minghua Liu, Hanxiao Jiang, Yifu Yuan, He Wang, Li Yi, Angel X. Chang, Leonidas J. Guibas, Hao Su, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020. Paper

  8. ThreeDWorld: A Platform for Interactive Multi-Modal Physical Simulation. Chuang Gan, Jeremy Schwartz, Seth Alter, Damian Mrowca, Martin Schrimpf, James Traer, Julian De Freitas, Jonas Kubilius, Abhishek Bhandwaldar, Nick Haber, Megumi Sano, Kuno Kim, Elias Wang, Michael Lingelbach, Aidan Curtis, Kevin Feigelis, Daniel M. Bear, Dan Gutfreund, David Cox, Antonio Torralba, James J. DiCarlo, Joshua B. Tenenbaum, Josh H. McDermott, Daniel L. K. Yamins, arXiv, 2020. Paper

  9. UBSoft: A Simulation Platform for Robotic Skill Learning in Unbounded Soft Environments. Chunru Lin, Jugang Fan, Yian Wang, Zeyuan Yang, Zhehuan Chen, Lixing Fang, Tsun-Hsuan Wang, Zhou Xian, Chuang Gan, arXiv, 2024. Paper

  10. Isaac Gym: High Performance GPU-Based Physics Simulation For Robot Learning. Viktor Makoviychuk, Lukasz Wawrzyniak, Yunrong Guo, Michelle Lu, Kier Storey, Miles Macklin, David Hoeller, Nikita Rudin, Arthur Allshire, Ankur Handa, Gavriel State, arXiv, 2021. Paper

  11. PR2: A Physics- and Photo-realistic Testbed for Embodied AI and Humanoid Robots. Hangxin Liu, Qi Xie, Zeyu Zhang, Tao Yuan, Xiaokun Leng, Lining Sun, Song-Chun Zhu, Jingwen Zhang, Zhicheng He, Yao Su, arXiv, 2024. Paper

  12. PyBullet. PyBullet Authors. Website

  13. Nvidia PhysX. Nvidia PhysX Authors. GitHub

  14. Open Dynamics Engine. Russ Smith. Website

  15. Chrono: An open source multi-physics dynamics engine. Alessandro Tasora, Radu Serban, Hammad Mazhar, Arman Pazouki, Daniel Melanz, Jonathan Fleischmann, Michael Taylor, Hiroyuki Sugiyama, Dan Negrut, High Performance Computing in Science and Engineering, 2015. Paper

  16. Unity: A General Platform for Intelligent Agents. Arthur Juliani, Vincent-Pierre Berges, Ervin Teng, Andrew Cohen, Jonathan Harper, Chris Elion, Chris Goy, Yuan Gao, Hunter Henry, Marwan Mattar, Danny Lange, arXiv, 2018. Paper

  17. Brax -- A Differentiable Physics Engine for Large Scale Rigid Body Simulation. C. Daniel Freeman, Erik Frey, Anton Raichuk, Sertan Girgin, Igor Mordatch, Olivier Bachem, arXiv, 2021. Paper

  18. Design and use paradigms for Gazebo, an open-source multi-robot simulator. N. Koenig, A. Howard, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2004. Paper

  19. Webots™: Professional Mobile Robot Simulation. Olivier Michel, arXiv, 2004. Paper

  20. XPBD: Position-based simulation of compliant constrained dynamics. Miles Macklin, Matthias Müller, Nuttapong Chentanez, Proceedings of the 9th International Conference on Motion in Games (MIG), 2016. Paper

Physics Simulation

  1. Genesis: A Generative and Universal Physics Engine for Robotics and Beyond. Genesis Authors, arXiv, 2024. Homepage

  2. Pymunk. Pymunk Authors. Website

  3. Taichi: A language for high-performance computation on spatially sparse data structures. Yuanming Hu, Tzu-Mao Li, Luke Anderson, Jonathan Ragan-Kelley, Frédo Durand, ACM Transactions on Graphics, 2019. Paper

  4. DiffTaichi: Differentiable Programming for Physical Simulation. Yuanming Hu, Luke Anderson, Tzu-Mao Li, Qi Sun, Nathan Carr, Jonathan Ragan-Kelley, Frédo Durand, arXiv, 2019. Paper

  5. ThreeDWorld: A Platform for Interactive Multi-Modal Physical Simulation. Chuang Gan, Jeremy Schwartz, Seth Alter, Damian Mrowca, Martin Schrimpf, James Traer, Julian De Freitas, Jonas Kubilius, Abhishek Bhandwaldar, Nick Haber, Megumi Sano, Kuno Kim, Elias Wang, Michael Lingelbach, Aidan Curtis, Kevin Feigelis, Daniel M. Bear, Dan Gutfreund, David Cox, Antonio Torralba, James J. DiCarlo, Joshua B. Tenenbaum, Josh H. McDermott, Daniel L. K. Yamins, arXiv, 2020. Paper

  6. UBSoft: A Simulation Platform for Robotic Skill Learning in Unbounded Soft Environments. Chunru Lin, Jugang Fan, Yian Wang, Zeyuan Yang, Zhehuan Chen, Lixing Fang, Tsun-Hsuan Wang, Zhou Xian, Chuang Gan, arXiv, 2024. Paper

  7. Efficient Generation of Multimodal Fluid Simulation Data. Daniele Baieri, Donato Crisostomi, Stefano Esposito, Filippo Maggioli, Emanuele Rodolà, arXiv, 2023. Paper

  8. Learning to Simulate Complex Physics with Graph Networks. Alvaro Sanchez-Gonzalez, Jonathan Godwin, Tobias Pfaff, Rex Ying, Jure Leskovec, Peter W. Battaglia, International Conference on Machine Learning (ICML), 2020. Paper

  9. Complex Locomotion Skill Learning via Differentiable Physics. Yu Fang, Jiancheng Liu, Mingrui Zhang, Jiasheng Zhang, Yidong Ma, Minchen Li, Yuanming Hu, Chenfanfu Jiang, Tiantian Liu, arXiv, 2022. Paper

  10. Differentiable Simulation of Soft Multi-body Systems. Yi-Ling Qiao, Junbang Liang, Vladlen Koltun, Ming C. Lin, Advances in Neural Information Processing Systems (NeurIPS), 2021. GitHub

  11. DiffPD: Differentiable Projective Dynamics. Tao Du, Kui Wu, Pingchuan Ma, Sebastien Wah, Andrew Spielberg, Daniela Rus, Wojciech Matusik, ACM Transactions on Graphics, 2022. Paper

  12. PlasticineLab: A Soft-Body Manipulation Benchmark with Differentiable Physics. Zhiao Huang, Yuanming Hu, Tao Du, Siyuan Zhou, Hao Su, Joshua B. Tenenbaum, Chuang Gan, International Conference on Learning Representations (ICLR), 2021. Paper

  13. Graph networks as learnable physics engines for inference and control. Alvaro Sanchez-Gonzalez, Nicolas Heess, Jost Tobias Springenberg, Josh Merel, Martin Riedmiller, Raia Hadsell, Peter Battaglia, International Conference on Machine Learning (ICML), 2018. Paper

  14. Differentiable Simulation of Soft Multi-body Systems. Yi-Ling Qiao, Junbang Liang, Vladlen Koltun, Ming C. Lin, Advances in Neural Information Processing Systems (NeurIPS), 2021. Paper

  15. DiffTactile: A Physics-based Differentiable Tactile Simulator for Contact-rich Robotic Manipulation. Zilin Si, Gu Zhang, Qingwei Ben, Branden Romero, Zhou Xian, Chao Liu, Chuang Gan, arXiv, 2024. Paper

  16. Efficient Differentiable Simulation of Articulated Bodies. Yi-Ling Qiao, Junbang Liang, Vladlen Koltun, Ming C. Lin, International Conference on Machine Learning (ICML), 2021. Paper

  17. Interpretable Intuitive Physics Model. Tian Ye, Xiaolong Wang, James Davidson, Abhinav Gupta, Proceedings of the European Conference on Computer Vision (ECCV), 2018. Paper

  18. Learning to Identify Physical Parameters from Video Using Differentiable Physics. Rama Krishna Kandukuri, Jan Achterhold, Michael Möller, Jörg Stückler, arXiv, 2020. Paper

  19. Scalable Differentiable Physics for Learning and Control. Yi-Ling Qiao, Junbang Liang, Vladlen Koltun, Ming C. Lin, arXiv, 2020. Paper

  20. InfiniteWorld: A Unified Scalable Simulation Framework for General Visual-Language Robot Interaction. Pengzhen Ren, Min Li, Zhen Luo, Xinshuai Song, Ziwei Chen, Weijia Liufu, Yixuan Yang, Hao Zheng, Rongtao Xu, Zitong Huang, Tongsheng Ding, Luyang Xie, Kaidong Zhang, Changfei Fu, Yang Liu, Liang Lin, Feng Zheng, Xiaodan Liang, arXiv, 2024. Paper

  21. DiffXPBD: Differentiable Position-Based Simulation of Compliant Constraint Dynamics. Tuur Stuyck, Hsiao-yu Chen, arXiv, 2023. Paper

  22. Unified simulation of elastic rods, shells, and solids. Sebastian Martin, Peter Kaufmann, Mario Botsch, Eitan Grinspun, Markus Gross, ACM SIGGRAPH 2010 Papers, 2010. Paper

Physics Understanding (from Videos/Observations)

  1. Learning to Simulate Complex Physics with Graph Networks. Alvaro Sanchez-Gonzalez, Jonathan Godwin, Tobias Pfaff, Rex Ying, Jure Leskovec, Peter W. Battaglia, International Conference on Machine Learning (ICML), 2020. Paper

  2. Graph networks as learnable physics engines for inference and control. Alvaro Sanchez-Gonzalez, Nicolas Heess, Jost Tobias Springenberg, Josh Merel, Martin Riedmiller, Raia Hadsell, Peter Battaglia, International Conference on Machine Learning (ICML), 2018. Paper

  3. Interpretable Intuitive Physics Model. Tian Ye, Xiaolong Wang, James Davidson, Abhinav Gupta, Proceedings of the European Conference on Computer Vision (ECCV), 2018. Paper

  4. Reconstruction and Simulation of Elastic Objects with Spring-Mass 3D Gaussians. Licheng Zhong, Hong-Xing Yu, Jiajun Wu, Yunzhu Li, European Conference on Computer Vision (ECCV), 2024. GitHub

  5. GIC: Gaussian-Informed Continuum for Physical Property Identification and Simulation. Junhao Cai, Yuji Yang, Weihao Yuan, Yisheng He, Zilong Dong, Liefeng Bo, Hui Cheng, Qifeng Chen, Advances in Neural Information Processing Systems (NeurIPS), 2024. GitHub

  6. Gaussian Garments: Reconstructing Simulation-Ready Clothing with Photorealistic Appearance from Multi-View Video. Boxiang Rong, Artur Grigorev, Wenbo Wang, Michael J. Black, Bernhard Thomaszewski, Christina Tsalicoglou, Otmar Hilliges, arXiv, 2024. Paper

  7. PIE-NeRF: Physics-based Interactive Elastodynamics with NeRF. Yutao Feng, Yintong Shang, Xuan Li, Tianjia Shao, Chenfanfu Jiang, Yin Yang, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024. GitHub

  8. ElastoGen: 4D Generative Elastodynamics. Yutao Feng, Yintong Shang, Xiang Feng, Lei Lan, Shandian Zhe, Tianjia Shao, Hongzhi Wu, Kun Zhou, Hao Su, Chenfanfu Jiang, Yin Yang, arXiv, 2024. GitHub

  9. NVFi: Neural Velocity Fields for 3D Physics Learning from Dynamic Videos. Jinxi Li, Ziyang Song, Bo Yang, Advances in Neural Information Processing Systems (NeurIPS), 2023. Paper

  10. Inferring Hybrid Neural Fluid Fields from Videos. Hong-Xing Yu, Yang Zheng, Yuan Gao, Yitong Deng, Bo Zhu, Jiajun Wu, Advances in Neural Information Processing Systems (NeurIPS), 2023. Paper

  11. NeuroFluid: Fluid Dynamics Grounding with Particle-Driven Neural Radiance Fields. Shanyan Guan, Huayu Deng, Yunbo Wang, Xiaokang Yang, International Conference on Machine Learning (ICML), 2022. Paper

  12. Virtual Elastic Objects. Hsiao-yu Chen, Edgar Tretschk, Tuur Stuyck, Petr Kadlecek, Ladislav Kavan, Etienne Vouga, Christoph Lassner, arXiv, 2022. Paper

  13. gradSim: Differentiable simulation for system identification and visuomotor control. Krishna Murthy Jatavallabhula, Miles Macklin, Florian Golemo, Vikram Voleti, Linda Petrini, Martin Weiss, Breandan Considine, Jerome Parent-Levesque, Kevin Xie, Kenny Erleben, Liam Paull, Florian Shkurti, Derek Nowrouzezahrai, Sanja Fidler, International Conference on Learning Representations (ICLR), 2021. Paper

  14. One-Shot Real-to-Sim via End-to-End Differentiable Simulation and Rendering. Yifan Zhu, Tianyi Xiang, Aaron Dollar, Zherong Pan, arXiv, 2024. Paper

  15. Physical Property Understanding from Language-Embedded Feature Fields. Albert J. Zhai, Yuan Shen, Emily Y. Chen, Gloria X. Wang, Xinlei Wang, Sheng Wang, Kaiyu Guan, Shenlong Wang, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024. Paper

  16. GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image. Xiao Fu, Wei Yin, Mu Hu, Kaixuan Wang, Yuexin Ma, Ping Tan, Shaojie Shen, Dahua Lin, Xiaoxiao Long, arXiv, 2024. GitHub

  17. DensePhysNet: Learning Dense Physical Object Representations via Multi-step Dynamic Interactions. Zhenjia Xu, Jiajun Wu, Andy Zeng, Joshua B. Tenenbaum, Shuran Song, arXiv, 2019. Paper

  18. Visual Grounding of Learned Physical Models. Yunzhu Li, Toru Lin, Kexin Yi, Daniel M. Bear, Daniel L. K. Yamins, Jiajun Wu, Joshua B. Tenenbaum, Antonio Torralba, International Conference on Machine Learning (ICML), 2020. GitHub

  19. Learning Particle Dynamics for Manipulating Rigid Bodies, Deformable Objects, and Fluids. Yunzhu Li, Jiajun Wu, Russ Tedrake, Joshua B. Tenenbaum, Antonio Torralba, arXiv, 2018. Paper

  20. Physics 101: Learning Physical Object Properties from Unlabeled Videos. Jiajun Wu, Joseph J Lim, Hongyi Zhang, Joshua B Tenenbaum, William T Freeman, British Machine Vision Conference (BMVC), 2016. Paper

  21. Interaction Networks for Learning about Objects, Relations and Physics. Peter W. Battaglia, Razvan Pascanu, Matthew Lai, Danilo Rezende, Koray Kavukcuoglu, Advances in Neural Information Processing Systems (NeurIPS), 2016. Paper

  22. Visual Vibrometry: Estimating Material Properties from Small Motions in Video. Abe Davis, Katherine L. Bouman, Justin G. Chen, Michael Rubinstein, Oral Büyüköztürk, Frédo Durand, William T. Freeman, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017. Paper

  23. Disentangling Physical Dynamics from Unknown Factors for Unsupervised Video Prediction. Vincent Le Guen, Nicolas Thome, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020. Paper

  24. Dynamic Visual Reasoning by Learning Differentiable Physics Models from Video and Language. Mingyu Ding, Zhenfang Chen, Tao Du, Ping Luo, Joshua B. Tenenbaum, Chuang Gan, Advances in Neural Information Processing Systems (NeurIPS), 2021. GitHub

  25. Flexible Neural Representation for Physics Prediction. Damian Mrowca, Chengxu Zhuang, Elias Wang, Nick Haber, Li Fei-Fei, Joshua B Tenenbaum, Daniel L K Yamins, Advances in Neural Information Processing Systems (NeurIPS), 2018. Paper

  26. Galileo: Perceiving Physical Object Properties by Integrating a Physics Engine with Deep Learning. Jiajun Wu, Ilker Yildirim, Joseph J Lim, William T Freeman, Joshua B Tenenbaum, Advances in Neural Information Processing Systems (NeurIPS), 2015. Paper

  27. GASP: Gaussian Splatting for Physic-Based Simulations. Piotr Borycki, Weronika Smolak, Joanna Waczyńska, Marcin Mazur, Sławomir Tadeja, Przemysław Spurek, arXiv, 2024. Paper

  28. IntPhys 2019: A Benchmark for Visual Intuitive Physics Understanding. Ronan Riochet, Mario Ynocente Castro, Mathieu Bernard, Adam Lerer, Rob Fergus, Veronique Izard, Emmanuel Dupoux, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022. Paper

  29. Learned Neural Physics Simulation for Articulated 3D Human Pose Reconstruction. Mykhaylo Andriluka, Baruch Tabanpour, C. Daniel Freeman, Cristian Sminchisescu, European Conference on Computer Vision (ECCV), 2024. Paper

  30. Learning to See Physics via Visual De-animation. Jiajun Wu, Erika Lu, Pushmeet Kohli, William T Freeman, Joshua B Tenenbaum, Advances in Neural Information Processing Systems (NeurIPS), 2017. Paper

  31. NeuPhysics: Editable Neural Geometry and Physics from Monocular Videos. Yi-Ling Qiao, Alexander Gao, Ming C Lin, Advances in Neural Information Processing Systems (NeurIPS), 2022. Paper

  32. Neural Material Adaptor for Visual Grounding of Intrinsic Dynamics. Junyi Cao, Shanyan Guan, Yanhao Ge, Wei Li, Xiaokang Yang, Chao Ma, arXiv, 2024. Paper

  33. Physical Representation Learning and Parameter Identification from Video Using Differentiable Physics. Rama Krishna Kandukuri, Jan Achterhold, Michael Moeller, Joerg Stueckler, International Journal of Computer Vision, 2022. Paper

  34. Physics-as-Inverse-Graphics: Unsupervised Physical Parameter Estimation from Video. Miguel Jaques, Michael Burke, Timothy Hospedales, arXiv, 2019. Paper

  35. Unsupervised Learning for Physical Interaction through Video Prediction. Chelsea Finn, Ian Goodfellow, Sergey Levine, Advances in Neural Information Processing Systems (NeurIPS), 2016. Paper

  36. Visual Interaction Networks: Learning a Physics Simulator from Video. Nicholas Watters, Andrea Tacchetti, Théophane Weber, Razvan Pascanu, Peter Battaglia, Daniel Zoran, Advances in Neural Information Processing Systems (NeurIPS), 2017. Paper

  37. Visual Physics: Discovering Physical Laws from Videos. Pradyumna Chari, Chinmay Talegaonkar, Yunhao Ba, Achuta Kadambi, arXiv, 2019. Paper

  38. Learning to Identify Physical Parameters from Video Using Differentiable Physics. Rama Krishna Kandukuri, Jan Achterhold, Michael Möller, Jörg Stückler, arXiv, 2020. Paper

  39. Scalable Differentiable Physics for Learning and Control. Yi-Ling Qiao, Junbang Liang, Vladlen Koltun, Ming C. Lin, arXiv, 2020. Paper

Physics Evaluation

  1. Towards World Simulator: Crafting Physical Commonsense-Based Benchmark for Video Generation. Fanqing Meng, Jiaqi Liao, Xinyu Tan, Wenqi Shao, Quanfeng Lu, Kaipeng Zhang, Yu Cheng, Dianqi Li, Yu Qiao, Ping Luo, arXiv, 2024. GitHub

  2. GAIA: Rethinking Action Quality Assessment for AI-Generated Videos. Zijian Chen, Wei Sun, Yuan Tian, Jun Jia, Zicheng Zhang, Jiarui Wang, Ru Huang, Xiongkuo Min, Guangtao Zhai, Wenjun Zhang, arXiv, 2024. GitHub

  3. MiraData: A Large-Scale Video Dataset with Long Durations and Structured Captions. Xuan Ju, Yiming Gao, Zhaoyang Zhang, Ziyang Yuan, Xintao Wang, Ailing Zeng, Yu Xiong, Qiang Xu, Ying Shan, arXiv, 2024. GitHub

  4. Neuro-Symbolic Evaluation of Text-to-Video Models using Formal Verification. S. P. Sharan, Minkyu Choi, Sahil Shah, Harsh Goel, Mohammad Omama, Sandeep Chinchali, arXiv, 2024. Paper

  5. Quality Prediction of AI Generated Images and Videos: Emerging Trends and Opportunities. Abhijay Ghildyal, Yuanhan Chen, Saman Zadtootaghaj, Nabajeet Barman, Alan C. Bovik, arXiv, 2024. Paper

  6. T2VBench: Benchmarking Temporal Dynamics for Text-to-Video Generation. Pengliang Ji, Chuyang Xiao, Huilin Tai, Mingxiao Huo, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024. Paper

  7. TlTScore: Towards Long-Tail Effects in Text-to-Visual Evaluation with Generative Foundation Models. Pengliang Ji, Junchen Liu, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024. Paper

  8. VBench: Comprehensive Benchmark Suite for Video Generative Models. Ziqi Huang, Yinan He, Jiashuo Yu, Fan Zhang, Chenyang Si, Yuming Jiang, Yuanhan Zhang, Tianxing Wu, Qingyang Jin, Nattapol Chanpaisit, Yaohui Wang, Xinyuan Chen, Limin Wang, Dahua Lin, Yu Qiao, Ziwei Liu, arXiv, 2023. Homepage

  9. VBench++: Comprehensive and Versatile Benchmark Suite for Video Generative Models. Ziqi Huang, Fan Zhang, Xiaojie Xu, Yinan He, Jiashuo Yu, Ziyue Dong, Qianli Ma, Nattapol Chanpaisit, Chenyang Si, Yuming Jiang, Yaohui Wang, Xinyuan Chen, Ying-Cong Chen, Limin Wang, Dahua Lin, Yu Qiao, Ziwei Liu, arXiv, 2024. GitHub

  10. VideoPhy: Evaluating Physical Commonsense for Video Generation. Hritik Bansal, Zongyu Lin, Tianyi Xie, Zeshun Zong, Michal Yarom, Yonatan Bitton, Chenfanfu Jiang, Yizhou Sun, Kai-Wei Chang, Aditya Grover, arXiv, 2024. GitHub

  11. VideoScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation. Xuan He, Dongfu Jiang, Ge Zhang, Max Ku, Achint Soni, Sherman Siu, Haonan Chen, Abhranil Chandra, Ziyan Jiang, Aaran Arulraj, Kai Wang, Quy Duc Do, Yuansheng Ni, Bohan Lyu, Yaswanth Narsupalli, Rongqi Fan, Zhiheng Lyu, Yuchen Lin, Wenhu Chen, arXiv, 2024. Homepage

  12. What You See Is What Matters: A Novel Visual and Physics-Based Metric for Evaluating Video Generation Quality. Zihan Wang, Songlin Li, Lingyan Hao, Xinyu Hu, Bowen Song, arXiv, 2024. Paper

  13. WorldSimBench: Towards Video Generation Models as World Simulators. Yiran Qin, Zhelun Shi, Jiwen Yu, Xijun Wang, Enshen Zhou, Lijun Li, Zhenfei Yin, Xihui Liu, Lu Sheng, Jing Shao, Lei Bai, Wanli Ouyang, Ruimao Zhang, arXiv, 2024. GitHub

  14. PhyBench: A Physical Commonsense Benchmark for Evaluating Text-to-Image Models. Fanqing Meng, Wenqi Shao, Lixin Luo, Yahong Wang, Yiran Chen, Quanfeng Lu, Yue Yang, Tianshuo Yang, Kaipeng Zhang, Yu Qiao, Ping Luo, arXiv, 2024. Paper

  15. T2I-FactualBench: Benchmarking the Factuality of Text-to-Image Models with Knowledge-Intensive Concepts. Ziwei Huang, Wanggui He, Quanyu Long, Yandi Wang, Haoyuan Li, Zhelun Yu, Fangxun Shu, Long Chan, Hao Jiang, Leilei Gan, Fei Wu, arXiv, 2024. Paper

  16. ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of Text-to-Time-lapse Video Generation. Shenghai Yuan, Jinfa Huang, Yongqi Xu, Yaoyang Liu, Shaofeng Zhang, Yujun Shi, Ruijie Zhu, Xinhua Cheng, Jiebo Luo, Li Yuan, NeurIPS D&B Spotlight, 2024. Paper GitHub Homepage
