Starred repositories
VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open source community will contribute to this project.
MoGe: Unlocking Accurate Monocular Geometry Estimation for Open-Domain Images with Optimal Training Supervision
Fine-Grained Open Domain Image Animation with Motion Guidance
📺 An End-to-End Solution for High-Resolution and Long Video Generation Based on Transformer Diffusion
RoboTwin: Dual-Arm Robot Benchmark with Generative Digital Twins
A simple testbed for robotics manipulation policies
Official repo and evaluation implementation of VSI-Bench
Affordance-based Robot Manipulation with Flow Matching
[NeurIPS'23] Emergent Correspondence from Image Diffusion
SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
Guided Depth Map Super-resolution: A Survey (ACM CSUR 2023)
A generative world for general-purpose robotics & embodied AI learning.
You See it, You Got it: Learning 3D Creation on Pose-Free Videos at Scale
Implementation of π₀, the robotic foundation model architecture proposed by Physical Intelligence
DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion
Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
Memory-optimized training scripts for video models based on Diffusers
A collaboration-friendly studio for NeRFs
IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation
Official implementation of 'Motion Inversion For Video Customization'
HumanML3D: A large and diverse 3D human motion-language dataset.