Stars
WiLoR hand 3D pose estimation! Simplifying WiLoR into a Python package!
High performance self-hosted photo and video management solution.
Geometric Computer Vision Library for Spatial AI
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
Bring portraits to life in real time! ONNX/TensorRT support! Real-time portrait animation!
TensorRT plugin for the 3-dimensional grid sample operator
EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning
High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance
Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation
Code for Paper "UniAnimate: Taming Unified Video Diffusion Models for Consistent Human Image Animation".
Inference and training library for high-quality TTS models.
StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text
[ECCV2024] API code for T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy
Open-Sora: Democratizing Efficient Video Production for All
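🚀 "Douyin_TikTok_Download_API" is an out-of-the-box, high-performance asynchronous data scraping tool for Douyin, Kuaishou, TikTok, and Bilibili, supporting API calls and online batch parsing and downloading.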
This project aims to reproduce Sora (OpenAI's T2V model). We hope the open source community will contribute to this project.
SUPIR aims at developing Practical Algorithms for Photo-Realistic Image Restoration In the Wild. Our new online demo is also released at suppixel.ai.
1 minute of voice data can also be used to train a good TTS model! (few-shot voice cloning)
【CVPR 2024 Highlight】Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models
PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT
Foundational Models for State-of-the-Art Speech and Text Translation
[NeurIPS 2022] Towards Robust Blind Face Restoration with Codebook Lookup Transformer
StableLM: Stability AI Language Models
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Zero-shot Image-to-Image Translation [SIGGRAPH 2023]
A multi-voice TTS system trained with an emphasis on quality