-
OmniGen Public
Forked from VectorSpaceLab/OmniGen
OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340
Jupyter Notebook MIT License Updated Nov 4, 2024
-
OmniParser Public
Forked from microsoft/OmniParser
A simple screen parsing tool towards pure vision based GUI agent
Jupyter Notebook Creative Commons Attribution 4.0 International Updated Nov 1, 2024
-
Meissonic Public
Forked from viiika/Meissonic
Inference and Training Code of Meissonic
Python Apache License 2.0 Updated Oct 20, 2024
-
chenxwh.github.io Public
Forked from alshedivat/al-folio
A beautiful, simple, clean, and responsive Jekyll theme for academics
-
hart Public
Forked from mit-han-lab/hart
HART: Efficient Visual Generation with Hybrid Autoregressive Transformer
Python MIT License Updated Oct 19, 2024
-
Emu3 Public
Forked from baaivision/Emu3
Next-Token Prediction is All You Need
Python Apache License 2.0 Updated Oct 18, 2024
-
CogView3 Public
Forked from THUDM/CogView3
Text-to-image generation: CogView3-Plus and CogView3 (ECCV 2024)
Python Apache License 2.0 Updated Oct 14, 2024
-
t2v-turbo Public
Forked from Ji4chenLi/t2v-turbo
Code repository for T2V-Turbo
-
PMRF Public
Forked from ohayonguy/PMRF
Official implementation of Posterior-Mean Rectified Flow: Towards Minimum MSE Photo-Realistic Image Restoration
Python MIT License Updated Oct 12, 2024
-
ml-depth-pro Public
Forked from apple/ml-depth-pro
Depth Pro: Sharp Monocular Metric Depth in Less Than a Second.
-
Lotus Public
Forked from EnVision-Research/Lotus
Official Implementation of Lotus: Diffusion-based Visual Foundation Model for High-quality Dense Prediction
-
UnSAM Public
Forked from frank-xwang/UnSAM
[NeurIPS 2024] Code release for "Segment Anything without Supervision"
Jupyter Notebook Updated Oct 6, 2024
-
DepthCrafter Public
Forked from Tencent/DepthCrafter
DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos
Python Other Updated Oct 1, 2024
-
Upscale-A-Video Public
Forked from sczhou/Upscale-A-Video
[CVPR 2024] Upscale-A-Video: Temporal-Consistent Diffusion Model for Real-World Video Super-Resolution
Python Other Updated Sep 27, 2024
-
CogVLM2 Public
Forked from THUDM/CogVLM2
GPT4V-level open-source multi-modal model based on Llama3-8B
Python Apache License 2.0 Updated Sep 25, 2024
-
CogVideo Public
Forked from THUDM/CogVideo
Text-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
-
LLaMA-Omni Public
Forked from ictnlp/LLaMA-Omni
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
-
DiffSynth-Studio Public
Forked from modelscope/DiffSynth-Studio
Enjoy the magic of Diffusion models!
-
Depth-Anything-V2 Public
Forked from DepthAnything/Depth-Anything-V2
Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation
-
Omost Public
Forked from lllyasviel/Omost
Your image is almost there!
-
SadTalker Public
Forked from OpenTalker/SadTalker
(CVPR 2023) SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
-
HunyuanDiT Public
Forked from Tencent/HunyuanDiT
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
-
StoryDiffusion Public
Forked from HVision-NKU/StoryDiffusion
Create Magic Story!
-
OpenVoice Public
Forked from myshell-ai/OpenVoice
Instant voice cloning by MyShell.
-
PixArt-sigma Public
Forked from PixArt-alpha/PixArt-sigma
PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation
-
Kandinsky-2 Public
Forked from ai-forever/Kandinsky-2
Kandinsky 2: multilingual text2image latent diffusion model
-
AniPortrait Public
Forked from Zejun-Yang/AniPortrait
AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation
-
Smooth-Diffusion Public
Forked from SHI-Labs/Smooth-Diffusion
[CVPR 2024] Smooth Diffusion: Crafting Smooth Latent Spaces in Diffusion Models