Stars
OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340
Qwen2.5-VL is the multimodal large language model series developed by the Qwen team at Alibaba Cloud.
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
This is the code for the ECCV 2022 (Oral) paper "Fine-Grained Scene Graph Generation with Data Transfer".
A python toolkit for parsing captions (in natural language) into scene graphs (as symbolic representations).
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
(TCSVT 2024) Official PyTorch implementation of the paper "Continual Action Assessment via Task-Consistent Score-Discriminative Feature Distribution Modeling"
(ECCV 2024) Official repository of the paper "EgoExo-Fitness: Towards Egocentric and Exocentric Full-Body Action Understanding"
(ECCV 2024) Official implementation of the paper "DreamView: Injecting View-specific Text Guidance into Text-to-3D Generation"
A curated list of Action Quality Assessment and related area resources
A comprehensive collection of awesome research and other items about video domain adaptation
(TPAMI 2024) Official implementation of the paper "A Versatile Framework for Multi-scene Person Re-identification"
Point-Pattern Synthesis using Gabor and Random Filters [EGSR 2022]
The first work on cross-domain open-vocabulary action recognition, with a benchmark
(CVPR 2023) The PyTorch implementation of "AsyFOD: An Asymmetric Adaptation Paradigm for Few-Shot Domain Adaptive Object Detection".
(ECCV 2022) The official PyTorch implementation of "AcroFOD: An Adaptive Method for Cross-domain Few-shot Object Detection".
[CVPR 2024] The official implementation of the paper "Sculpting Holistic 3D Representation in Contrastive Language-Image-3D Pre-training"
Official PyTorch implementation of our ICCV 2023 paper "When Prompt-based Incremental Learning Does Not Meet Strong Pretraining"
Official PyTorch implementation of our CVPR 2022 paper "Learning to Imagine: Diversify Memory for Incremental Learning using Unlabeled Data"