Starred repositories
Quantized Attention that achieves speedups of 2.1x and 2.7x compared to FlashAttention2 and xformers, respectively, without losing end-to-end metrics across various models.
Smooth Diffusion: Crafting Smooth Latent Spaces in Diffusion Models arXiv 2023 / CVPR 2024
[ECCV 2024] Tracking Meets LoRA: Faster Training, Larger Model, Stronger Performance
Implicit Style-Content Separation using B-LoRA
DNO: Optimizing Diffusion Noise Can Serve As Universal Motion Priors
[Embodied-AI-Survey-2024] Paper list and projects for Embodied AI
Official code repository for the paper: Open-Vocabulary Animal Keypoint Detection with Semantic-feature Matching
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
[CoRL22] Motion Style Transfer: Modular Low-Rank Adaptation for Deep Motion Forecasting
This repository contains a PyTorch implementation of "MHPro: Multi-Hypothesis Probabilistic Modeling for Human Mesh Recovery".
Reading notes about Multimodal Large Language Models, Large Language Models, and Diffusion Models
MADiff: Motion-Aware Mamba Diffusion Models for Hand Trajectory Prediction on Egocentric Videos
Download the mpi_inf_3dhp database; a CNN-based approach for 3D human body pose estimation from single RGB images
Contains the MPI-3D-HP data sets as a .zip file
Official code of "HybrIK: A Hybrid Analytical-Neural Inverse Kinematics Solution for 3D Human Pose and Shape Estimation", CVPR 2021
[ECCV'22] Official PyTorch Implementation of "Cross-Attention of Disentangled Modalities for 3D Human Mesh Recovery with Transformers"
VisualRWKV is the visual-enhanced version of the RWKV language model, enabling RWKV to handle various visual tasks.
[EMNLP 2024] RWKV-CLIP: A Robust Vision-Language Representation Learner
[ICLR 2022] Official implementation of UniFormer
RWKV-TS: Beyond Traditional Recurrent Neural Network for Time Series Tasks
Scaling RWKV-Like Architectures for Diffusion Models
A curated list of papers on the applications of RWKV in computer vision.