CVPR-2022-Papers

Website: https://cvpr2022.thecvf.com/

🐱	🐶	🐯	🐺
1.Other	2.Image Segmentation	3.Image Processing	4.Image Captioning
5.Object Detection	6.Object Tracking	7.Point Cloud	8.Action Detection
9.Human Pose Estimation	10.3D	11.Face	12.Image-to-Image Translation
13.GAN	14.Video	15.Transformer	16.Semi/self-supervised learning
17.Medical Image	18.Person Re-Identification	19.Neural Architecture Search	20.Autonomous vehicles
21.UAV/Remote Sensing/Satellite Image	22.Image Synthesis/Generation	23.Image Retrieval	24.Super-Resolution
25.Fine-Grained/Image Classification	26.GCN/GNN	27.Pose Estimation	28.Style Transfer
29.Augmented Reality/Virtual Reality/Robotics	30.Visual Answer Questions	31.Vision-Language	32.Data Augmentation
33.Human-Object Interaction	34.Model Compression/Knowledge Distillation/Pruning	35.OCR	36.Optical Flow
37.Contrastive Learning	38.Meta-Learning	39.Continual Learning	40.Adversarial Learning
41.Incremental Learning	42.Metric Learning	43.Multi-Task Learning	44.Federated Learning
45.Dense Prediction	46.Scene Graph Generation	47.Few/Zero-Shot Learning/Domain Generalization/Adaptation	48.Visual Grounding
49.Image Geo-localization	50.Anomaly Detection	51. Optical, Light Field Imaging	52.Human Motion Forecasting
53.Sign Language Translation	54.Dataset	55.Novel View Synthesis	56.Sound
57.Gaze Estimation	58.Neural rendering	59. Animation	60.Visual Emotion Analysis

Machine Translation

VALHALLA: Visual Hallucination for Machine Translation
🏠project

Computer-Aided Design (CAD)

Neural Face Identification in a 2D Wireframe Projection of a Manifold Object
⭐code

60.Visual Emotion Analysis

MDAN: Multi-level Dependent Attention Network for Visual Emotion Analysis

59.Animation

Image Animation

Thin-Plate Spline Motion Model for Image Animation
Character Animation
- Structured Local Radiance Fields for Human Avatar Modeling
3D Character Animation
- Skin Prediction
  - SkinningNet: Two-Stream Graph Convolutional Neural Network for Skinning Prediction of Synthetic Characters
    🏠project
3D Dance Generation
- Bailando: 3D Dance Generation by Actor-Critic GPT with Choreographic Memory

58.Neural rendering

57.Gaze Estimation

GazeOnce: Real-Time Multi-Person Gaze Estimation

56.Sound

Sound Source Localization
- Self-Supervised Predictive Learning: A Negative-Free Method for Sound Source Localization in Visual Scenes
  ⭐code

55.Novel View Synthesis

54.Dataset

53.Sign Language Translation

A Simple Multi-Modality Transfer Learning Baseline for Sign Language Translation

52.Human Motion Forecasting

51.Optical, geometric, light field imaging

Light Field
- Occlusion-Aware Cost Constructor for Light Field Depth Estimation
  ⭐code📰here
Deep Reconstruction
- Deep Hyperspectral-Depth Reconstruction Using Single Color-Dot Projection
  ⭐code🏠project📺video
Shutter Correction
- Learning Adaptive Warping for Real-World Rolling Shutter Correction
  ⭐code
Thermal Infrared Imaging
- Infrared Invisible Clothing: Hiding from Infrared Detectors at Multiple Angles in Real World
  😮oral

50.Anomaly Detection

49.Image Geo-localization

TransGeo: Transformer Is All You Need for Cross-view Image Geo-localization
⭐code
Visual Geolocation
- Rethinking Visual Geo-localization for Large-Scale Applications
  ⭐code
- Deep Visual Geo-localization Benchmark
  😮oral🏠project
Trajectory reconstruction
- MonoTrack: Shuttle trajectory reconstruction from monocular badminton video

48.Visual Grounding

Multi-View Transformer for 3D Visual Grounding
⭐code
Improving Visual Grounding with Visual-Linguistic Verification and Iterative Reasoning
⭐code
Visual positioning: locating a target through natural language (very interesting research)

47.Few/Zero-Shot Learning/Domain Generalization/Adaptation

Few-Shot
Zero-Shot
Domain Generalization
- Compound Domain Generalization via Meta-Knowledge Encoding
- Causality Inspired Representation Learning for Domain Generalization
- Towards Unsupervised Domain Generalization
  Task: domain generalization (DG), unsupervised learning, unsupervised domain generalization (UDG).
- Out-of-Domain Generalization
  - The Two Dimensions of Worst-case Training and the Integrated Effect for Out-of-domain Generalization
Domain Adaptation
- Continual Test-Time Domain Adaptation
  ⭐code
- Safe Self-Refinement for Transformer-based Domain Adaptation
  ⭐code📰here
- Source-Free Domain Adaptation via Distribution Estimation
  📰here
- Learning Distinctive Margin toward Active Domain Adaptation
  ⭐code
  📰here
- Unsupervised Domain Adaptation
  - Reusing the Task-specific Classifier as a Discriminator: Discriminator-free Adversarial Domain Adaptation
    ⭐code

46.Scene Graph Generation

HL-Net: Heterophily Learning Network for Scene Graph Generation
⭐code
Heterophily Learning Network
📰here
RU-Net: Regularized Unrolling Network for Scene Graph Generation
⭐code
Regularized Unrolling Network
📰here

45.Dense Prediction

Does Robustness on ImageNet Transfer to Downstream Tasks?

44.Federated Learning

43.Multi-Task Learning

👉🏼 You can learn about Multi-Task Learning through my course: https://courses.thinkautonomous.ai/hydranets

42.Metric Learning

Self-Taught Metric Learning without Labels

41.Incremental Learning

Incremental Learning
- Energy-based Latent Aligner for Incremental Learning
  ⭐code
- General Incremental Learning with Domain-aware Categorical Representations
Class Incremental Learning

40.Adversarial Learning

39.Continual Learning

38.Meta-Learning

37.Contrastive Learning

36.Optical Flow

👉🏼 You can learn about Optical Flow through: https://courses.thinkautonomous.ai/optical-flow

35.OCR

Scene Text Detection
- Towards End-to-End Unified Scene Text Detection and Layout Analysis
  ⭐code
- Pushing the Performance Limit of Scene Text Recognizer without Human Annotation
- Vision-Language Pre-Training for Boosting Scene Text Detectors
  Vision-language pre-training for scene text detection; the code will be open-sourced, but the repository address has not been announced.
Text Spotting
- Text Spotting Transformers
  ⭐code📰here
Logo Design
- Aesthetic Text Logo Synthesis via Content-aware Layout Inferring
  ⭐code
  📰CVPR 2022
Font Generation
- XMP-Font: Self-Supervised Cross-Modality Pre-training for Few-Shot Font Generation
- (Oral)Look Closer to Supervise Better: One-Shot Font Generation via Component-Based Discriminator
  (A direction of great commercial value)
- Few-Shot Font Generation by Learning Fine-Grained Local Styles
Text Recognition
- Open-set Text Recognition via Character-Context Decoupling
Table structure recognition
- Neural Collaborative Graph Machines for Table Structure Recognition
  📰here

34.Model Compression/Knowledge Distillation/Pruning

Knowledge Distillation
Model Compression
- CHEX: CHannel EXploration for CNN Model Compression
Model Pruning
- Revisiting Random Channel Pruning for Neural Network Compression
  ⭐code
  📰here

👉🏼 You can learn about this entire category through: https://courses.thinkautonomous.ai/neural-optimization

33.Human-Object Interaction

32.Data Augmentation

31.Vision-Language

30.Visual Answer Questions

29.Augmented Reality/Virtual Reality/Robotics

Object Goal Navigation
- Online Learning of Reusable Abstract Models for Object Goal Navigation
Virtual Try-On
- Dressing in the Wild by Watching Dance Videos
  🏠project
- Style-Based Global Appearance Flow for Virtual Try-On
  ⭐code
- ClothFormer: Taming Video Virtual Try-on in All Module
  😮oral⭐code🏠project📰here
AR
- Episodic Memory Question Answering
  😮oral⭐code
Robotics
- Hand-Object Pose Estimation
  - ArtiBoost: Boosting Articulated 3D Hand-Object Pose Estimation via Online Exploration and Synthesis
    ⭐code
    📰paper

28.Style Transfer

Pastiche Master: Exemplar-Based High-Resolution Portrait Style Transfer
⭐code
Industrial Style Transfer with Large-scale Geometric Warping and Content Preservation
⭐code
Motion Style Transfer
- Style-ERD: Responsive and Coherent Online Motion Style Transfer
Motion Transfer
- Structure-Aware Motion Transfer with Deformable Anchor Model
  ⭐code📰paper
Scene Stylization
- StylizedNeRF: Consistent 3D Scene Stylization as Stylized NeRF via 2D-3D Mutual Learning

27.Pose Estimation

OSOP: A Multi-Stage One Shot Object Pose Estimation Framework
OnePose: One-Shot Object Pose Estimation without CAD Models
⭐code🏠project📰paper
4D
- Revealing Occlusions with 4D Neural Fields
  😮oral⭐code🏠project
9D
- CPPF: Towards Robust Category-Level 9D Pose Estimation in the Wild
  ⭐code📰paper 📓
Monocular Object Pose Estimation
- EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation
  ⭐code
6D
3D Object Articulation
- Understanding 3D Object Articulation in Internet Videos
  🏠project
3D Object Pose Estimation
- Templates for 3D Object Pose Estimation Revisited: Generalization to New Objects and Robustness to Occlusions
  ⭐code

26.GCN/GNN

GNN

25.Fine-Grained/Image Classification

Fine-Grained Classification
- Dynamic MLP for Fine-Grained Image Classification by Leveraging Geographical and Temporal Information
  ⭐code📰here 📓
Image Classification
- DTFD-MIL: Double-Tier Feature Distillation Multiple Instance Learning for Histopathology Whole Slide Image Classification
  ⭐code
- Contrastive Test-Time Adaptation
  🏠project
Few-Shot Classification
- CAD: Co-Adapting Discriminative Features for Improved Few-Shot Classification
- Matching Feature Sets for Few-Shot Image Classification
  ⭐code🏠project📺video
- Joint Distribution Matters: Deep Brownian Distance Covariance for Few-Shot Classification
  😮oral⭐code🏠project📰here
- Learning to Affiliate: Mutual Centralized Learning for Few-shot Classification
  📰here
- Generating Representative Samples for Few-Shot Classification
  ⭐code
  📰here
  SOTA.
- Few-Shot Classification & Segmentation(FS-CS)
  - Integrative Few-Shot Learning for Classification and Segmentation
Long Tail Recognition
Fine-Grained Identification
- Knowledge Mining with Scene Text for Fine-Grained Recognition
  ⭐code📰here

24.Super-Resolution

23.Image Retrieval

Sketching without Worrying: Noise-Tolerant Sketch-Based Image Retrieval
⭐code
Correlation Verification for Image Retrieval
😮oral⭐code
Sketch3T: Test-Time Training for Zero-Shot SBIR
Beyond Cross-view Image Retrieval: Highly Accurate Vehicle Localization Using Satellite Image
Text-Video Retrieval
- X-Pool: Cross-Modal Language-Video Attention for Text-Video Retrieval
  🏠project
  📰X-Pool
- Bridging Video-text Retrieval with Multiple Choice Questions
  ⭐code
  📰BridgeFormer
Cross-Modal Retrieval
- ViSTA: Vision and Scene Text Aggregation for Cross-Modal Retrieval

22.Image Synthesis/Generation

Interactive Image Synthesis with Panoptic Layout Generation
Autoregressive Image Generation using Residual Quantization
⭐code📰here
GIRAFFE HD: A High-Resolution 3D-aware Generative Model
Arbitrary-Scale Image Synthesis
⭐code📰here
Multi-View Consistent Generative Adversarial Networks for 3D-aware Image Synthesis
⭐code📰here
Learning to Memorize Feature Hallucination for One-Shot Image Generation
📰here
Text-Guided Image processing
- ManiTrans: Entity-Level Text-Guided Image Manipulation via Token-wise Semantic Alignment and Generation
  😮oral🏠project
Pose-Guided Image Synthesis
- Exploring Dual-task Correlation for Pose Guided Person Image Generation
  ⭐code📰here
Text-To-Image Synthesis
Image Translation
- FlexIT: Towards Flexible Semantic Image Translation
- A Style-aware Discriminator for Controllable Image Translation
Image Generation

21.UAV/Remote Sensing/Satellite Image

Remote Sensing Image Fusion
- HyperTransformer: A Textural and Spectral Feature Fusion Transformer for Pansharpening
  ⭐code📰brief analysis
Aerial Image Segmentation
- Revisiting Near/Remote Sensing with Geospatial Attention

20.Autonomous vehicles

Autonomous Driving
Lane Line Detection
- Rethinking Efficient Lane Detection via Curve Modeling
  ⭐code📰here
  📓
- Towards Driving-Oriented Metric for Lane Detection Models
- A Keypoint-based Global Association Network for Lane Detection
  ⭐code📰here
- Monocular 3D Lane Detection
  - ONCE-3DLanes: Building Monocular 3D Lane Detection
    ⭐code
Lane Line Description
- Eigenlanes: Data-Driven Lane Descriptors for Structurally Diverse Lanes
  ⭐code
- CLRNet: Cross Layer Refinement Network for Lane Detection
  📰here
Behavior Prediction
- 🐦️JRDB-Act: A Large-scale Dataset for Spatio-temporal Action, Social Group and Activity Detection
Autonomous Driving Scene Relighting
- SIMBAR: Single Image-Based Scene Relighting For Effective Data Augmentation For Automated Driving Vision Tasks
  🏠project

👉🏼 You can learn about Autonomous Vehicles through my course: https://courses.thinkautonomous.ai/self-driving-cars

19.Neural Architecture Search

18.Person Re-Identification

Reid
Crowd Counting
Pedestrian Detection
- STCrowd: A Multimodal Dataset for Pedestrian Perception in Crowded Scenes
  ⭐code
Gait Recognition
- Gait Recognition in the Wild with Dense 3D Representations and A Benchmark
  ⭐code🏠project
  📰paper
Person Search
- PSTR: End-to-End One-Step Person Search With Transformers
  ⭐code

17.Medical Image

16.Semi/self-supervised learning

15.Transformer

14.Video

Stochastic Backpropagation: A Memory Efficient Strategy for Training Video Models
😮oral
Action Segmentation
- Unsupervised Activity Segmentation by Joint Representation Learning and Online Clustering
  📺video
- Weakly-Supervised Online Action Segmentation in Multi-View Instructional Videos
Action Understanding
- How Do You Do It? Fine-Grained Action Understanding with Pseudo-Adverbs
- Bridge-Prompt: Towards Ordinal Action Understanding in Instructional Videos
  ⭐code
Video Copy Detection
- A Large-scale Comprehensive Dataset and Copy-overlap Aware Evaluation Protocol for Segment-level Video Copy Detection
  ⭐code
Video Synthesis
- Show Me What and Tell Me How: Video Synthesis via Multimodal Conditioning
  ⭐code
- 3D Moments from Near-Duplicate Photos
  🏠project
Video Anomaly Detection
- Generative Cooperative Learning for Unsupervised Video Anomaly Detection
- Bayesian Nonparametric Submodular Video Partition for Robust Anomaly Detection
Video Surveillance
- Trajectory Prediction
Video Moment Retrieval & Highlight Detection
- UMT: Unified Multi-modal Transformers for Joint Video Moment Retrieval and Highlight Detection
  ⭐code
- Learning Pixel-Level Distinctions for Video Highlight Detection
Video Moment Retrieval
- AxIoU: An Axiomatically Justified Measure for Video Moment Retrieval
Video Prediction
- STRPM: A Spatiotemporal Residual Predictive Model for High-Resolution Video Prediction
- Continual Predictive Learning from Videos
  😮oral⭐code
Video Individual Count
- DR.VIC: Decomposition and Reasoning for Video Individual Counting
  ⭐code
Frame Interpolation
- Many-to-many Splatting for Efficient Video Frame Interpolation
  ⭐code
- TimeReplayer: Unlocking the Potential of Event Cameras for Video Interpolation
- Long-term Video Frame Interpolation via Feature Propagation
- Time Lens++: Event-based Frame Interpolation with Parametric Non-linear Flow and Multi-scale Fusion
Visual Correspondence
- Locality-Aware Inter-and Intra-Video Reconstruction for Self-Supervised Correspondence Learning
  ⭐code
Video Recognition
- BEVT: BERT Pretraining of Video Transformers
  ⭐code📰Transformer
Video Classification
- Zero-Shot
  - Alignment-Uniformity aware Representation Learning for Zero-shot Video Classification
Video Prediction
- Hand Motion Prediction
  - Joint Hand Motion and Interaction Hotspots Prediction from Egocentric Videos
    🏠project📺video
Video Segmentation
- Modeling Motion with Multi-Modal Features for Text-Based Video Segmentation
  ⭐code
- VOS
  - Recurrent Dynamic Embedding for Video Object Segmentation
    ⭐code
- Video Instance Segmentation(VIS)
  - Efficient Video Instance Segmentation via Tracklet Query and Proposal
    🏠project📺video📰brief analysis
  - Temporally Efficient Vision Transformer for Video Instance Segmentation
    😮oral⭐code📰analysis
- Video Semantic Segmentation
  - Coarse-to-Fine Feature Mining for Video Semantic Segmentation
    ⭐code
- Video Panoptic Segmentation
  - Video K-Net: A Simple, Strong, and Unified Baseline for Video Segmentation
    😮oral⭐code📰analysis
Video Image Processing
- Super Resolution
- Video Recovery
  - Neural Global Shutter: Learn to Restore Video from a Rolling Shutter Camera with Global Reset Feature
    ⭐code
- Video Inpainting
  - Towards An End-to-End Framework for Flow-Guided Video Inpainting
- Video Demoiréing
  - Video Demoireing with Relation-Based Temporal Consistency
    🏠project📺video
- Deblur
  - Multi-Scale Memory-Based Video Deblurring
- Denoising
  - Dancing under the stars: video denoising in starlight
    ⭐code
- Movie Restoration
  - Bringing Old Films Back to Life
    ⭐code
Representation Learning (Video)
- TransRank: Self-supervised Video Representation Learning via Ranking-based Transformation Recognition
  😮oral⭐code📰here
- Self-Supervised Representation Learning
- Video Contrastive Learning
  - Probabilistic Representations for Video Contrastive Learning
Video Decomposition
- Deformable Sprites for Unsupervised Video Decomposition
  😮oral🏠project
Shadow Detection
- Video Shadow Detection via Spatio-Temporal Interpolation Consistency Training
  ⭐code
Frame Interpolation
- IFRNet: Intermediate Feature Refine Network for Efficient Frame Interpolation
  📰here
- Video Frame Interpolation with Transformer
  ⭐code
  📰here
VSS
- Scene Consistency Representation Learning for Video Scene Segmentation
  ⭐code
  📰here
VSR
- Spatial-Temporal Space Hand-in-Hand: Spatial-Temporal Video Super-Resolution via Cycle-Projected Mutual Learning
  ⭐code
  📰here
Video Reconstruction
- Context-Aware Video Reconstruction for Rolling Shutter Cameras
  ⭐code📰here

13.GAN

🐦️HyperInverter: Improving StyleGAN Inversion via Hypernetwork
🏠project
InsetGAN for Full-Body Image Generation
🏠project
📰1024x1024 Inset GAN
Commonality in Natural Images Rescues GANs: Pretraining GANs with Generic and Privacy-free Synthetic Data
⭐code
Deep Image-based Illumination Harmonization
GAN-Supervised Dense Visual Alignment
😮oral⭐code🏠project📺video
📰CVPR 2022 Oral: GAN
HairMapper: Removing Hair from Portraits Using GANs
⭐code
Image Tampering Detection
- Proactive Image Manipulation Detection
  ⭐code
Hair Editing
- HairCLIP: Design Your Hair by Text and Reference Image
  ⭐code

12.Image-to-Image Translation

11.Face

Protecting Celebrities with Identity Consistency Transformer
Deepfake
- Voice-Face Homogeneity Tells Deepfake
  ⭐code📰here
Makeup Transfer
- Protecting Facial Privacy: Generating Adversarial Identity Masks via Style-robust Makeup Transfer
Face Recognition
- Local-Adaptive Face Recognition via Graph-based Meta-Clustering and Regularized Adaptation
- Killing Two Birds with One Stone: Efficient and Robust Training of Face Recognition CNNs by Partial FC
  ⭐code
- AdaFace: Quality Adaptive Margin for Face Recognition
  😮oral⭐code
Facial Expression Recognition
- Towards Semi-Supervised Deep Facial Expression Recognition with An Adaptive Confidence Margin
  ⭐code
3D Face Modeling
- ImFace: A Nonlinear 3D Morphable Face Model with Implicit Neural Representations
- Learning to Restore 3D Face from In-the-Wild Degraded Images
  📰here
Liveness Detection
- PatchNet: A Simple Face Anti-Spoofing Framework via Fine-Grained Patch Recognition
DeepFake Detection
- Exploring Frequency Adversarial Attacks for Face Forgery Detection
  📰here
Face Swap
- High-resolution Face Swapping via Latent Semantics Disentanglement
  ⭐code
Face Attribute Classification
- Fair Contrastive Learning for Facial Attribute Classification
  ⭐code
Face Relighting
- Face Relighting with Geometrically Consistent Shadows
Face Editing
- TransEditor: Transformer-Based Dual-Space GAN for Highly Controllable Facial Editing
  ⭐code🏠project
Face Hallucination
- Escaping Data Scarcity for High-Resolution Heterogeneous Face Hallucination
Deepfake
- Detecting Deepfakes with Self-Blended Images
  😮oral⭐code
Face Reconstruction
- JIFF: Jointly-aligned Implicit Face Function for High Quality Single View Clothed Human Reconstruction
  ⭐code🏠project📰here
Face Capture
- EMOCA: Emotion Driven Monocular Face Capture and Animation
  🏠project
Head Swapping
- Few-Shot Head Swapping in the Wild
  😮oral⭐code🏠project📺video📰here
Portrait Distortion Correction
- Semi-Supervised Wide-Angle Portraits Correction by Multi-Scale Transformer
  ⭐code📰here
3D Face Modeling
- Physically-guided Disentangled Implicit Rendering for 3D Face Modeling
  📰here
Face Restoration
- Blind Face Restoration via Integrating Face Shape and Generative Priors
  📰here

10.3D Vision

Disentangled3D: Learning a 3D Generative Model with Disentangled Geometry and Appearance from Monocular Images
Depth-Guided Sparse Structure-from-Motion for Movies and TV Shows
⭐code
3D-SPS: Single-Stage 3D Visual Grounding via Referred Point Progressive Selection
😮oral⭐code📰here
Stereo Merging
Depth Estimation
Room Layout
- LGT-Net: Indoor Panoramic Room Layout Estimation with Geometry-Aware Transformer Network
  ⭐code📰here
MVS
3D Reconstruction
- PlaneMVS: 3D Plane Reconstruction from Multi-View Stereo
- Self-supervised Neural Articulated Shape and Appearance Models
  🏠project
- BNV-Fusion: Dense 3D Reconstruction using Bi-level Neural Volume Fusion
- Topologically-Aware Deformation Fields for Single-View 3D Reconstruction
  ⭐code🏠project
- Pre-train, Self-train, Distill: A simple recipe for Supersizing 3D Reconstruction
  ⭐code🏠project📰here
- What's in your hands? 3D Reconstruction of Generic Objects in Hands
  ⭐code🏠project📺video📰here
- Surface Reconstruction from Point Clouds by Learning Predictive Context Priors
  ⭐code
- FvOR: Robust Joint Shape and Pose Optimization for Few-view Object Reconstruction
  ⭐code
  📰here
- 3D Scene Reconstruction
  - Neural 3D Scene Reconstruction with the Manhattan-world Assumption
    😮oral⭐code🏠project📺video📰here
- Hand Reconstruction
  - Collaborative Learning for Hand and Object Reconstruction with Attention-guided Graph Convolution
- 3D Garment Mesh Reconstruction
  - Registering Explicit to Implicit: Towards High-Fidelity Garment mesh Reconstruction from Single Images
    🏠project
  - Photorealistic Monocular 3D Reconstruction of Humans Wearing Clothing
    🏠project
3D Shape Reconstruction
- 3D Shape Reconstruction from 2D Images with Disentangled Attribute Flow
- GIFS: Neural Implicit Function for General Shape Representation
  🏠project
3D Garment Deformation
- SNUG: Self-Supervised Neural Dynamic Garments
  😮oral⭐code
Texture Transfer & Synthesis
- AUV-Net: Learning Aligned UV Maps for Texture Transfer and Synthesis
  ⭐code🏠project📺video
Shape Matching
- A Scalable Combinatorial Solver for Elastic Geometrically Consistent 3D Shape Matching
  ⭐code
- Deep Orientation-Aware Functional Maps: Tackling Symmetry Issues in Shape Matching
  ⭐code

👉🏼 You can learn about Stereo & 3D Vision here: https://courses.thinkautonomous.ai/stereo-vision

9.Human Pose Estimation

COAP: Compositional Articulated Occupancy of People
⭐code🏠project📺video📰here
Context-Aware Sequence Alignment using 4D Skeletal Augmentation
😮oral⭐code🏠project
Multi-Person Pose Estimation
- Learning Local-Global Contextual Adaptation for Multi-Person Pose Estimation
Video-Based HPE
- Temporal Feature Alignment and Mutual Information Maximization for Video-Based Human Pose Estimation
  😮oral⭐code
3D pose
4D Human Capture
- H4D: Human 4D Modeling by Learning Neural Compositional Representation
Gesture Generation
- Learning Hierarchical Cross-Modal Association for Co-Speech Gesture Generation
3D Hand Mesh Estimation
- HandOccNet: Occlusion-Robust 3D Hand Mesh Estimation Network
3D Shape Generation
- Towards Implicit Text-Guided 3D Shape Generation
- 3D Dog Shape
  - BARC: Learning to Regress 3D Dog Shape from Images by Exploiting Breed Information
    🏠project
Motion Capture
- Neural MoCon: Neural Motion Control for Physically Plausible Human Motion Capture
  🏠project
Arm-Hand Dynamic Estimation
- Spatial-Temporal Parallel Transformer for Arm-Hand Dynamic Estimation
3D Hand reconstruction
- LISA: Learning Implicit Shape and Appearance of Hands
  🏠project
3D Body Shape
- OSSO: Obtaining Skeletal Shape from Outside
  ⭐code🏠project📺video📰here
Dense correspondence
- BodyMap: Learning Full-Body Dense Correspondence Map
  🏠project
3D Human Motion Reconstruction
- Differentiable Dynamics for Articulated 3d Human Motion Reconstruction
3D Human Pose Reconstruction
- Trajectory Optimization for Physics-Based Reconstruction of 3d Human Pose from Monocular Video

8.Action Detection

Action Detection
- Colar: Effective and Efficient Online Action Detection by Consulting Exemplars
- Learnable Irrelevant Modality Dropout for Multimodal Action Recognition on Modality-Specific Annotated Videos
- End-to-End Semi-Supervised Learning for Video Action Detection
- SPAct: Self-supervised Privacy Preservation for Action Recognition
  ⭐code
- Temporal Alignment Networks for Long-term Video
  😮oral⭐code🏠project📰paper
- SOS! Self-supervised Learning Over Sets Of Handled Objects In Egocentric Action Recognition
- Zero-Shot Action Recognition
  - Cross-modal Representation Learning for Zero-shot Action Recognition
    ⭐code
- Few-Shot Action Recognition
  - Hybrid Relation Guided Set Matching for Few-shot Action Recognition
    ⭐code📰here
- Temporal Action Detection
  - An Empirical Study of End-to-End Temporal Action Detection
    ⭐code📰paper
Temporal Action Localization
Repetitive Action Counting
- TransRAC: Encoding Multi-scale Temporal Correlation with Transformers for Repetitive Action Counting
  😮oral⭐code🏠project
Group Activity Recognition
- Dual-AI: Dual-path Action Interaction Learning for Group Activity Recognition
  😮oral
- Detector-Free Weakly Supervised Group Activity Recognition
Action Quality Assessment
- FineDiving: A Fine-grained Dataset for Procedure-aware Action Quality Assessment
  😮oral⭐code🏠project📰here

7.Point Clouds

Shape-invariant 3D Adversarial Point Clouds
⭐code
AziNorm: Exploiting the Radial Symmetry of Point Cloud for Azimuth-Normalized 3D Perception
REGTR: End-to-end Point Cloud Correspondences with Transformers
⭐code
Equivariant Point Cloud Analysis via Learning Orientations for Message Passing
⭐code
Text2Pos: Text-to-Point-Cloud Cross-Modal Localization
Deformation and Correspondence Aware Unsupervised Synthetic-to-Real Scene Flow Estimation for Point Clouds
⭐code
Self-Supervised Arbitrary-Scale Point Clouds Upsampling via Implicit Neural Representation
⭐code📰paper
3DeformRS: Certifying Spatial Deformations on Point Clouds
⭐code
Reconstructing Surfaces for Sparse Point Clouds with On-Surface Priors
⭐code📰paper
Density-preserving Deep Point Cloud Compression
⭐code🏠project📰paper
Surface Representation for Point Clouds
😮oral⭐code
📰paper
3D Point Cloud
- CrossPoint: Self-Supervised Cross-Modal Contrastive Learning for 3D Point Cloud Understanding
  ⭐code📰paper
  CrossPoint, 3D...
- A Unified Query-based Paradigm for Point Cloud Understanding
- WarpingGAN: Warping Multiple Uniform Priors for Adversarial 3D Point Cloud Generation
  ⭐code
- 3D Point Cloud Segmentation
  - Stratified Transformer for 3D Point Cloud Segmentation
    ⭐code
Point Cloud Classification
- ART-Point: Improving Rotation Robustness of Point Cloud Classifiers via Adversarial Rotation
  ⭐code📰paper 📓
Point Cloud Registration
- SC^2-PCR: A Second Order Spatial Compatibility for Efficient and Robust Point Cloud Registration
  ⭐code
  📰(https://mp.weixin.qq.com/s/pOVgC4nvE4YCxe3hyLmkGA)
Point Cloud Completion
Point Cloud Segmentation
- Contrastive Boundary Learning for Point Cloud Segmentation
  ⭐code📰paper
- SemAffiNet: Semantic-Affine Transformation for Point Cloud Segmentation
  ⭐code📰paper
Scene Flow Estimation
- RCP: Recurrent Closest Point for Scene Flow Estimation on 3D Point Clouds

👉🏼 You can learn more about point clouds through my intro-course https://courses.thinkautonomous.ai/point-clouds

6.Object Tracking

TCTrack: Temporal Contexts for Aerial Tracking
⭐code📰paper
Correlation-Aware Deep Tracking
Global Tracking Transformers
⭐code
Unified Transformer Tracker for Object Tracking
⭐code
Global Tracking via Ensemble of Local Trackers
Unsupervised Learning of Accurate Siamese Tracking
⭐code
Transformer Tracking with Cyclic Shifting Window Attention
⭐code
Transformer, UAV123, LaSOT, TrackingNet, GOT-10k SOTA.
3D Object Tracking
- Beyond 3D Siamese Tracking: A Motion-Centric Paradigm for 3D Single Object Tracking in Point Clouds
  ⭐code📰paper
Multi-Object Tracking
- Learning of Global Objective for Network Flow in Multi-Object Tracking
- MeMOT: Multi-Object Tracking with Memory
  😮oral
RGB-T Tracking
- Visible-Thermal UAV Tracking: A Large-Scale Benchmark and New Baseline
  🏠project📰paper
Visual Tracking
- Ranking-Based Siamese Visual Tracking
  ⭐code📰paper

👉🏼 Learn more about object tracking through my course: https://courses.thinkautonomous.ai/obstacle-tracking

5.Object Detection

4.Image Captioning

3.Image Processing

Image Restoration
- Attentive Fine-Grained Structured Sparsity for Image Restoration
  ⭐code📰paper
Image Restoration
Image Stitching
- Deep Rectangling for Image Stitching: A Learning Baseline
  😮oral⭐code📰paper
Motion Deblur
- Unifying Motion Deblurring and Frame Interpolation with Events
Image Outpainting
- Diverse Plausible 360-Degree Image Outpainting for Efficient 3DCG Background Creation
  🏠project
Image Aesthetic Evaluation
- Personalized Image Aesthetics Assessment with Rich Attributes
  🏠project
Image Quality Assessment
- Incorporating Semi-Supervised and Positive-Unlabeled Learning for Boosting Full Reference Image Quality Assessment
  ⭐code📰paper
Image Deraining
- Towards Robust Rain Removal Against Adversarial Attacks: A Comprehensive Benchmark Analysis and Beyond
  ⭐code
Image Deblurring
- Learning to Deblur using Light Field Generated and Real Defocus Images
  ⭐code🏠project
Image Denoising
- CVF-SID: Cyclic multi-Variate Function for Self-Supervised Image Denoising by Disentangling Noise from Image
  ⭐code
- NAN: Noise-Aware NeRFs for Burst-Denoising
Image Enhancement
Image Harmonization
- SCS-Co: Self-Consistent Style Contrastive Learning for Image Harmonization
  ⭐code
Image Outpainting
- Scene Graph Expansion for Semantics-Guided Image Outpainting
Semantic Image Matching
- TransforMatcher: Match-to-Match Attention for Semantic Correspondence
  📰paper

2.Image Segmentation

FocalClick: Towards Practical Interactive Image Segmentation
⭐code📰paper
Semantic-Aware Domain Generalized Segmentation
😮oral⭐code
ReSTR: Convolution-free Referring Image Segmentation Using Transformers
Panoptic Neural Fields: A Semantic Object-Aware Neural Scene Representation
🏠project
Instance Segmentation
Semantic Segmentation
Action Segmentation
- Weakly-Supervised Online Action Segmentation in Multi-View Instructional Videos
Scene Parsing
- FLOAT: Factorized Learning of Object Attributes for Improved Multi-object Multi-part Scene Parsing
  ⭐code
Foggy Scene Segmentation
- FIFO: Learning Fog-invariant Features for Foggy Scene Segmentation
  😮oral
Panoptic Segmentation
Matting
- Human Instance Matting via Mutual Guidance and Multi-Instance Refinement
  😮oral⭐code

👉🏼 Learn Image Segmentation here: https://courses.thinkautonomous.ai/image-segmentation

1.Other

Papers Not Yet Published (not translated)

Camera Relocation
- ❌[SceneSqueezer: Learning to Compress Scene for Camera Relocalization]
  😮oral
Camera Imaging
- ❌[Learning to Zoom Inside Camera Imaging Pipeline]
Homography Estimation
- ❌[Unsupervised Homography Estimation with Coplanarity-Aware GAN]
  ⭐code📰analysis
3D Human Reconstruction
- ❌[Putting People in their Place: Monocular Regression of 3D People in Depth]
  ⭐code📰analysis
Image Captioning
- ❌[Comprehending and Ordering Semantics for Image Captioning]
  📰analysis
Image Dehazing
- ❌[Self-augmented Unpaired Image Dehazing via Density and Depth Decomposition]
  📰analysis
Image-to-Image Translation
- ❌[Alleviating Semantics Distortion in Unsupervised Low-Level Image-to-Image Translation via Structure Consistency Constraint]
  📰analysis
Optical Flow
- ❌[Learning Optical Flow with Kernel Patch Attention]
  ⭐code📰analysis
Image Generation
- ❌[Modeling Image Composition for Complex Scene Generation]
  📰analysis
Continual Learning
- ❌[Continual Learning with Lifelong Vision Transformer]
  📰analysis
Meta-Learning
- ❌[Learning to Learn and Remember Super Long Multi-Domain Task Sequence]
  📰analysis
Object Detection
- ❌[Voxel Field Fusion for 3D Object Detection]
  📰analysis
- ❌[ISNet: Shape Matters for Infrared Small Target Detection]
  📰analysis
HOI
- ❌[Exploring Structure-aware Transformer over Interaction Proposals for Human-Object Interaction Detection]
  📰analysis
Video Modeling
- ❌[Stand-Alone Inter-Frame Attention in Video Models]
  📰analysis
Other
- ❌[RAGO: Recurrent Graph Optimizer For Multiple Rotation Averaging]
  ⭐code
- ❌[Learning to Collaborate in Decentralized Learning of Personalized Models]
  📰analysis
- ❌[MLP-3D: A MLP-like 3D Architecture with Grouped Time Mixing]
  📰analysis
Video Scene Segmentation
- ❌[Scene Consistency Representation Learning for Video Scene Segmentation]
  📰analysis
Image Captioning
- ❌[DIFNet: Boosting Visual Information Flow for Image Captioning]
  📰analysis
Pose
- ❌[Location-Free Human Pose Estimation]
  📰analysis
Few-Shot
- ❌[Ranking-Guided Distance Calibration for Cross-Domain Few-Shot Learning]
  📰analysis
- ❌[En-Compactness: Self-Distillation Embedding & Contrastive Generation for Generalized Zero-Shot Learning]
  📰analysis
Point Cloud
- ❌[Surface Representation for Point Clouds]
  📰analysis
- ❌[Deterministic Point Cloud Registration via Novel Transformation Decomposition]
  📰analysis
Face
- ❌[Evaluation-oriented Knowledge Distillation for Deep Face Recognition]
  📰analysis
- ❌[End-to-End Reconstruction-Classification Learning for Face Forgery Detection]
  📰analysis
Object Detection
- ❌[Thinking Camouflaged Object Detection in Frequency]
  📰analysis
Adversarial
- ❌[Efficient Data-free Model Stealing for Black-box Adversarial Attacks]
  📰analysis
Segmentation
- ❌[ISDNet: Integrating Shallow and Deep Networks for Efficient Ultra-high Resolution Segmentation]
  📰analysis
- ❌[HybridCR: Weakly-Supervised 3D Point Cloud Semantic Segmentation via Hybrid Contrastive Regularization]
  📰analysis
3D Scene
- ❌[Canonical Voting: Towards Robust Oriented Bounding Box Detection in 3D Scenes]
  📰paper
Pedestrian Trajectory Prediction
- ❌[Human Trajectory Prediction with Momentary Observation]
  📰paper

Source
[Two Systems in Thinking: Dual-System Transformer for Grounded Situation Recognition]
[Autoregressive Image Generation using Residual Quantization]
✔️Instance-wise Occlusion and Depth Orders in Natural Scenes
[Style Neophile: Constantly Seeking Novel Styles for Domain Generalization]
[ReSTR: Convolution-free Referring Image Segmentation Using Transformers]
[FIFO: Learning Fog-invariant Features for Foggy Scene Segmentation]
[TransforMatcher: Match-to-Match Attention for Semantic Correspondence]
[Reflection and Rotation Symmetry Detection via Equivariant Learning]
[Semi-supervised Semantic Segmentation with Error Localization Network]
[Future Transformer for Long-term Action Anticipation]
[Self-Taught Metric Learning without Labels]
✔️Fast Point Transformer
[Integrative Few-Shot Learning for Classification and Segmentation]
[Scene Painting via Semantic Image Synthesis]
[Detector-Free Weakly Supervised Group Activity Recognition]

Jeremy26/CVPR-2022-Papers-EN