gompaang/computer_vision_paper

computer_vision_paper

paper_list

  • Image Recognition (CNN)
    • image classification
    • object detection
    • semantic segmentation
  • Vision Transformer
  • Self-supervised learning
  • Weakly-supervised learning
  • Depth Estimation
  • Vision Language Model
  • Medical AI
    • classification
    • segmentation

Image Recognition (CNN)

image classification

Name year paper summary code
AlexNet (ImageNet Classification with Deep Convolutional Neural Networks) NeurIPS 2012 paper notion code
VGGNet (Very Deep Convolutional Networks For Large-Scale Image Recognition) ICLR 2015 paper notion code
ResNet (Deep Residual Learning for Image Recognition) CVPR 2016 paper notion code
SENet (Squeeze-and-Excitation Networks) CVPR 2018 paper notion

object detection

Name year paper summary code
R-CNN (Rich feature hierarchies for accurate object detection and semantic segmentation) CVPR 2014 paper notion
Fast R-CNN 2015 paper notion
Faster R-CNN (Towards Real-Time Object Detection with Region Proposal Networks) NIPS 2015 paper notion
YOLO (You Only Look Once: Unified, Real-Time Object Detection) 2016 paper notion
SSD (Single Shot MultiBox Detector) 2016 paper notion

semantic segmentation

Name year paper summary code
FCN (Fully Convolutional Networks for Semantic Segmentation) CVPR 2015 paper tistory
U-Net (Convolutional Networks for Biomedical Image Segmentation) MICCAI 2015 paper tistory code
SegNet (A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation) 2015 paper tistory
DeepLab v1 (Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs) ICLR 2015 paper
DeepLab v2 (DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs) TPAMI 2017 paper
DeepLab v3 (Rethinking Atrous Convolution for Semantic Image Segmentation) 2018 paper
DeepLab v3+ (Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation) ECCV 2018 paper
PSPNet (Pyramid Scene Parsing Network) CVPR 2017 paper

Vision Transformer (ViT)

Name year paper summary code
ViT (An Image Is Worth 16x16 Words: Transformers For Image Recognition At Scale) ICLR 2021 paper tistory
Swin Transformer (Hierarchical Vision Transformer using Shifted Windows) ICCV 2021 paper
MLP-Mixer (An all-MLP Architecture for Vision) 2021 paper
MaxViT (MaxViT: Multi-Axis Vision Transformer) 2022 paper

Self-supervised learning

Name year paper summary code
Context Prediction (Unsupervised Visual Representation Learning by Context Prediction) ICCV 2015 paper
JigSaw (Unsupervised Learning of Visual Representations by Solving Jigsaw Puzzles) 2016 paper
Colorizations (Colorful Image Colorization) ECCV 2016 paper
Rotations (Unsupervised Representation Learning By Predicting Image Rotations) ICLR 2018 paper
SimCLR (A Simple Framework for Contrastive Learning of Visual Representations) ICML 2020 paper tistory
MoCo (Momentum Contrast for Unsupervised Visual Representation Learning) CVPR 2020 paper tistory
BYOL (Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning) NeurIPS 2020 paper tistory
DINO (Emerging Properties in Self-supervised Vision Transformers) 2021 paper tistory
SimCLR v2 (Big Self-Supervised Models are Strong Semi-Supervised Learners) NeurIPS 2020 paper tistory
MoCo v2 (Improved Baselines with Momentum Contrastive Learning) 2020 paper tistory
MoCo v3 (An Empirical Study of Training Self-Supervised Vision Transformers) ICCV 2021 paper tistory
SimSiam (Exploring Simple Siamese Representation Learning) CVPR 2021 paper tistory official
MAE (Masked Autoencoders Are Scalable Vision Learners) CVPR 2022 paper tistory official
SimMIM (SimMIM: A Simple Framework for Masked Image Modeling) CVPR 2022 paper tistory
What Do Self-supervised Vision Transformers Learn? ICLR 2023 paper tistory

Weakly-supervised learning

Name year paper summary code
CAM (Learning Deep Features for Discriminative Localization) CVPR 2016 paper tistory
DSRG (Weakly-Supervised Semantic Segmentation Network with Deep Seeded Region Growing) CVPR 2018 paper tistory
SEAM (Self-supervised Equivariant Attention Mechanism for Weakly Supervised Semantic Segmentation) CVPR 2020 paper tistory
Learning pseudo labels for semi-and-weakly supervised semantic segmentation 2022 paper tistory

Depth Estimation

Name year paper summary code
Depth Map Prediction from a Single Image using a Multi-Scale Deep Network NeurIPS 2014 paper tistory
Predicting depth, surface normals and semantic labels with a common multi-scale convolutional architecture ICCV 2015 paper tistory
Deeper Depth Prediction with Fully Convolutional Residual Networks 3DV 2016 paper tistory
Single-Image Depth Perception in the Wild NeurIPS 2016 paper tistory
Deep Ordinal Regression Network for Monocular Depth Estimation CVPR 2018 paper
Joint Task-Recursive Learning for Semantic Segmentation and Depth Estimation ECCV 2018 paper
Unsupervised Learning of Depth and Ego-Motion from Video CVPR 2017 paper
Unsupervised Monocular Depth Estimation with Left-Right Consistency CVPR 2017 paper
Digging Into Self-Supervised Monocular Depth Estimation ICCV 2019 paper

Vision Language Model (VLM)

Name year paper summary code
CLIP (Learning Transferable Visual Models From Natural Language Supervision) 2021 paper tistory
CoCoOp (Conditional Prompt Learning for Vision-Language Models) CVPR 2022 paper
Flamingo (Flamingo: a Visual Language Model for Few-Shot Learning) NeurIPS 2022 paper

Medical AI

classification

Name year paper summary code
MICLe (Big Self-Supervised Models Advance Medical Image Classification) ICCV 2021 paper tistory

segmentation

Name year paper summary code
U-Net (Convolutional Networks for Biomedical Image Segmentation) MICCAI 2015 paper tistory code
TransUNet (Transformers Make Strong Encoders for Medical Image Segmentation) 2021 paper tistory
UNETR (UNETR: Transformers for 3D Medical Image Segmentation) 2021 paper tistory
Swin-Unet (Swin-Unet: Unet-like Pure Transformer for Medical Image Segmentation) 2021 paper
TransBTS (Multimodal Brain Tumor Segmentation Using Transformer) 2021 paper
Self-Supervised Pre-Training of Swin Transformers for 3D Medical Image Analysis CVPR 2022 paper
