- Image Recognition (CNN)
  - image classification
  - object detection
  - semantic segmentation
- Vision Transformer
- Self-supervised learning
- Weakly-supervised learning
- Depth Estimation
- Vision Language Model
- medical AI
  - classification
  - segmentation

## Image Recognition (CNN)

### image classification

Name | year | paper | summary | code |
---|---|---|---|---|
AlexNet (ImageNet Classification with Deep Convolutional Neural Networks) | NeurIPS 2012 | paper | notion | code |
VGGNet (Very Deep Convolutional Networks For Large-Scale Image Recognition) | ICLR 2015 | paper | notion | code |
ResNet (Deep Residual Learning for Image Recognition) | CVPR 2016 | paper | notion | code |
SENet (Squeeze-and-Excitation Networks) | CVPR 2018 | paper | notion | |

### object detection

Name | year | paper | summary | code |
---|---|---|---|---|
R-CNN (Rich feature hierarchies for accurate object detection and semantic segmentation) | CVPR 2014 | paper | notion | |
Fast R-CNN | 2015 | paper | notion | |
Faster R-CNN (Towards Real-Time Object Detection with Region Proposal Networks) | NeurIPS 2015 | paper | notion | |
YOLO (You Only Look Once: Unified, Real-Time Object Detection) | 2016 | paper | notion | |
SSD (Single Shot MultiBox Detector) | 2016 | paper | notion | |

### semantic segmentation

Name | year | paper | summary | code |
---|---|---|---|---|
FCN (Fully Convolutional Networks for Semantic Segmentation) | CVPR 2015 | paper | tistory | |
U-Net (Convolutional Networks for Biomedical Image Segmentation) | MICCAI 2015 | paper | tistory | code |
SegNet (A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation) | 2015 | paper | tistory | |
DeepLab v1 (Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs) | ICLR 2015 | paper | ||
DeepLab v2 (DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs) | TPAMI 2017 | paper | ||
DeepLab v3 (Rethinking Atrous Convolution for Semantic Image Segmentation) | 2017 | paper | ||
DeepLab v3+ (Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation) | ECCV 2018 | paper | ||
PSPNet (Pyramid Scene Parsing Network) | CVPR 2017 | paper | ||

## Vision Transformer

Name | year | paper | summary | code |
---|---|---|---|---|
ViT (An Image Is Worth 16x16 Words: Transformers For Image Recognition At Scale) | ICLR 2021 | paper | tistory | |
Swin Transformer (Hierarchical Vision Transformer using Shifted Windows) | ICCV 2021 | paper | ||
MLP-Mixer (An all-MLP Architecture for Vision) | 2021 | paper | ||
MaxViT (MaxViT: Multi-Axis Vision Transformer) | 2022 | paper | ||

## Self-supervised learning

Name | year | paper | summary | code |
---|---|---|---|---|
Context Prediction (Unsupervised Visual Representation Learning by Context Prediction) | ICCV 2015 | paper | ||
Jigsaw (Unsupervised Learning of Visual Representations by Solving Jigsaw Puzzles) | 2016 | paper | ||
Colorization (Colorful Image Colorization) | ECCV 2016 | paper | ||
Rotations (Unsupervised Representation Learning By Predicting Image Rotations) | ICLR 2018 | paper | ||
SimCLR (A Simple Framework for Contrastive Learning of Visual Representations) | ICML 2020 | paper | tistory | |
MoCo (Momentum Contrast for Unsupervised Visual Representation Learning) | CVPR 2020 | paper | tistory | |
BYOL (Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning) | NeurIPS 2020 | paper | tistory | |
DINO (Emerging Properties in Self-supervised Vision Transformers) | 2021 | paper | tistory | |
SimCLR v2 (Big Self-Supervised Models are Strong Semi-Supervised Learners) | NeurIPS 2020 | paper | tistory | |
MoCo v2 (Improved Baselines with Momentum Contrastive Learning) | 2020 | paper | tistory | |
MoCo v3 (An Empirical Study of Training Self-Supervised Vision Transformers) | ICCV 2021 | paper | tistory | |
SimSiam (Exploring Simple Siamese Representation Learning) | CVPR 2021 | paper | tistory | official |
MAE (Masked Autoencoders Are Scalable Vision Learners) | CVPR 2022 | paper | tistory | official |
SimMIM (SimMIM: A Simple Framework for Masked Image Modeling) | CVPR 2022 | paper | tistory | |
What Do Self-supervised Vision Transformers Learn? | ICLR 2023 | paper | tistory | |

## Weakly-supervised learning

Name | year | paper | summary | code |
---|---|---|---|---|
CAM (Learning Deep Features for Discriminative Localization) | CVPR 2016 | paper | tistory | |
DSRG (Weakly-Supervised Semantic Segmentation Network with Deep Seeded Region Growing) | CVPR 2018 | paper | tistory | |
SEAM (Self-supervised Equivariant Attention Mechanism for Weakly Supervised Semantic Segmentation) | CVPR 2020 | paper | tistory | |
Learning pseudo labels for semi-and-weakly supervised semantic segmentation | 2022 | paper | tistory | |

## Depth Estimation

Name | year | paper | summary | code |
---|---|---|---|---|
Depth Map Prediction from a Single Image using a Multi-Scale Deep Network | NeurIPS 2014 | paper | tistory | |
Predicting depth, surface normals and semantic labels with a common multi-scale convolutional architecture | ICCV 2015 | paper | tistory | |
Deeper Depth Prediction with Fully Convolutional Residual Networks | 3DV 2016 | paper | tistory | |
Single-Image Depth Perception in the Wild | NeurIPS 2016 | paper | tistory | |
Deep Ordinal Regression Network for Monocular Depth Estimation | CVPR 2018 | paper | ||
Joint Task-Recursive Learning for Semantic Segmentation and Depth Estimation | ECCV 2018 | paper | ||
Unsupervised Learning of Depth and Ego-Motion from Video | CVPR 2017 | paper | ||
Unsupervised Monocular Depth Estimation with Left-Right Consistency | CVPR 2017 | paper | ||
Digging Into Self-Supervised Monocular Depth Estimation | ICCV 2019 | paper | ||

## Vision Language Model

Name | year | paper | summary | code |
---|---|---|---|---|
CLIP (Learning Transferable Visual Models From Natural Language Supervision) | 2021 | paper | tistory | |
CoCoOp (Conditional Prompt Learning for Vision-Language Models) | CVPR 2022 | paper | ||
Flamingo (Flamingo: a Visual Language Model for Few-Shot Learning) | NeurIPS 2022 | paper | ||

## medical AI

### classification

Name | year | paper | summary | code |
---|---|---|---|---|
MICLe (Big Self-Supervised Models Advance Medical Image Classification) | ICCV 2021 | paper | tistory | |

### segmentation

Name | year | paper | summary | code |
---|---|---|---|---|
U-Net (Convolutional Networks for Biomedical Image Segmentation) | MICCAI 2015 | paper | tistory | code |
TransUNet (Transformers Make Strong Encoders for Medical Image Segmentation) | 2021 | paper | tistory | |
UNETR (UNETR: Transformers for 3D Medical Image Segmentation) | 2021 | paper | tistory | |
Swin-Unet (Swin-Unet: Unet-like Pure Transformer for Medical Image Segmentation) | 2021 | paper | ||
TransBTS (Multimodal Brain Tumor Segmentation Using Transformer) | 2021 | paper | ||
Self-Supervised Pre-Training of Swin Transformers for 3D Medical Image Analysis | CVPR 2022 | paper | ||