Convolutional Neural Networks (a.k.a ConvNets or CNNs) are classes of neural networks that are mostly used for image recognition tasks.
-
AlexNet - Deep Convolutional Neural Networks: implementation, paper
-
VGG - Very Deep Convolutional Networks for Large Scale Image Recognition: implementation, paper
-
GoogLeNet(Inceptionv1) - Going Deeper with Convolutions: implementation, paper
-
ResNet - Deep Residual Learning for Image Recognition: implementation, annotated paper paper
-
ResNeXt - Aggregated Residual Transformations for Deep Neural Networks: implementation, annotated paper, paper
-
Xception - Deep Learning with Depthwise Separable Convolutions: implementation, annotated paper, paper
-
DenseNet - Densely Connected Convolutional Neural Networks: implementation, annotated paper, paper
-
MobileNetV1 - Efficient Convolutional Neural Networks for Mobile Vision Applications: implementation, annotated_paper, paper
-
MobileNetV2 - Inverted Residuals and Linear Bottlenecks: implementation annotated paper, paper
-
EfficientNet - Rethinking Model Scaling for Convolutional Neural Networks: implementation, annotated_paper, paper. See also EfficientNetV2
-
ConvNeXt - A ConvNet for the 2020s: implementation, annotated_paper, paper
-
RegNetY - Coming soon
-
ConvMixer - Coming soon
For more about ConvNets, check out this introductory notebook.
- Keras Applications
- Timm
- PyTorch Vision
- ML Tokyo
The implementations of ConvNets architectures contained in this repository are not optimized for training but rather to understand how those networks were designed, principal components that makes them and how they evolved overtime. LeNet-5(LeCunn, 1998) had 5 convolutional layers. AlexNet(Alex, 2012) had 9 convolutional layers. Few years later, Residual Networks(He, 2015) made the trends after showing that it's possible to train networks of over 100 layers. And in fact, residual networks are still one of the most widely used architecture across wide range of visual tasks and it impacted the design of other language architectures. Currently, there are lots going on such as visual attentions.
If you want to use ConvNets for solving a visual recognition tasks such as image classification or object detection, you can get up running quickly by getting the models (and their pretrained weights) from tools like Keras, TensorFlow Hub, PyTorch Vision, Timm, GluoCV, and OpenMML Lab.