Add Inception-ResNet v2 to application (keras-team#7753)

* Add Inception-ResNet v2 to application
* InceptionResNetV2 is not supported on CNTK (backend issues)
* Fewer layer names; remove dropout; update FAQ doc
* Docstrings; remove dropout_keep_prob in doc
* Format image_data_format as strings in doc; add test for channels_first
sauerburger · Sep 8, 2017 · 19862b0
1 parent b76571f

Showing 5 changed files with 560 additions and 56 deletions.
diff --git a/docs/templates/applications.md b/docs/templates/applications.md
@@ -14,6 +14,7 @@ Weights are downloaded automatically when instantiating a model. They are stored
 - [VGG19](#vgg19)
 - [ResNet50](#resnet50)
 - [InceptionV3](#inceptionv3)
+- [InceptionResNetV2](#inceptionresnetv2)
 - [MobileNet](#mobilenet)
 
 All of these architectures (except Xception and MobileNet) are compatible with both TensorFlow and Theano, and upon instantiation the models will be built according to the image data format set in your Keras configuration file at `~/.keras/keras.json`. For instance, if you have set `image_data_format=channels_last`, then any model loaded from this repository will get built according to the TensorFlow data format convention, "Height-Width-Depth".
@@ -172,6 +173,7 @@ model = InceptionV3(input_tensor=input_tensor, weights='imagenet', include_top=T
 | [VGG19](#vgg19) | 549 MB | 0.727 | 0.910 | 143,667,240 | 26
 | [ResNet50](#resnet50) | 99 MB | 0.759 | 0.929 | 25,636,712 | 168
 | [InceptionV3](#inceptionv3) | 92 MB | 0.788 | 0.944 | 23,851,784 | 159 |
+| [InceptionResNetV2](#inceptionresnetv2) | 215 MB | 0.804 | 0.953 | 55,873,736 | 572 |
 | [MobileNet](#mobilenet) | 17 MB | 0.665 | 0.871 | 4,253,864 | 88
 
 
@@ -194,17 +196,17 @@ and a top-5 validation accuracy of 0.945.
 
 Note that this model is only available for the TensorFlow backend,
 due to its reliance on `SeparableConvolution` layers. Additionally it only supports
-the data format "channels_last" (height, width, channels).
+the data format `'channels_last'` (height, width, channels).
 
 The default input size for this model is 299x299.
 
 ### Arguments
 
 - include_top: whether to include the fully-connected layer at the top of the network.
-- weights: one of `None` (random initialization) or "imagenet" (pre-training on ImageNet).
+- weights: one of `None` (random initialization) or `'imagenet'` (pre-training on ImageNet).
 - input_tensor: optional Keras tensor (i.e. output of `layers.Input()`) to use as image input for the model.
 - input_shape: optional shape tuple, only to be specified
-    if `include_top` is False (otherwise the input shape
+    if `include_top` is `False` (otherwise the input shape
     has to be `(299, 299, 3)`.
     It should have exactly 3 inputs channels,
     and width and height should be no smaller than 71.
@@ -214,19 +216,19 @@ The default input size for this model is 299x299.
     - `None` means that the output of the model will be
         the 4D tensor output of the
         last convolutional layer.
-    - `avg` means that global average pooling
+    - `'avg'` means that global average pooling
         will be applied to the output of the
         last convolutional layer, and thus
         the output of the model will be a 2D tensor.
-    - `max` means that global max pooling will
+    - `'max'` means that global max pooling will
         be applied.
 - classes: optional number of classes to classify images 
-    into, only to be specified if `include_top` is True, and 
+    into, only to be specified if `include_top` is `True`, and 
     if no `weights` argument is specified.
 
 ### Returns
 
-A Keras model instance.
+A Keras `Model` instance.
 
 ### References
 
@@ -249,19 +251,19 @@ keras.applications.vgg16.VGG16(include_top=True, weights='imagenet', input_tenso
 VGG16 model, with weights pre-trained on ImageNet.
 
 This model is available for both the Theano and TensorFlow backend, and can be built both
-with "channels_first" data format (channels, height, width) or "channels_last" data format (height, width, channels).
+with `'channels_first'` data format (channels, height, width) or `'channels_last'` data format (height, width, channels).
 
 The default input size for this model is 224x224.
 
 ### Arguments
 
 - include_top: whether to include the 3 fully-connected layers at the top of the network.
-- weights: one of `None` (random initialization) or "imagenet" (pre-training on ImageNet).
+- weights: one of `None` (random initialization) or `'imagenet'` (pre-training on ImageNet).
 - input_tensor: optional Keras tensor (i.e. output of `layers.Input()`) to use as image input for the model.
 - input_shape: optional shape tuple, only to be specified
-    if `include_top` is False (otherwise the input shape
-    has to be `(224, 224, 3)` (with `channels_last` data format)
-    or `(3, 224, 224)` (with `channels_first` data format).
+    if `include_top` is `False` (otherwise the input shape
+    has to be `(224, 224, 3)` (with `'channels_last'` data format)
+    or `(3, 224, 224)` (with `'channels_first'` data format).
     It should have exactly 3 inputs channels,
     and width and height should be no smaller than 48.
     E.g. `(200, 200, 3)` would be one valid value.
@@ -270,19 +272,19 @@ The default input size for this model is 224x224.
     - `None` means that the output of the model will be
         the 4D tensor output of the
         last convolutional layer.
-    - `avg` means that global average pooling
+    - `'avg'` means that global average pooling
         will be applied to the output of the
         last convolutional layer, and thus
         the output of the model will be a 2D tensor.
-    - `max` means that global max pooling will
+    - `'max'` means that global max pooling will
         be applied.
 - classes: optional number of classes to classify images 
-    into, only to be specified if `include_top` is True, and 
+    into, only to be specified if `include_top` is `True`, and 
     if no `weights` argument is specified.
-    
+
 ### Returns
 
-A Keras model instance.
+A Keras `Model` instance.
 
 ### References
 
@@ -305,19 +307,19 @@ keras.applications.vgg19.VGG19(include_top=True, weights='imagenet', input_tenso
 VGG19 model, with weights pre-trained on ImageNet.
 
 This model is available for both the Theano and TensorFlow backend, and can be built both
-with "channels_first" data format (channels, height, width) or "channels_last" data format (height, width, channels).
+with `'channels_first'` data format (channels, height, width) or `'channels_last'` data format (height, width, channels).
 
 The default input size for this model is 224x224.
 
 ### Arguments
 
 - include_top: whether to include the 3 fully-connected layers at the top of the network.
-- weights: one of `None` (random initialization) or "imagenet" (pre-training on ImageNet).
+- weights: one of `None` (random initialization) or `'imagenet'` (pre-training on ImageNet).
 - input_tensor: optional Keras tensor (i.e. output of `layers.Input()`) to use as image input for the model.
 - input_shape: optional shape tuple, only to be specified
-    if `include_top` is False (otherwise the input shape
-    has to be `(224, 224, 3)` (with `channels_last` data format)
-    or `(3, 224, 224)` (with `channels_first` data format).
+    if `include_top` is `False` (otherwise the input shape
+    has to be `(224, 224, 3)` (with `'channels_last'` data format)
+    or `(3, 224, 224)` (with `'channels_first'` data format).
     It should have exactly 3 inputs channels,
     and width and height should be no smaller than 48.
     E.g. `(200, 200, 3)` would be one valid value.
@@ -326,19 +328,19 @@ The default input size for this model is 224x224.
     - `None` means that the output of the model will be
         the 4D tensor output of the
         last convolutional layer.
-    - `avg` means that global average pooling
+    - `'avg'` means that global average pooling
         will be applied to the output of the
         last convolutional layer, and thus
         the output of the model will be a 2D tensor.
-    - `max` means that global max pooling will
+    - `'max'` means that global max pooling will
         be applied.
 - classes: optional number of classes to classify images 
-    into, only to be specified if `include_top` is True, and 
+    into, only to be specified if `include_top` is `True`, and 
     if no `weights` argument is specified.
-    
+
 ### Returns
 
-A Keras model instance.
+A Keras `Model` instance.
 
 
 ### References
@@ -362,20 +364,20 @@ keras.applications.resnet50.ResNet50(include_top=True, weights='imagenet', input
 ResNet50 model, with weights pre-trained on ImageNet.
 
 This model is available for both the Theano and TensorFlow backend, and can be built both
-with "channels_first" data format (channels, height, width) or "channels_last" data format (height, width, channels).
+with `'channels_first'` data format (channels, height, width) or `'channels_last'` data format (height, width, channels).
 
 The default input size for this model is 224x224.
 
 
 ### Arguments
 
 - include_top: whether to include the fully-connected layer at the top of the network.
-- weights: one of `None` (random initialization) or "imagenet" (pre-training on ImageNet).
+- weights: one of `None` (random initialization) or `'imagenet'` (pre-training on ImageNet).
 - input_tensor: optional Keras tensor (i.e. output of `layers.Input()`) to use as image input for the model.
 - input_shape: optional shape tuple, only to be specified
-    if `include_top` is False (otherwise the input shape
-    has to be `(224, 224, 3)` (with `channels_last` data format)
-    or `(3, 224, 224)` (with `channels_first` data format).
+    if `include_top` is `False` (otherwise the input shape
+    has to be `(224, 224, 3)` (with `'channels_last'` data format)
+    or `(3, 224, 224)` (with `'channels_first'` data format).
     It should have exactly 3 inputs channels,
     and width and height should be no smaller than 197.
     E.g. `(200, 200, 3)` would be one valid value.
@@ -384,19 +386,19 @@ The default input size for this model is 224x224.
     - `None` means that the output of the model will be
         the 4D tensor output of the
         last convolutional layer.
-    - `avg` means that global average pooling
+    - `'avg'` means that global average pooling
         will be applied to the output of the
         last convolutional layer, and thus
         the output of the model will be a 2D tensor.
-    - `max` means that global max pooling will
+    - `'max'` means that global max pooling will
         be applied.
 - classes: optional number of classes to classify images 
-    into, only to be specified if `include_top` is True, and 
+    into, only to be specified if `include_top` is `True`, and 
     if no `weights` argument is specified.
-    
+
 ### Returns
 
-A Keras model instance.
+A Keras `Model` instance.
 
 ### References
 
@@ -418,20 +420,20 @@ keras.applications.inception_v3.InceptionV3(include_top=True, weights='imagenet'
 Inception V3 model, with weights pre-trained on ImageNet.
 
 This model is available for both the Theano and TensorFlow backend, and can be built both
-with "channels_first" data format (channels, height, width) or "channels_last" data format (height, width, channels).
+with `'channels_first'` data format (channels, height, width) or `'channels_last'` data format (height, width, channels).
 
 The default input size for this model is 299x299.
 
 
 ### Arguments
 
 - include_top: whether to include the fully-connected layer at the top of the network.
-- weights: one of `None` (random initialization) or "imagenet" (pre-training on ImageNet).
+- weights: one of `None` (random initialization) or `'imagenet'` (pre-training on ImageNet).
 - input_tensor: optional Keras tensor (i.e. output of `layers.Input()`) to use as image input for the model.
 - input_shape: optional shape tuple, only to be specified
-    if `include_top` is False (otherwise the input shape
-    has to be `(299, 299, 3)` (with `channels_last` data format)
-    or `(3, 299, 299)` (with `channels_first` data format).
+    if `include_top` is `False` (otherwise the input shape
+    has to be `(299, 299, 3)` (with `'channels_last'` data format)
+    or `(3, 299, 299)` (with `'channels_first'` data format).
     It should have exactly 3 inputs channels,
     and width and height should be no smaller than 139.
     E.g. `(150, 150, 3)` would be one valid value.
@@ -440,19 +442,19 @@ The default input size for this model is 299x299.
     - `None` means that the output of the model will be
         the 4D tensor output of the
         last convolutional layer.
-    - `avg` means that global average pooling
+    - `'avg'` means that global average pooling
         will be applied to the output of the
         last convolutional layer, and thus
         the output of the model will be a 2D tensor.
-    - `max` means that global max pooling will
+    - `'max'` means that global max pooling will
         be applied.
 - classes: optional number of classes to classify images 
-    into, only to be specified if `include_top` is True, and 
+    into, only to be specified if `include_top` is `True`, and 
     if no `weights` argument is specified.
-    
+
 ### Returns
 
-A Keras model instance.
+A Keras `Model` instance.
 
 ### References
 
@@ -464,6 +466,62 @@ These weights are released under [the Apache License](https://github.com/tensorf
 
 -----
 
+## InceptionResNetV2
+
+
+```python
+keras.applications.inception_resnet_v2.InceptionResNetV2(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)
+```
+
+Inception-ResNet V2 model, with weights pre-trained on ImageNet.
+
+This model is available for both the Theano and TensorFlow backend (but not CNTK), and can be built both
+with `'channels_first'` data format (channels, height, width) or `'channels_last'` data format (height, width, channels).
+
+The default input size for this model is 299x299.
+
+
+### Arguments
+
+- include_top: whether to include the fully-connected layer at the top of the network.
+- weights: one of `None` (random initialization) or `'imagenet'` (pre-training on ImageNet).
+- input_tensor: optional Keras tensor (i.e. output of `layers.Input()`) to use as image input for the model.
+- input_shape: optional shape tuple, only to be specified
+    if `include_top` is `False` (otherwise the input shape
+    has to be `(299, 299, 3)` (with `'channels_last'` data format)
+    or `(3, 299, 299)` (with `'channels_first'` data format).
+    It should have exactly 3 inputs channels,
+    and width and height should be no smaller than 139.
+    E.g. `(150, 150, 3)` would be one valid value.
+- pooling: Optional pooling mode for feature extraction
+    when `include_top` is `False`.
+    - `None` means that the output of the model will be
+        the 4D tensor output of the
+        last convolutional layer.
+    - `'avg'` means that global average pooling
+        will be applied to the output of the
+        last convolutional layer, and thus
+        the output of the model will be a 2D tensor.
+    - `'max'` means that global max pooling will
+        be applied.
+- classes: optional number of classes to classify images 
+    into, only to be specified if `include_top` is `True`, and 
+    if no `weights` argument is specified.
+
+### Returns
+
+A Keras `Model` instance.
+
+### References
+
+- [Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning](https://arxiv.org/abs/1602.07261)
+
+### License
+
+These weights are released under [the Apache License](https://github.com/tensorflow/models/blob/master/LICENSE).
+
+-----
+
 ## MobileNet
 
 
@@ -492,9 +550,9 @@ The default input size for this model is 224x224.
 ### Arguments
 
 - input_shape: optional shape tuple, only to be specified
-    if `include_top` is False (otherwise the input shape
-    has to be `(224, 224, 3)` (with `channels_last` data format)
-    or (3, 224, 224) (with `channels_first` data format).
+    if `include_top` is `False` (otherwise the input shape
+    has to be `(224, 224, 3)` (with `'channels_last'` data format)
+    or (3, 224, 224) (with `'channels_first'` data format).
     It should have exactly 3 inputs channels,
     and width and height should be no smaller than 32.
     E.g. `(200, 200, 3)` would be one valid value.
@@ -511,7 +569,7 @@ The default input size for this model is 224x224.
 - include_top: whether to include the fully-connected
     layer at the top of the network.
 - weights: `None` (random initialization) or
-    `imagenet` (ImageNet weights)
+    `'imagenet'` (ImageNet weights)
 - input_tensor: optional Keras tensor (i.e. output of
     `layers.Input()`)
     to use as image input for the model.
@@ -520,20 +578,20 @@ The default input size for this model is 224x224.
     - `None` means that the output of the model
     will be the 4D tensor output of the
         last convolutional layer.
-    - `avg` means that global average pooling
+    - `'avg'` means that global average pooling
         will be applied to the output of the
         last convolutional layer, and thus
         the output of the model will be a
         2D tensor.
-    - `max` means that global max pooling will
+    - `'max'` means that global max pooling will
         be applied.
 - classes: optional number of classes to classify images
-    into, only to be specified if `include_top` is True, and
+    into, only to be specified if `include_top` is `True`, and
     if no `weights` argument is specified.
-    
+
 ### Returns
 
-A Keras model instance.
+A Keras `Model` instance.
 
 ### References
 

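The `pooling` argument documented in the hunks above selects how the last convolutional feature map is reduced when `include_top` is `False`. The following is a minimal pure-Python sketch of the two global pooling modes, applied to a single channels-last feature map stored as nested lists (height x width x channels). The function `global_pool` is hypothetical and only illustrates the arithmetic; it is not part of Keras, which performs the equivalent reduction on batched tensors.

```python
def global_pool(feature_map, mode=None):
    """Hypothetical helper (not a Keras API): reduce one H x W x C feature map.

    mode=None  -> return the feature map unchanged (3D per sample, 4D with a batch axis)
    mode='avg' -> per-channel mean over all spatial positions (1D per sample, 2D with a batch axis)
    mode='max' -> per-channel maximum over all spatial positions
    """
    if mode is None:
        return feature_map
    if mode not in ('avg', 'max'):
        raise ValueError("mode must be None, 'avg' or 'max', got %r" % (mode,))

    n_channels = len(feature_map[0][0])
    pooled = []
    for c in range(n_channels):
        # Collect channel c at every (row, column) position.
        values = [pixel[c] for row in feature_map for pixel in row]
        if mode == 'avg':
            pooled.append(sum(values) / float(len(values)))
        else:
            pooled.append(max(values))
    return pooled


# A tiny 2 x 2 feature map with 2 channels.
fmap = [[[1.0, 10.0], [2.0, 20.0]],
        [[3.0, 30.0], [6.0, 40.0]]]

avg_features = global_pool(fmap, 'avg')  # [3.0, 25.0]
max_features = global_pool(fmap, 'max')  # [6.0, 40.0]
```

With either pooling mode the spatial axes disappear, which is why the documentation describes the pooled model output as a 2D tensor (batch, channels).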
diff --git a/docs/templates/getting-started/faq.md b/docs/templates/getting-started/faq.md
@@ -418,6 +418,8 @@ Code and pre-trained weights are available for the following image classificatio
 - VGG19
 - ResNet50
 - Inception v3
+- Inception-ResNet v2
+- MobileNet v1
 
 They can be imported from the module `keras.applications`:
 
@@ -427,6 +429,8 @@ from keras.applications.vgg16 import VGG16
 from keras.applications.vgg19 import VGG19
 from keras.applications.resnet50 import ResNet50
 from keras.applications.inception_v3 import InceptionV3
+from keras.applications.inception_resnet_v2 import InceptionResNetV2
+from keras.applications.mobilenet import MobileNet
 
 model = VGG16(weights='imagenet', include_top=True)
 ```

diff --git a/keras/applications/__init__.py b/keras/applications/__init__.py
@@ -2,5 +2,6 @@
 from .vgg19 import VGG19
 from .resnet50 import ResNet50
 from .inception_v3 import InceptionV3
+from .inception_resnet_v2 import InceptionResNetV2
 from .xception import Xception
 from .mobilenet import MobileNet
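
As a rough illustration of the `input_shape` rule that the new InceptionResNetV2 documentation in this commit describes (a fixed 299x299 input when `include_top` is `True`; otherwise exactly 3 channels and a width and height of at least 139), here is a minimal pure-Python sketch. The helper name `check_input_shape` and its signature are hypothetical and not part of Keras, and the validation performed inside the library may differ in detail.

```python
DEFAULT_SIZE = 299  # default input size stated in the documentation
MIN_SIZE = 139      # minimum width and height stated in the documentation


def check_input_shape(input_shape, include_top, data_format='channels_last'):
    """Hypothetical helper (not a Keras API): validate a shape tuple
    against the rule documented for InceptionResNetV2."""
    if data_format not in ('channels_last', 'channels_first'):
        raise ValueError('Unknown data_format: %r' % (data_format,))

    if data_format == 'channels_last':
        default_shape = (DEFAULT_SIZE, DEFAULT_SIZE, 3)
    else:
        default_shape = (3, DEFAULT_SIZE, DEFAULT_SIZE)

    if include_top:
        # With the classifier on top, only the default shape is accepted.
        if input_shape is not None and tuple(input_shape) != default_shape:
            raise ValueError('With include_top=True, input_shape must be %r'
                             % (default_shape,))
        return default_shape

    if input_shape is None:
        # Simplifying assumption of this sketch: fall back to the default shape.
        return default_shape

    if len(input_shape) != 3:
        raise ValueError('input_shape must have exactly three entries')

    if data_format == 'channels_last':
        height, width, channels = input_shape
    else:
        channels, height, width = input_shape

    if channels != 3:
        raise ValueError('input_shape must have exactly 3 channels')
    if height < MIN_SIZE or width < MIN_SIZE:
        raise ValueError('width and height must be at least %d' % MIN_SIZE)
    return tuple(input_shape)


# The documented example shape is accepted without the top classifier.
valid = check_input_shape((150, 150, 3), include_top=False)
```

A shape such as `(100, 100, 3)` would be rejected by this sketch because both sides fall below the documented minimum of 139.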