Add Inception-ResNet v2 to application (keras-team#7753)
* Add Inception-ResNet v2 to application

* InceptionResNetV2 is not supported on CNTK (backend issues)

* Fewer layer names; remove dropout; update FAQ doc

* Docstrings; remove dropout_keep_prob in doc

* Format image_data_format as strings in doc; add test for channels_first
myutwo150 authored and fchollet committed Sep 8, 2017
1 parent b76571f commit 19862b0
Showing 5 changed files with 560 additions and 56 deletions.
170 changes: 114 additions & 56 deletions docs/templates/applications.md
Expand Up @@ -14,6 +14,7 @@ Weights are downloaded automatically when instantiating a model. They are stored
- [VGG19](#vgg19)
- [ResNet50](#resnet50)
- [InceptionV3](#inceptionv3)
- [InceptionResNetV2](#inceptionresnetv2)
- [MobileNet](#mobilenet)

All of these architectures (except Xception and MobileNet) are compatible with both TensorFlow and Theano, and upon instantiation the models will be built according to the image data format set in your Keras configuration file at `~/.keras/keras.json`. For instance, if you have set `image_data_format=channels_last`, then any model loaded from this repository will get built according to the TensorFlow data format convention, "Height-Width-Depth".
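As a rough illustration of the configuration lookup described above, the sketch below reads `image_data_format` from a Keras-style JSON config using only the standard library. The helper name `read_image_data_format` and the fallback to `'channels_last'` when the file or key is missing are assumptions made for illustration; this is not Keras's own loading code.

```python
import json
import os


def read_image_data_format(config_path=None, default='channels_last'):
    """Return the `image_data_format` value from a Keras-style JSON config.

    Hypothetical helper: falls back to `default` when the file cannot be
    read, is not valid JSON, or does not contain the key.
    """
    if config_path is None:
        # Location of the Keras configuration file named in the text above.
        config_path = os.path.join(os.path.expanduser('~'), '.keras', 'keras.json')
    try:
        with open(config_path) as f:
            config = json.load(f)
    except (OSError, ValueError):
        # Missing/unreadable file or malformed JSON: use the assumed default.
        return default
    return config.get('image_data_format', default)
```

Calling `read_image_data_format()` with no arguments inspects `~/.keras/keras.json`; passing an explicit path makes the helper easy to try against a scratch file.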
Expand Down Expand Up @@ -172,6 +173,7 @@ model = InceptionV3(input_tensor=input_tensor, weights='imagenet', include_top=T
| [VGG19](#vgg19) | 549 MB | 0.727 | 0.910 | 143,667,240 | 26
| [ResNet50](#resnet50) | 99 MB | 0.759 | 0.929 | 25,636,712 | 168
| [InceptionV3](#inceptionv3) | 92 MB | 0.788 | 0.944 | 23,851,784 | 159 |
| [InceptionResNetV2](#inceptionresnetv2) | 215 MB | 0.804 | 0.953 | 55,873,736 | 572 |
| [MobileNet](#mobilenet) | 17 MB | 0.665 | 0.871 | 4,253,864 | 88


Expand All @@ -194,17 +196,17 @@ and a top-5 validation accuracy of 0.945.

Note that this model is only available for the TensorFlow backend,
due to its reliance on `SeparableConvolution` layers. Additionally it only supports
the data format "channels_last" (height, width, channels).
the data format `'channels_last'` (height, width, channels).

The default input size for this model is 299x299.

### Arguments

- include_top: whether to include the fully-connected layer at the top of the network.
- weights: one of `None` (random initialization) or "imagenet" (pre-training on ImageNet).
- weights: one of `None` (random initialization) or `'imagenet'` (pre-training on ImageNet).
- input_tensor: optional Keras tensor (i.e. output of `layers.Input()`) to use as image input for the model.
- input_shape: optional shape tuple, only to be specified
if `include_top` is False (otherwise the input shape
if `include_top` is `False` (otherwise the input shape
has to be `(299, 299, 3)`.
It should have exactly 3 input channels,
and width and height should be no smaller than 71.
Expand All @@ -214,19 +216,19 @@ The default input size for this model is 299x299.
- `None` means that the output of the model will be
the 4D tensor output of the
last convolutional layer.
- `avg` means that global average pooling
- `'avg'` means that global average pooling
will be applied to the output of the
last convolutional layer, and thus
the output of the model will be a 2D tensor.
- `max` means that global max pooling will
- `'max'` means that global max pooling will
be applied.
- classes: optional number of classes to classify images
into, only to be specified if `include_top` is True, and
into, only to be specified if `include_top` is `True`, and
if no `weights` argument is specified.
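To make the `pooling` options concrete, here is a minimal pure-Python sketch of how global average or max pooling turns a 4D `'channels_last'` tensor into a 2D one. The function name `global_pool` and the nested-list representation are assumptions for illustration; Keras itself applies pooling layers to backend tensors.

```python
def global_pool(batch, mode='avg'):
    """Reduce a (batch, height, width, channels) nested list to (batch, channels).

    Illustrative sketch only, assuming 'channels_last' ordering.
    """
    pooled = []
    for image in batch:
        # Flatten the spatial grid into a list of per-pixel channel vectors.
        pixels = [pixel for row in image for pixel in row]
        # Regroup the values so each tuple holds one channel's activations.
        per_channel = list(zip(*pixels))
        if mode == 'avg':
            pooled.append([sum(c) / float(len(c)) for c in per_channel])
        elif mode == 'max':
            pooled.append([max(c) for c in per_channel])
        else:
            raise ValueError("mode must be 'avg' or 'max', got %r" % (mode,))
    return pooled
```

With one 2x2 image of two channels, the result has one row of two values, which mirrors the "2D tensor" output described for `'avg'` and `'max'`.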

### Returns

A Keras model instance.
A Keras `Model` instance.

### References

Expand All @@ -249,19 +251,19 @@ keras.applications.vgg16.VGG16(include_top=True, weights='imagenet', input_tenso
VGG16 model, with weights pre-trained on ImageNet.

This model is available for both the Theano and TensorFlow backend, and can be built both
with "channels_first" data format (channels, height, width) or "channels_last" data format (height, width, channels).
with `'channels_first'` data format (channels, height, width) or `'channels_last'` data format (height, width, channels).

The default input size for this model is 224x224.

### Arguments

- include_top: whether to include the 3 fully-connected layers at the top of the network.
- weights: one of `None` (random initialization) or "imagenet" (pre-training on ImageNet).
- weights: one of `None` (random initialization) or `'imagenet'` (pre-training on ImageNet).
- input_tensor: optional Keras tensor (i.e. output of `layers.Input()`) to use as image input for the model.
- input_shape: optional shape tuple, only to be specified
if `include_top` is False (otherwise the input shape
has to be `(224, 224, 3)` (with `channels_last` data format)
or `(3, 224, 224)` (with `channels_first` data format).
if `include_top` is `False` (otherwise the input shape
has to be `(224, 224, 3)` (with `'channels_last'` data format)
or `(3, 224, 224)` (with `'channels_first'` data format).
It should have exactly 3 input channels,
and width and height should be no smaller than 48.
E.g. `(200, 200, 3)` would be one valid value.
Expand All @@ -270,19 +272,19 @@ The default input size for this model is 224x224.
- `None` means that the output of the model will be
the 4D tensor output of the
last convolutional layer.
- `avg` means that global average pooling
- `'avg'` means that global average pooling
will be applied to the output of the
last convolutional layer, and thus
the output of the model will be a 2D tensor.
- `max` means that global max pooling will
- `'max'` means that global max pooling will
be applied.
- classes: optional number of classes to classify images
into, only to be specified if `include_top` is True, and
into, only to be specified if `include_top` is `True`, and
if no `weights` argument is specified.

### Returns

A Keras model instance.
A Keras `Model` instance.

### References

Expand All @@ -305,19 +307,19 @@ keras.applications.vgg19.VGG19(include_top=True, weights='imagenet', input_tenso
VGG19 model, with weights pre-trained on ImageNet.

This model is available for both the Theano and TensorFlow backend, and can be built both
with "channels_first" data format (channels, height, width) or "channels_last" data format (height, width, channels).
with `'channels_first'` data format (channels, height, width) or `'channels_last'` data format (height, width, channels).

The default input size for this model is 224x224.

### Arguments

- include_top: whether to include the 3 fully-connected layers at the top of the network.
- weights: one of `None` (random initialization) or "imagenet" (pre-training on ImageNet).
- weights: one of `None` (random initialization) or `'imagenet'` (pre-training on ImageNet).
- input_tensor: optional Keras tensor (i.e. output of `layers.Input()`) to use as image input for the model.
- input_shape: optional shape tuple, only to be specified
if `include_top` is False (otherwise the input shape
has to be `(224, 224, 3)` (with `channels_last` data format)
or `(3, 224, 224)` (with `channels_first` data format).
if `include_top` is `False` (otherwise the input shape
has to be `(224, 224, 3)` (with `'channels_last'` data format)
or `(3, 224, 224)` (with `'channels_first'` data format).
It should have exactly 3 input channels,
and width and height should be no smaller than 48.
E.g. `(200, 200, 3)` would be one valid value.
Expand All @@ -326,19 +328,19 @@ The default input size for this model is 224x224.
- `None` means that the output of the model will be
the 4D tensor output of the
last convolutional layer.
- `avg` means that global average pooling
- `'avg'` means that global average pooling
will be applied to the output of the
last convolutional layer, and thus
the output of the model will be a 2D tensor.
- `max` means that global max pooling will
- `'max'` means that global max pooling will
be applied.
- classes: optional number of classes to classify images
into, only to be specified if `include_top` is True, and
into, only to be specified if `include_top` is `True`, and
if no `weights` argument is specified.

### Returns

A Keras model instance.
A Keras `Model` instance.


### References
Expand All @@ -362,20 +364,20 @@ keras.applications.resnet50.ResNet50(include_top=True, weights='imagenet', input
ResNet50 model, with weights pre-trained on ImageNet.

This model is available for both the Theano and TensorFlow backend, and can be built both
with "channels_first" data format (channels, height, width) or "channels_last" data format (height, width, channels).
with `'channels_first'` data format (channels, height, width) or `'channels_last'` data format (height, width, channels).

The default input size for this model is 224x224.


### Arguments

- include_top: whether to include the fully-connected layer at the top of the network.
- weights: one of `None` (random initialization) or "imagenet" (pre-training on ImageNet).
- weights: one of `None` (random initialization) or `'imagenet'` (pre-training on ImageNet).
- input_tensor: optional Keras tensor (i.e. output of `layers.Input()`) to use as image input for the model.
- input_shape: optional shape tuple, only to be specified
if `include_top` is False (otherwise the input shape
has to be `(224, 224, 3)` (with `channels_last` data format)
or `(3, 224, 224)` (with `channels_first` data format).
if `include_top` is `False` (otherwise the input shape
has to be `(224, 224, 3)` (with `'channels_last'` data format)
or `(3, 224, 224)` (with `'channels_first'` data format).
It should have exactly 3 input channels,
and width and height should be no smaller than 197.
E.g. `(200, 200, 3)` would be one valid value.
Expand All @@ -384,19 +386,19 @@ The default input size for this model is 224x224.
- `None` means that the output of the model will be
the 4D tensor output of the
last convolutional layer.
- `avg` means that global average pooling
- `'avg'` means that global average pooling
will be applied to the output of the
last convolutional layer, and thus
the output of the model will be a 2D tensor.
- `max` means that global max pooling will
- `'max'` means that global max pooling will
be applied.
- classes: optional number of classes to classify images
into, only to be specified if `include_top` is True, and
into, only to be specified if `include_top` is `True`, and
if no `weights` argument is specified.

### Returns

A Keras model instance.
A Keras `Model` instance.

### References

Expand All @@ -418,20 +420,20 @@ keras.applications.inception_v3.InceptionV3(include_top=True, weights='imagenet'
Inception V3 model, with weights pre-trained on ImageNet.

This model is available for both the Theano and TensorFlow backend, and can be built both
with "channels_first" data format (channels, height, width) or "channels_last" data format (height, width, channels).
with `'channels_first'` data format (channels, height, width) or `'channels_last'` data format (height, width, channels).

The default input size for this model is 299x299.


### Arguments

- include_top: whether to include the fully-connected layer at the top of the network.
- weights: one of `None` (random initialization) or "imagenet" (pre-training on ImageNet).
- weights: one of `None` (random initialization) or `'imagenet'` (pre-training on ImageNet).
- input_tensor: optional Keras tensor (i.e. output of `layers.Input()`) to use as image input for the model.
- input_shape: optional shape tuple, only to be specified
if `include_top` is False (otherwise the input shape
has to be `(299, 299, 3)` (with `channels_last` data format)
or `(3, 299, 299)` (with `channels_first` data format).
if `include_top` is `False` (otherwise the input shape
has to be `(299, 299, 3)` (with `'channels_last'` data format)
or `(3, 299, 299)` (with `'channels_first'` data format).
It should have exactly 3 input channels,
and width and height should be no smaller than 139.
E.g. `(150, 150, 3)` would be one valid value.
Expand All @@ -440,19 +442,19 @@ The default input size for this model is 299x299.
- `None` means that the output of the model will be
the 4D tensor output of the
last convolutional layer.
- `avg` means that global average pooling
- `'avg'` means that global average pooling
will be applied to the output of the
last convolutional layer, and thus
the output of the model will be a 2D tensor.
- `max` means that global max pooling will
- `'max'` means that global max pooling will
be applied.
- classes: optional number of classes to classify images
into, only to be specified if `include_top` is True, and
into, only to be specified if `include_top` is `True`, and
if no `weights` argument is specified.

### Returns

A Keras model instance.
A Keras `Model` instance.

### References

Expand All @@ -464,6 +466,62 @@ These weights are released under [the Apache License](https://github.com/tensorf

-----

## InceptionResNetV2


```python
keras.applications.inception_resnet_v2.InceptionResNetV2(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)
```

Inception-ResNet V2 model, with weights pre-trained on ImageNet.

This model is available for both the Theano and TensorFlow backend (but not CNTK), and can be built both
with `'channels_first'` data format (channels, height, width) or `'channels_last'` data format (height, width, channels).

The default input size for this model is 299x299.


### Arguments

- include_top: whether to include the fully-connected layer at the top of the network.
- weights: one of `None` (random initialization) or `'imagenet'` (pre-training on ImageNet).
- input_tensor: optional Keras tensor (i.e. output of `layers.Input()`) to use as image input for the model.
- input_shape: optional shape tuple, only to be specified
if `include_top` is `False` (otherwise the input shape
has to be `(299, 299, 3)` (with `'channels_last'` data format)
or `(3, 299, 299)` (with `'channels_first'` data format).
It should have exactly 3 input channels,
and width and height should be no smaller than 139.
E.g. `(150, 150, 3)` would be one valid value.
- pooling: Optional pooling mode for feature extraction
when `include_top` is `False`.
- `None` means that the output of the model will be
the 4D tensor output of the
last convolutional layer.
- `'avg'` means that global average pooling
will be applied to the output of the
last convolutional layer, and thus
the output of the model will be a 2D tensor.
- `'max'` means that global max pooling will
be applied.
- classes: optional number of classes to classify images
into, only to be specified if `include_top` is `True`, and
if no `weights` argument is specified.
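The `input_shape` constraints listed above (exactly 3 channels, width and height no smaller than 139, channel position depending on the data format) can be sketched as a small standalone check. The helper `check_input_shape` is hypothetical and written only to mirror the documented rules; it is not the validation routine Keras uses internally.

```python
def check_input_shape(input_shape, data_format='channels_last', min_size=139):
    """Validate a 3D image shape against the documented constraints.

    Hypothetical helper, not part of Keras: requires exactly 3 channels
    and a height and width of at least `min_size`.
    """
    if len(input_shape) != 3:
        raise ValueError('input_shape must have 3 entries, got %r' % (input_shape,))
    if data_format == 'channels_first':
        channels, height, width = input_shape
    elif data_format == 'channels_last':
        height, width, channels = input_shape
    else:
        raise ValueError('Unknown data_format: %r' % (data_format,))
    if channels != 3:
        raise ValueError('Expected exactly 3 channels, got %r' % (channels,))
    if height < min_size or width < min_size:
        raise ValueError('Height and width must be at least %d, got %r'
                         % (min_size, (height, width)))
    return tuple(input_shape)
```

For example, `(150, 150, 3)` passes with the default `'channels_last'` format, while `(100, 100, 3)` is rejected because it falls below the minimum size of 139.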

### Returns

A Keras `Model` instance.

### References

- [Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning](https://arxiv.org/abs/1602.07261)

### License

These weights are released under [the Apache License](https://github.com/tensorflow/models/blob/master/LICENSE).

-----

## MobileNet


Expand Down Expand Up @@ -492,9 +550,9 @@ The default input size for this model is 224x224.
### Arguments

- input_shape: optional shape tuple, only to be specified
if `include_top` is False (otherwise the input shape
has to be `(224, 224, 3)` (with `channels_last` data format)
or (3, 224, 224) (with `channels_first` data format).
if `include_top` is `False` (otherwise the input shape
has to be `(224, 224, 3)` (with `'channels_last'` data format)
or (3, 224, 224) (with `'channels_first'` data format).
It should have exactly 3 input channels,
and width and height should be no smaller than 32.
E.g. `(200, 200, 3)` would be one valid value.
Expand All @@ -511,7 +569,7 @@ The default input size for this model is 224x224.
- include_top: whether to include the fully-connected
layer at the top of the network.
- weights: `None` (random initialization) or
`imagenet` (ImageNet weights)
`'imagenet'` (ImageNet weights)
- input_tensor: optional Keras tensor (i.e. output of
`layers.Input()`)
to use as image input for the model.
Expand All @@ -520,20 +578,20 @@ The default input size for this model is 224x224.
- `None` means that the output of the model
will be the 4D tensor output of the
last convolutional layer.
- `avg` means that global average pooling
- `'avg'` means that global average pooling
will be applied to the output of the
last convolutional layer, and thus
the output of the model will be a
2D tensor.
- `max` means that global max pooling will
- `'max'` means that global max pooling will
be applied.
- classes: optional number of classes to classify images
into, only to be specified if `include_top` is True, and
into, only to be specified if `include_top` is `True`, and
if no `weights` argument is specified.

### Returns

A Keras model instance.
A Keras `Model` instance.

### References

Expand Down
4 changes: 4 additions & 0 deletions docs/templates/getting-started/faq.md
Expand Up @@ -418,6 +418,8 @@ Code and pre-trained weights are available for the following image classificatio
- VGG19
- ResNet50
- Inception v3
- Inception-ResNet v2
- MobileNet v1

They can be imported from the module `keras.applications`:

Expand All @@ -427,6 +429,8 @@ from keras.applications.vgg16 import VGG16
from keras.applications.vgg19 import VGG19
from keras.applications.resnet50 import ResNet50
from keras.applications.inception_v3 import InceptionV3
from keras.applications.inception_resnet_v2 import InceptionResNetV2
from keras.applications.mobilenet import MobileNet

model = VGG16(weights='imagenet', include_top=True)
```
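The import lines above all follow the same `keras.applications.<module>` pattern, so one possible way to select a model by name is a small lookup table. This is a sketch under stated assumptions: `APPLICATION_MODULES` and `load_application` are hypothetical names, the table lists only the imports visible in this hunk, and calling the loader requires a Keras installation that includes these modules.

```python
import importlib

# Module path for each model named in the import lines above.
APPLICATION_MODULES = {
    'VGG16': 'keras.applications.vgg16',
    'VGG19': 'keras.applications.vgg19',
    'ResNet50': 'keras.applications.resnet50',
    'InceptionV3': 'keras.applications.inception_v3',
    'InceptionResNetV2': 'keras.applications.inception_resnet_v2',
    'MobileNet': 'keras.applications.mobilenet',
}


def load_application(name):
    """Import and return the model constructor for `name`.

    Hypothetical helper: the import happens lazily, so Keras is only
    needed when this function is actually called.
    """
    module = importlib.import_module(APPLICATION_MODULES[name])
    return getattr(module, name)
```

Assuming Keras is installed, `load_application('InceptionResNetV2')(weights='imagenet', include_top=True)` would build the same kind of model as the direct import.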
Expand Down
1 change: 1 addition & 0 deletions keras/applications/__init__.py
Expand Up @@ -2,5 +2,6 @@
from .vgg19 import VGG19
from .resnet50 import ResNet50
from .inception_v3 import InceptionV3
from .inception_resnet_v2 import InceptionResNetV2
from .xception import Xception
from .mobilenet import MobileNet