diff --git a/.Rbuildignore b/.Rbuildignore index e8db4ca60..00df434de 100644 --- a/.Rbuildignore +++ b/.Rbuildignore @@ -5,6 +5,7 @@ ^.*\.hdf5$ ^README.R?md$ ^docs$ +^website$ ^pkgdown$ ^dev$ ^runs$ diff --git a/website/LICENSE.html b/website/LICENSE.html new file mode 100644 index 000000000..2300cf017 --- /dev/null +++ b/website/LICENSE.html @@ -0,0 +1,152 @@ + + + + + + + + +License • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+
+ + + +
+ +
+
+ + +
YEAR: 2017
+COPYRIGHT HOLDER: RStudio, Inc; Google, Inc; François Chollet; Yuan Tang
+
+ +
+ +
+ + + +
+ + + diff --git a/website/articles/about_keras_layers.html b/website/articles/about_keras_layers.html new file mode 100644 index 000000000..47147a272 --- /dev/null +++ b/website/articles/about_keras_layers.html @@ -0,0 +1,252 @@ + + + + + + + +About Keras Layers • keras + + + + + + + +
+
+ + + +
+
+ + + + +
+
+

+Overview

+

Keras layers are the fundamental building blocks of Keras models. Layers are created using a wide variety of layer_ functions and are typically composed by stacking calls to them with the pipe %>% operator. For example:

+
model <- keras_model_sequential() 
+model %>% 
+  layer_dense(units = 32, input_shape = c(784)) %>% 
+  layer_activation('relu') %>% 
+  layer_dense(units = 10) %>% 
+  layer_activation('softmax')
+

A wide variety of layers are available, including:

+ +
+
+

+Properties

+

All layers share the following properties:

+
  • layer$name — String, must be unique within a model.
  • layer$input_spec — List of input specifications. Each entry describes one required input: (ndim, dtype). A layer with n input tensors must have an input_spec of length n.
  • layer$trainable — Boolean, whether the layer weights will be updated during training.
  • layer$uses_learning_phase — Whether any operation of the layer uses K$in_train_phase() or K$in_test_phase().
  • layer$input_shape — Input shape. Provided for convenience, but note that there may be cases in which this attribute is ill-defined (e.g. a shared layer with multiple input shapes), in which case requesting input_shape will result in an error. Prefer using get_input_shape_at(layer, node_index).
  • layer$output_shape — Output shape. See above.
  • layer$inbound_nodes — List of nodes.
  • layer$outbound_nodes — List of nodes.
  • layer$input, layer$output — Input/output tensor(s). Note that if the layer is used more than once (shared layer), this is ill-defined and will result in an error. In such cases, use get_input_at(layer, node_index).
  • layer$input_mask, layer$output_mask — Same as above, for masks.
  • layer$trainable_weights — List of variables.
  • layer$non_trainable_weights — List of variables.
  • layer$weights — The concatenation of the lists trainable_weights and non_trainable_weights (in this order).
  • layer$constraints — Mapping of weights to constraints.
+
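For example, here is a minimal sketch (the layer name "hidden" and the model itself are purely illustrative) of reading a few of these properties from a dense layer:

library(keras)
model <- keras_model_sequential()
model %>%
  layer_dense(units = 32, input_shape = c(784), name = "hidden")
layer <- get_layer(model, "hidden")
layer$name                        # "hidden"
layer$trainable                   # TRUE
length(layer$trainable_weights)   # 2 (kernel and bias)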
+
+

+Functions

+

The following functions are available for interacting with layers:

+ ++++ + + + + + + + + + + + + + + + + + + + + + + +
+get_config() from_config() + +

+Layer/Model configuration +

+
+get_weights() set_weights() + +

+Layer/Model weights as R arrays +

+
+count_params() + +

+Count the total number of scalars composing the weights. +

+
+get_input_at() get_output_at() get_input_shape_at() get_output_shape_at() get_input_mask_at() get_output_mask_at() + +

+Retrieve tensors for layers with multiple nodes +

+
+reset_states() + +

+Reset the states for a layer +

+
+
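As a brief, illustrative sketch (assuming layer is a built Keras layer, such as the dense layer retrieved in the example above), several of these functions can be combined to round-trip a layer's weights:

w <- get_weights(layer)       # list of R arrays (e.g. kernel and bias)
set_weights(layer, w)         # assign the (possibly modified) arrays back
count_params(layer)           # total number of scalars in the weights
config <- get_config(layer)   # configuration object that can recreate the layer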
+
+
+ + + +
+ + + +
+ + + diff --git a/website/articles/about_keras_models.html b/website/articles/about_keras_models.html new file mode 100644 index 000000000..adbf19d70 --- /dev/null +++ b/website/articles/about_keras_models.html @@ -0,0 +1,356 @@ + + + + + + + +About Keras Models • keras + + + + + + + +
+
+ + + +
+
+ + + + +
+
+

+Overview

+

There are two types of models available in Keras: sequential models and models created with the functional API.

+
+

+Sequential

+

Sequential models are created using the keras_model_sequential() function and are composed of a linear stack of layers:

+
model <- keras_model_sequential() 
+model %>% 
+  layer_dense(units = 32, input_shape = c(784)) %>% 
+  layer_activation('relu') %>% 
+  layer_dense(units = 10) %>% 
+  layer_activation('softmax')
+

Note that Keras objects are modified in place, which is why it’s not necessary to assign model back to itself after the layers are added.

+
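As a minimal illustration, both forms below add a dense layer to model in place; neither requires re-assignment:

model %>% layer_dense(units = 16)   # pipe form
layer_dense(model, units = 16)      # equivalent plain call; no `model <-` needed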

Learn more by reading the Guide to the Sequential Model.

+
+
+

+Functional

+

The functional API enables you to define more complex models, such as multi-output models, directed acyclic graphs, or models with shared layers. To create a model with the functional API, compose a set of input and output layers and then pass them to the keras_model() function:

+
tweet_a <- layer_input(shape = c(140, 256))
+tweet_b <- layer_input(shape = c(140, 256))
+
+# This layer can take as input a matrix and will return a vector of size 64
+shared_lstm <- layer_lstm(units = 64)
+
+# When we reuse the same layer instance multiple times, the weights of the layer are also
+# being reused (it is effectively *the same* layer)
+encoded_a <- tweet_a %>% shared_lstm
+encoded_b <- tweet_b %>% shared_lstm
+
+# We can then concatenate the two vectors and add a logistic regression on top
+predictions <- layer_concatenate(c(encoded_a, encoded_b), axis=-1) %>% 
+  layer_dense(units = 1, activation = 'sigmoid')
+
+# We define a trainable model linking the tweet inputs to the predictions
+model <- keras_model(inputs = c(tweet_a, tweet_b), outputs = predictions)
+

Learn more by reading the Guide to the Functional API.

+
+
+
+

+Properties

+

All models share the following properties:

+
  • model$layers — A flattened list of the layers comprising the model graph.
  • model$inputs — List of input tensors.
  • model$outputs — List of output tensors.
+
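For example, continuing with the sequential model defined above:

length(model$layers)      # number of layers in the model
model$layers[[1]]$name    # name of the first layer
model$inputs              # list of input tensors
model$outputs             # list of output tensors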
+
+

+Functions

+

These functions enable you to create, train, evaluate, persist, and generate predictions with models:

+ ++++ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+keras_model() + +

+Keras Model +

+
+keras_model_sequential() + +

+Keras Model composed of a linear stack of layers +

+
+compile() + +

+Configure a Keras model for training +

+
+fit() + +

+Train a Keras model +

+
+evaluate() + +

+Evaluate a Keras model +

+
+predict() + +

+Predict Method for Keras Models +

+
+summary() + +

+Print a summary of a model +

+
+save_model_hdf5() load_model_hdf5() + +

+Save/Load models using HDF5 files +

+
+get_layer() + +

+Retrieves a layer based on either its name (unique) or index. +

+
+pop_layer() + +

+Remove the last layer in a model +

+
+save_model_weights_hdf5() load_model_weights_hdf5() + +

+Save/Load model weights using HDF5 files +

+
+get_weights() set_weights() + +

+Layer/Model weights as R arrays +

+
+get_config() from_config() + +

+Layer/Model configuration +

+
+model_to_json() model_from_json() + +

+Model configuration as JSON +

+
+model_to_yaml() model_from_yaml() + +

+Model configuration as YAML +

+
+
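For example, here is an illustrative end-to-end sketch (model is assumed to be one of the models defined above, and x_train, y_train, x_test, y_test, and x_new are assumed to be appropriately shaped R arrays):

model %>% compile(
  optimizer = 'rmsprop',
  loss = 'categorical_crossentropy',
  metrics = 'accuracy'
)
model %>% fit(x_train, y_train, epochs = 10, batch_size = 128)
model %>% evaluate(x_test, y_test)
preds <- model %>% predict(x_new)

# persist the full model (architecture + weights) and restore it
save_model_hdf5(model, "my_model.h5")
model <- load_model_hdf5("my_model.h5")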
+
+
+ + + +
+ + + +
+ + + diff --git a/website/articles/applications.html b/website/articles/applications.html new file mode 100644 index 000000000..10610824c --- /dev/null +++ b/website/articles/applications.html @@ -0,0 +1,281 @@ + + + + + + + +Using Pre-Trained Models • keras + + + + + + + +
+
+ + + +
+
+ + + + +
+
+

+Applications

+

Keras Applications are deep learning models that are made available alongside pre-trained weights. These models can be used for prediction, feature extraction, and fine-tuning.

+

Weights are downloaded automatically when instantiating a model. They are stored at ~/.keras/models/.

+

The following image classification models (with weights trained on ImageNet) are available:

+ +
+
+

+Usage Examples

+
+

+Classify ImageNet classes with ResNet50

+
# instantiate the model
+model <- application_resnet50(weights = 'imagenet')
+
+# load the image
+img_path <- "elephant.jpg"
+img <- image_load(img_path, target_size = c(224,224))
+x <- image_to_array(img)
+
+# ensure we have a 4d tensor with a single element in the batch dimension,
+# then preprocess the input for prediction using resnet50
+dim(x) <- c(1, dim(x))
+x <- imagenet_preprocess_input(x)
+
+# make predictions then decode and print them
+preds <- model %>% predict(x)
+imagenet_decode_predictions(preds, top = 3)[[1]]
+
  class_name class_description      score
+1  n02504013   Indian_elephant 0.90117526
+2  n01871265            tusker 0.08774310
+3  n02504458  African_elephant 0.01046011
+
+
+

+Extract features with VGG16

+
model <- application_vgg16(weights = 'imagenet', include_top = FALSE)
+
+img_path <- "elephant.jpg"
+img <- image_load(img_path, target_size = c(224,224))
+x <- image_to_array(img)
+dim(x) <- c(1, dim(x))
+x <- imagenet_preprocess_input(x)
+
+features <- model %>% predict(x)
+
+
+

+Extract features from an arbitrary intermediate layer with VGG19

+
base_model <- application_vgg19(weights = 'imagenet')
+model <- keras_model(inputs = base_model$input, 
+                     outputs = get_layer(base_model, 'block4_pool')$output)
+
+img_path <- "elephant.jpg"
+img <- image_load(img_path, target_size = c(224,224))
+x <- image_to_array(img)
+dim(x) <- c(1, dim(x)) 
+x <- imagenet_preprocess_input(x)
+
+block4_pool_features <- model %>% predict(x)
+
+
+

+Fine-tune InceptionV3 on a new set of classes

+
# create the base pre-trained model
+base_model <- application_inception_v3(weights = 'imagenet', include_top = FALSE)
+
+# add our custom layers
+predictions <- base_model$output %>% 
+  layer_global_average_pooling_2d() %>% 
+  layer_dense(units = 1024, activation = 'relu') %>% 
+  layer_dense(units = 200, activation = 'softmax')
+
+# this is the model we will train
+model <- keras_model(inputs = base_model$input, outputs = predictions)
+
+# first: train only the top layers (which were randomly initialized)
+# i.e. freeze all convolutional InceptionV3 layers
+for (layer in base_model$layers)
+  layer$trainable <- FALSE
+
+# compile the model (should be done *after* setting layers to non-trainable)
+model %>% compile(optimizer = 'rmsprop', loss = 'categorical_crossentropy')
+
+# train the model on the new data for a few epochs
+model %>% fit_generator(...)
+
+# at this point, the top layers are well trained and we can start fine-tuning
+# convolutional layers from inception V3. We will freeze the bottom N layers
+# and train the remaining top layers.
+
+# let's visualize layer names and layer indices to see how many layers
+# we should freeze:
+layers <- base_model$layers
+for (i in 1:length(layers))
+  cat(i, layers[[i]]$name, "\n")
+
+# we chose to train the top 2 inception blocks, i.e. we will freeze
+# the first 172 layers and unfreeze the rest:
+for (i in 1:172)
+  layers[[i]]$trainable <- FALSE
+for (i in 173:length(layers))
+  layers[[i]]$trainable <- TRUE
+
+# we need to recompile the model for these modifications to take effect
+# we use SGD with a low learning rate
+model %>% compile(
+  optimizer = optimizer_sgd(lr = 0.0001, momentum = 0.9), 
+  loss = 'categorical_crossentropy'
+)
+
+# we train our model again (this time fine-tuning the top 2 inception blocks
+# alongside the top Dense layers)
+model %>% fit_generator(...)
+
+
+

+Build InceptionV3 over a custom input tensor

+
# this could also be the output of a different Keras model or layer
+input_tensor <- layer_input(shape = c(224, 224, 3))
+
+model <- application_inception_v3(input_tensor = input_tensor, 
+                                  weights='imagenet', 
+                                  include_top = TRUE)
+
+
+

+Additional examples

+

The VGG16 model is the basis for the Deep dream Keras example script.

+
+
+
+
+ + + +
+ + + +
+ + + diff --git a/website/articles/backend.html b/website/articles/backend.html new file mode 100644 index 000000000..b0a68b824 --- /dev/null +++ b/website/articles/backend.html @@ -0,0 +1,809 @@ + + + + + + + +Keras Backend • keras + + + + + + + +
+
+ + + +
+
+ + + + +
+
+

+Overview

+

Keras is a model-level library, providing high-level building blocks for developing deep learning models. It does not itself handle low-level operations such as tensor products, convolutions, and so on. Instead, it relies on a specialized, well-optimized tensor manipulation library to do so, serving as the “backend engine” of Keras.

+

The R interface to Keras uses TensorFlow™ as its default tensor backend engine; however, it’s possible to use other backends if desired. At this time, Keras has three backend implementations available:

+
  • TensorFlow is an open-source symbolic tensor manipulation framework developed by Google, Inc.
  • Theano is an open-source symbolic tensor manipulation framework developed by LISA/MILA Lab at Université de Montréal.
  • CNTK is an open-source, commercial-grade toolkit for deep learning developed by Microsoft.
+
+
+

+Selecting a Backend

+

Keras uses the TensorFlow backend by default. If you want to switch to Theano, set the KERAS_BACKEND environment variable before loading the Keras package, as follows:

+
Sys.setenv(KERAS_BACKEND = "theano")
+library(keras)
+

If you want to use the CNTK backend, first follow the installation instructions for CNTK, then set the KERAS_BACKEND environment variable before loading the keras R package as follows:

+
Sys.setenv(KERAS_BACKEND = "cntk")
+library(keras)
+
+

+Environment Variables

+

If you want to use a backend provided by the keras Python package, you typically only need to install the package and the backend, then set the KERAS_BACKEND environment variable as described above.

+

If you need to customize things further there are several environment variables that affect the version of Keras used:

+ ++++ + + + + + + + + + + + + + + + + + + +
VariableDescription
KERAS_IMPLEMENTATIONKeras specifies an API that can be implemented by multiple providers. By default, the Keras R package uses the implementation provided by the Keras Python package (“keras”). TensorFlow also provides an integrated implementation of Keras which you can use by specifying “tensorflow” as the implementation.
KERAS_BACKENDThe “keras” implementation supports the “tensorflow”, “keras”, and “cntk” backends. Note that the “tensorflow” implementation supports only the “tensorflow” backend.
KERAS_PYTHONThe Keras R package will automatically scan installed versions of Python (and virtual/conda environments) to find the one that includes the selected implementation of Keras. If this scanning doesn’t find the right version or you want to override its behavior, you can set the KERAS_PYTHON environment variable to the location of the Python binary you want to use.
+

Note that if you want to use TensorFlow as the backend engine you don’t need to set any of these variables; it is used automatically by default.

+
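For example, to use the integrated TensorFlow implementation of Keras described in the table above (again, setting the variable before loading the package):

Sys.setenv(KERAS_IMPLEMENTATION = "tensorflow")
library(keras)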
+
+
+

+Accessing the Backend in Code

+

If you want the Keras modules you write to be compatible with all available backends, you have to write them via the abstract Keras backend API. You can obtain a reference to the current backend by calling the backend() function:

+
library(keras)
+K <- backend()
+

The code below instantiates an input placeholder. It’s equivalent to tf$placeholder():

+
input <- K$placeholder(shape = list(2L, 4L, 5L))
+# also works:
+input <-  K$placeholder(shape = list(NULL, 4L, 5L))
+# also works:
+input <- K$placeholder(ndim = 3L)
+

The code below instantiates a shared variable. It’s equivalent to tf$Variable():

+
val <- array(runif(60), dim = c(3L, 4L, 5L))
+var <- K$variable(value = val)
+
+# all-zeros variable:
+var <- K$zeros(shape = list(3L, 4L, 5L))
+# all-ones:
+var <- K$ones(shape = list(3L, 4L, 5L))
+

Note that the examples above all pass integer values explicitly (e.g. 5L). This is because, unlike the high-level R functions in the Keras package, the backend APIs are all strongly typed (i.e. float values are not automatically converted to integers).

+
+
+

+Backend Functions

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
NameDescription
absElement-wise absolute value.
allBitwise reduction (logical AND).
anyBitwise reduction (logical OR).
arangeCreates a 1D tensor containing a sequence of integers.
argmaxReturns the index of the maximum value along an axis.
argminReturns the index of the minimum value along an axis.
backendPublicly accessible method for determining the current backend.
batch_dotBatchwise dot product.
batch_flattenTurn a nD tensor into a 2D tensor with same 0th dimension.
batch_get_valueReturns the value of more than one tensor variable.
batch_normalizationApplies batch normalization on x given mean, var, beta and gamma.
batch_set_valueSets the values of many tensor variables at once.
bias_addAdds a bias vector to a tensor.
binary_crossentropyBinary crossentropy between an output tensor and a target tensor.
castCasts a tensor to a different dtype and returns it.
cast_to_floatxCast a Numpy array to the default Keras float type.
categorical_crossentropyCategorical crossentropy between an output tensor and a target tensor.
clear_sessionDestroys the current TF graph and creates a new one.
clipElement-wise value clipping.
concatenateConcatenates a list of tensors alongside the specified axis.
constantCreates a constant tensor.
conv1d1D convolution.
conv2d2D convolution.
conv2d_transpose2D deconvolution (i.e. transposed convolution).
conv3d3D convolution.
cosComputes cos of x element-wise.
count_paramsReturns the number of scalars in a Keras variable.
ctc_batch_costRuns CTC loss algorithm on each batch element.
ctc_decodeDecodes the output of a softmax.
ctc_label_dense_to_sparseConverts CTC labels from dense to sparse.
cumprodCumulative product of the values in a tensor, alongside the specified axis.
cumsumCumulative sum of the values in a tensor, alongside the specified axis.
dotMultiplies 2 tensors (and/or variables) and returns a tensor.
dropoutSets entries in x to zero at random, while scaling the entire tensor.
dtypeReturns the dtype of a Keras tensor or variable, as a string.
eluExponential linear unit.
epsilonReturns the value of the fuzz factor used in numeric expressions.
equalElement-wise equality between two tensors.
evalEvaluates the value of a variable.
expElement-wise exponential.
expand_dimsAdds a 1-sized dimension at index “axis”.
eyeInstantiate an identity matrix and returns it.
flattenFlatten a tensor.
floatxReturns the default float type, as a string.
foldlReduce elems using fn to combine them from left to right.
foldrReduce elems using fn to combine them from right to left.
gatherRetrieves the elements of indices indices in the tensor reference.
get_sessionReturns the TF session to be used by the backend.
get_uidAssociates a string prefix with an integer counter in a TensorFlow graph.
get_valueReturns the value of a variable.
gradientsReturns the gradients of variables w.r.t. loss.
greaterElement-wise truth value of (x > y).
greater_equalElement-wise truth value of (x >= y).
hard_sigmoidSegment-wise linear approximation of sigmoid.
identityReturns a tensor with the same content as the input tensor.
image_data_formatReturns the default image data format convention.
in_test_phaseSelects x in test phase, and alt otherwise.
in_top_kReturns whether the targets are in the top k predictions.
in_train_phaseSelects x in train phase, and alt otherwise.
int_shapeReturns the shape tensor or variable as a list of int or NULL entries.
is_sparseReturns whether a tensor is a sparse tensor.
l2_normalizeNormalizes a tensor wrt the L2 norm alongside the specified axis.
learning_phaseReturns the learning phase flag.
lessElement-wise truth value of (x < y).
less_equalElement-wise truth value of (x <= y).
local_conv1dApply 1D conv with un-shared weights.
local_conv2dApply 2D conv with un-shared weights.
logElement-wise log.
logsumexpComputes log(sum(exp(elements across dimensions of a tensor))).
manual_variable_initializationSets the manual variable initialization flag.
map_fnMap the function fn over the elements elems and return the outputs.
maxMaximum value in a tensor.
maximumElement-wise maximum of two tensors.
meanMean of a tensor, alongside the specified axis.
minMinimum value in a tensor.
minimumElement-wise minimum of two tensors.
moving_average_updateCompute the moving average of a variable.
name_scopeReturns a context manager for use when defining a Python op.
ndimReturns the number of axes in a tensor, as an integer.
normalize_batch_in_trainingComputes mean and std for batch then apply batch_normalization on batch.
not_equalElement-wise inequality between two tensors.
one_hotComputes the one-hot representation of an integer tensor.
onesInstantiates an all-ones tensor variable and returns it.
ones_likeInstantiates an all-ones variable of the same shape as another tensor.
permute_dimensionsPermutes axes in a tensor.
placeholderInstantiates a placeholder tensor and returns it.
pool2d2D Pooling.
pool3d3D Pooling.
powElement-wise exponentiation.
print_tensorPrints message and the tensor value when evaluated.
prodMultiplies the values in a tensor, alongside the specified axis.
py_allall(iterable) -> bool
py_sumsum(sequence[, start]) -> value
random_binomialReturns a tensor with random binomial distribution of values.
random_normalReturns a tensor with normal distribution of values.
random_normal_variableInstantiates a variable with values drawn from a normal distribution.
random_uniformReturns a tensor with uniform distribution of values.
random_uniform_variableInstantiates a variable with values drawn from a uniform distribution.
reluRectified linear unit.
repeat_elementsRepeats the elements of a tensor along an axis, like np.repeat.
reset_uids
reshapeReshapes a tensor to the specified shape.
resize_imagesResizes the images contained in a 4D tensor.
resize_volumesResizes the volume contained in a 5D tensor.
reverseReverse a tensor along the specified axes.
rnnIterates over the time dimension of a tensor.
roundElement-wise rounding to the closest integer.
separable_conv2d2D convolution with separable filters.
set_epsilonSets the value of the fuzz factor used in numeric expressions.
set_floatxSets the default float type.
set_image_data_formatSets the value of the image data format convention.
set_learning_phaseSets the learning phase to a fixed value.
set_sessionSets the global TensorFlow session.
set_valueSets the value of a variable, from a Numpy array.
shapeReturns the symbolic shape of a tensor or variable.
sigmoidElement-wise sigmoid.
signElement-wise sign.
sinComputes sin of x element-wise.
softmaxSoftmax of a tensor.
softplusSoftplus of a tensor.
softsignSoftsign of a tensor.
sparse_categorical_crossentropyCategorical crossentropy with integer targets.
spatial_2d_paddingPads the 2nd and 3rd dimensions of a 4D tensor.
spatial_3d_paddingPads 5D tensor with zeros along the depth, height, width dimensions.
sqrtElement-wise square root.
squareElement-wise square.
squeezeRemoves a 1-dimension from the tensor at index “axis”.
stackStacks a list of rank R tensors into a rank R+1 tensor.
stdStandard deviation of a tensor, alongside the specified axis.
stop_gradientReturns variables but with zero gradient w.r.t. every other variable.
sumSum of the values in a tensor, alongside the specified axis.
switchSwitches between two operations depending on a scalar value.
tanhElement-wise tanh.
temporal_paddingPads the middle dimension of a 3D tensor.
tileCreates a tensor by tiling x by n.
to_denseConverts a sparse tensor into a dense tensor and returns it.
transposeTransposes a tensor and returns it.
truncated_normalReturns a tensor with truncated random normal distribution of values.
update
update_addUpdate the value of x by adding increment.
update_subUpdate the value of x by subtracting decrement.
varVariance of a tensor, alongside the specified axis.
variableInstantiates a variable and returns it.
zerosInstantiates an all-zeros variable and returns it.
zeros_likeInstantiates an all-zeros variable of the same shape as another tensor.
+
+
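As a brief, illustrative sketch combining a few of the functions above (variable, mean, and eval); note the integer-typed axis argument, which is 0-based as in Python:

library(keras)
K <- backend()
x <- K$variable(value = matrix(runif(12), nrow = 3, ncol = 4))
m <- K$mean(x, axis = 1L)   # mean alongside the second axis
K$eval(m)                   # evaluate the tensor to an R array of length 3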
+
+ + + +
+ + + +
+ + + diff --git a/website/articles/checkpoints.h5 b/website/articles/checkpoints.h5 new file mode 100644 index 000000000..cc23b2163 Binary files /dev/null and b/website/articles/checkpoints.h5 differ diff --git a/website/articles/custom_layers.html b/website/articles/custom_layers.html new file mode 100644 index 000000000..0fe5d64a5 --- /dev/null +++ b/website/articles/custom_layers.html @@ -0,0 +1,219 @@ + + + + + + + +Writing Custom Keras Layers • keras + + + + + + + +
+
+ + + +
+
+ + + + +
+

If the existing Keras layers don’t meet your requirements, you can create a custom layer. For simple, stateless custom operations, you are probably better off using layer_lambda() layers. But for any custom operation that has trainable weights, you should implement your own layer.

+

The example below illustrates the skeleton of a Keras custom layer. The mnist_antirectifier example includes another demonstration of creating a custom layer.

+
+

+KerasLayer R6 Class

+

To create a custom Keras layer, you create an R6 class derived from KerasLayer. There are three methods to implement (only one of which, call(), is required for all types of layer):

+
  • build(input_shape): This is where you will define your weights. Note that if your layer doesn’t define trainable weights then you need not implement this method.
  • call(x): This is where the layer’s logic lives. Unless you want your layer to support masking, you only have to care about the first argument passed to call: the input tensor.
  • compute_output_shape(input_shape): In case your layer modifies the shape of its input, you should specify here the shape transformation logic. This allows Keras to do automatic shape inference. If you don’t modify the shape of the input then you need not implement this method.
+
library(keras)
+
+K <- backend()
+
+CustomLayer <- R6::R6Class("KerasLayer",
+                                  
+  inherit = KerasLayer,
+  
+  public = list(
+    
+    output_dim = NULL,
+    
+    kernel = NULL,
+    
+    initialize = function(output_dim) {
+      self$output_dim <- output_dim
+    },
+    
+    build = function(input_shape) {
+      self$kernel <- self$add_weight(
+        name = 'kernel', 
+        shape = list(input_shape[[2]], self$output_dim),
+        initializer = initializer_random_normal(),
+        trainable = TRUE
+      )
+    },
+    
+    call = function(x, mask = NULL) {
+      K$dot(x, self$kernel)
+    },
+    
+    compute_output_shape = function(input_shape) {
+      list(input_shape[[1]], self$output_dim)
+    }
+  )
+)
+

Note that tensor operations are executed using the Keras backend(). See the Keras Backend article for details on the various functions available from Keras backends.

+
+
+

+Layer Wrapper Function

+

In order to use the custom layer within a Keras model you also need to create a wrapper function which instantiates the layer using the create_layer() function. For example:

+
# define layer wrapper function
+layer_custom <- function(object, output_dim, name = NULL, trainable = TRUE) {
+  create_layer(CustomLayer, object, list(
+    output_dim = as.integer(output_dim),
+    name = name,
+    trainable = trainable
+  ))
+}
+
+# use it in a model
+model <- keras_model_sequential()
+model %>% 
+  layer_dense(units = 32, input_shape = c(32,32)) %>% 
+  layer_custom(output_dim = 32)
+

Some important things to note about the layer wrapper function:

+
  1. It accepts object as its first parameter (the object will either be a Keras sequential model or another Keras layer). The object parameter enables the layer to be composed with other layers using the magrittr pipe (%>%) operator.
  2. It converts its output_dim to an integer using the as.integer() function. This is done as a convenience to the user, because Keras variables are strongly typed (you can’t pass a float if an integer is expected). This enables users of the function to write output_dim = 32 rather than output_dim = 32L.
  3. Some additional parameters not used by the layer (name and trainable) are in the function signature. Custom layer functions can include any of the core layer function arguments (input_shape, batch_input_shape, batch_size, dtype, name, trainable, and weights) and they will be automatically forwarded to the Layer base class.
+

See the mnist_antirectifier example for another demonstration of creating a custom layer.

+
+
+
+ + + +
+ + + +
+ + + diff --git a/website/articles/examples/addition_rnn.R b/website/articles/examples/addition_rnn.R new file mode 100644 index 000000000..cc25eaa85 --- /dev/null +++ b/website/articles/examples/addition_rnn.R @@ -0,0 +1,210 @@ +#' An implementation of sequence to sequence learning for performing addition +#' +#' Input: "535+61" +#' Output: "596" +#' +#' Padding is handled by using a repeated sentinel character (space) +#' +#' Input may optionally be inverted, shown to increase performance in many tasks in: +#' "Learning to Execute" +#' http://arxiv.org/abs/1410.4615 +#' and +#' "Sequence to Sequence Learning with Neural Networks" +#' http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf +#' Theoretically it introduces shorter term dependencies between source and target. +#' +#' Two digits inverted: +#' One layer LSTM (128 HN), 5k training examples = 99% train/test accuracy in 55 epochs +#' +#' Three digits inverted: +#' One layer LSTM (128 HN), 50k training examples = 99% train/test accuracy in 100 epochs +#' +#' Four digits inverted: +#' One layer LSTM (128 HN), 400k training examples = 99% train/test accuracy in 20 epochs +#' +#' Five digits inverted: +#' One layer LSTM (128 HN), 550k training examples = 99% train/test accuracy in 30 epochs +#' + +library(keras) +library(stringi) + +# Function Definitions ---------------------------------------------------- + +# Creates the char table +# Just sorts them.. +learn_encoding <- function(chars){ + sort(chars) +} + +# Encode to a character sequence to a one hot +# integer representation. +# > encode("22+22", char_table) +# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] +# 2 0 0 0 0 1 0 0 0 0 0 0 0 +# 2 0 0 0 0 1 0 0 0 0 0 0 0 +# + 0 1 0 0 0 0 0 0 0 0 0 0 +# 2 0 0 0 0 1 0 0 0 0 0 0 0 +# 2 0 0 0 0 1 0 0 0 0 0 0 0 +encode <- function(char, char_table){ + strsplit(char, "") %>% + unlist() %>% + sapply(function(x){ + as.numeric(x == char_table) + }) %>% + t() +} + +# Decode the one hot representation/probabilities representation +# to their character output. +decode <- function(x, char_table){ + apply(x,1, function(y){ + char_table[which.max(y)] + }) %>% paste0(collapse = "") +} + +# Returns a list of questions and expected answers. +generate_data <- function(size, digits, invert = TRUE){ + + max_num <- as.integer(paste0(rep(9, digits), collapse = "")) + + # generate integers for both sides of question + x <- sample(1:max_num, size = size, replace = TRUE) + y <- sample(1:max_num, size = size, replace = TRUE) + + # make left side always samalller then right side + left_side <- ifelse(x <= y, x, y) + right_side <- ifelse(x >= y, x, y) + + results <- left_side + right_side + + # pad with spaces on the right + questions <- paste0(left_side, "+", right_side) + questions <- stri_pad(questions, width = 2*digits+1, + side = "right", pad = " ") + if(invert){ + questions <- stri_reverse(questions) + } + # pad with spaces on the left + results <- stri_pad(results, width = digits + 1, + side = "left", pad = " ") + + list( + questions = questions, + results = results + ) +} + +# Parameters -------------------------------------------------------------- + +# Parameters for the model and dataset. +TRAINING_SIZE <- 50000 +DIGITS <- 2 + +# Maximum length of input is 'int + int' (e.g., '345+678'). Maximum length of +# int is DIGITS. +MAXLEN <- DIGITS + 1 + DIGITS + +# All the numbers, plus sign and space for padding. 
+charset <- c(0:9, "+", " ") +char_table <- learn_encoding(charset) + + +# Data Preparation -------------------------------------------------------- + +# Generate Data + +examples <- generate_data(size = TRAINING_SIZE, digits = DIGITS) + +# Vectorization + +x <- array(0, dim = c(length(examples$questions), MAXLEN, length(char_table))) +y <- array(0, dim = c(length(examples$questions), DIGITS + 1, length(char_table))) + +for(i in 1:TRAINING_SIZE){ + x[i,,] <- encode(examples$questions[i], char_table) + y[i,,] <- encode(examples$results[i], char_table) +} + +# Shuffle + +indices <- sample(1:TRAINING_SIZE, size = TRAINING_SIZE) +x <- x[indices,,] +y <- y[indices,,] + + +# Explicitly set apart 10% for validation data that we never train over. + +split_at <- trunc(TRAINING_SIZE/10) +x_val <- x[1:split_at,,] +y_val <- y[1:split_at,,] +x_train <- x[(split_at + 1):TRAINING_SIZE,,] +y_train <- y[(split_at + 1):TRAINING_SIZE,,] + +print('Training Data:') +print(dim(x_train)) +print(dim(y_train)) + +print('Validation Data:') +print(dim(x_val)) +print(dim(y_val)) + + +# Training ---------------------------------------------------------------- + +HIDDEN_SIZE <- 128 +BATCH_SIZE <- 128 +LAYERS <- 1 + +# Initialize sequential model +model <- keras_model_sequential() + +model %>% + # "Encode" the input sequence using an RNN, producing an output of HIDDEN_SIZE. + # Note: In a situation where your input sequences have a variable length, + # use input_shape=(None, num_feature). + layer_lstm(HIDDEN_SIZE, input_shape=c(MAXLEN, length(char_table))) %>% + # As the decoder RNN's input, repeatedly provide with the last hidden state of + # RNN for each time step. Repeat 'DIGITS + 1' times as that's the maximum + # length of output, e.g., when DIGITS=3, max output is 999+999=1998. + layer_repeat_vector(DIGITS + 1) + +# The decoder RNN could be multiple layers stacked or a single layer. +# By setting return_sequences to True, return not only the last output but +# all the outputs so far in the form of (num_samples, timesteps, +# output_dim). This is necessary as TimeDistributed in the below expects +# the first dimension to be the timesteps. +for(i in 1:LAYERS) + model %>% layer_lstm(HIDDEN_SIZE, return_sequences = TRUE) + +model %>% + # Apply a dense layer to the every temporal slice of an input. For each of step + # of the output sequence, decide which character should be chosen. + time_distributed(layer_dense(units = length(char_table))) %>% + layer_activation("softmax") + +# Compiling the model +model %>% compile( + loss = "categorical_crossentropy", + optimizer = "adam", + metrics = "accuracy" +) + +# Get the model summary +summary(model) + +# Fitting loop +model %>% fit( + x = x_train, + y = y_train, + batch_size = BATCH_SIZE, + epochs = 70, + validation_data = list(x_val, y_val) +) + +# Predict for a new obs +new_obs <- encode("55+22", char_table) %>% + array(dim = c(1,5,12)) +result <- predict(model, new_obs) +result <- result[1,,] +decode(result, char_table) diff --git a/website/articles/examples/addition_rnn.html b/website/articles/examples/addition_rnn.html new file mode 100644 index 000000000..d58142d20 --- /dev/null +++ b/website/articles/examples/addition_rnn.html @@ -0,0 +1,327 @@ + + + + + + + +addition_rnn • keras + + + + + + + +
+
+ + + +
+
+ + + + +
+ +

An implementation of sequence to sequence learning for performing addition

+

Input: “535+61”
+Output: “596”

+

Padding is handled by using a repeated sentinel character (space)

+

Input may optionally be inverted, shown to increase performance in many tasks in: “Learning to Execute” http://arxiv.org/abs/1410.4615 and “Sequence to Sequence Learning with Neural Networks” http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf Theoretically it introduces shorter term dependencies between source and target.

+

Two digits inverted: One layer LSTM (128 HN), 5k training examples = 99% train/test accuracy in 55 epochs

+

Three digits inverted: One layer LSTM (128 HN), 50k training examples = 99% train/test accuracy in 100 epochs

+

Four digits inverted: One layer LSTM (128 HN), 400k training examples = 99% train/test accuracy in 20 epochs

+

Five digits inverted: One layer LSTM (128 HN), 550k training examples = 99% train/test accuracy in 30 epochs

+
library(keras)
+library(stringi)
+
+# Function Definitions ----------------------------------------------------
+
+# Creates the char table
+# Just sorts them..
+learn_encoding <- function(chars){
+  sort(chars)
+}
+
+# Encode a character sequence to a one-hot
+# integer representation. 
+# > encode("22+22", char_table)
+# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
+# 2    0    0    0    0    1    0    0    0    0     0     0     0
+# 2    0    0    0    0    1    0    0    0    0     0     0     0
+# +    0    1    0    0    0    0    0    0    0     0     0     0
+# 2    0    0    0    0    1    0    0    0    0     0     0     0
+# 2    0    0    0    0    1    0    0    0    0     0     0     0
+encode <- function(char, char_table){
+  strsplit(char, "") %>%
+    unlist() %>%
+    sapply(function(x){
+      as.numeric(x == char_table)
+    }) %>% 
+    t()
+}
+
+# Decode the one hot representation/probabilities representation
+# to their character output.
+decode <- function(x, char_table){
+  apply(x,1, function(y){
+    char_table[which.max(y)]
+  }) %>% paste0(collapse = "")
+}
+
+# Returns a list of questions and expected answers.
+generate_data <- function(size, digits, invert = TRUE){
+  
+  max_num <- as.integer(paste0(rep(9, digits), collapse = ""))
+  
+  # generate integers for both sides of question
+  x <- sample(1:max_num, size = size, replace = TRUE)
+  y <- sample(1:max_num, size = size, replace = TRUE)
+  
+  # make the left side always smaller than the right side
+  left_side <- ifelse(x <= y, x, y)
+  right_side <- ifelse(x >= y, x, y)
+  
+  results <- left_side + right_side
+  
+  # pad with spaces on the right
+  questions <- paste0(left_side, "+", right_side)
+  questions <- stri_pad(questions, width = 2*digits+1, 
+                        side = "right", pad = " ")
+  if(invert){
+    questions <- stri_reverse(questions)
+  }
+  # pad with spaces on the left
+  results <- stri_pad(results, width = digits + 1, 
+                      side = "left", pad = " ")
+  
+  list(
+    questions = questions,
+    results = results
+  )
+}
+
+# Parameters --------------------------------------------------------------
+
+# Parameters for the model and dataset.
+TRAINING_SIZE <- 50000
+DIGITS <- 2
+
+# Maximum length of input is 'int + int' (e.g., '345+678'). Maximum length of
+# int is DIGITS.
+MAXLEN <- DIGITS + 1 + DIGITS
+
+# All the numbers, plus sign and space for padding.
+charset <- c(0:9, "+", " ")
+char_table <- learn_encoding(charset)
+
+
+# Data Preparation --------------------------------------------------------
+
+# Generate Data
+
+examples <- generate_data(size = TRAINING_SIZE, digits = DIGITS)
+
+# Vectorization
+
+x <- array(0, dim = c(length(examples$questions), MAXLEN, length(char_table)))
+y <- array(0, dim = c(length(examples$questions), DIGITS + 1, length(char_table)))
+
+for(i in 1:TRAINING_SIZE){
+  x[i,,] <- encode(examples$questions[i], char_table)
+  y[i,,] <- encode(examples$results[i], char_table)
+}
+
+# Shuffle
+
+indices <- sample(1:TRAINING_SIZE, size = TRAINING_SIZE)
+x <- x[indices,,]
+y <- y[indices,,]
+
+
+# Explicitly set apart 10% for validation data that we never train over.
+
+split_at <- trunc(TRAINING_SIZE/10)
+x_val <- x[1:split_at,,]
+y_val <- y[1:split_at,,]
+x_train <- x[(split_at + 1):TRAINING_SIZE,,]
+y_train <- y[(split_at + 1):TRAINING_SIZE,,]
+
+print('Training Data:')
+print(dim(x_train))
+print(dim(y_train))
+
+print('Validation Data:')
+print(dim(x_val))
+print(dim(y_val))
+
+
+# Training ----------------------------------------------------------------
+
+HIDDEN_SIZE <- 128
+BATCH_SIZE <- 128
+LAYERS <- 1
+
+# Initialize sequential model
+model <- keras_model_sequential() 
+
+model %>%
+  # "Encode" the input sequence using an RNN, producing an output of HIDDEN_SIZE.
+  # Note: In a situation where your input sequences have a variable length,
+  # use input_shape=(None, num_feature).
+  layer_lstm(HIDDEN_SIZE, input_shape=c(MAXLEN, length(char_table))) %>%
+  # As the decoder RNN's input, repeatedly provide with the last hidden state of
+  # RNN for each time step. Repeat 'DIGITS + 1' times as that's the maximum
+  # length of output, e.g., when DIGITS=3, max output is 999+999=1998.
+  layer_repeat_vector(DIGITS + 1)
+
+# The decoder RNN could be multiple layers stacked or a single layer.
+# By setting return_sequences to True, return not only the last output but
+# all the outputs so far in the form of (num_samples, timesteps,
+# output_dim). This is necessary as TimeDistributed in the below expects
+# the first dimension to be the timesteps.
+for(i in 1:LAYERS)
+  model %>% layer_lstm(HIDDEN_SIZE, return_sequences = TRUE)
+
+model %>% 
+  # Apply a dense layer to every temporal slice of an input. For each step
+  # of the output sequence, decide which character should be chosen.
+  time_distributed(layer_dense(units = length(char_table))) %>%
+  layer_activation("softmax")
+
+# Compiling the model
+model %>% compile(
+  loss = "categorical_crossentropy", 
+  optimizer = "adam", 
+  metrics = "accuracy"
+)
+
+# Get the model summary
+summary(model)
+
+# Fitting loop
+model %>% fit( 
+  x = x_train, 
+  y = y_train, 
+  batch_size = BATCH_SIZE, 
+  epochs = 70,
+  validation_data = list(x_val, y_val)
+)
+
+# Predict for a new obs
+new_obs <- encode("55+22", char_table) %>%
+  array(dim = c(1,5,12))
+result <- predict(model, new_obs)
+result <- result[1,,]
+decode(result, char_table)
+
+
+ + + +
+ + + +
+ + + diff --git a/website/articles/examples/babi_memnn.R b/website/articles/examples/babi_memnn.R new file mode 100644 index 000000000..255b94f25 --- /dev/null +++ b/website/articles/examples/babi_memnn.R @@ -0,0 +1,229 @@ +#' Trains a memory network on the bAbI dataset. +#' +#' References: +#' +#' - Jason Weston, Antoine Bordes, Sumit Chopra, Tomas Mikolov, Alexander M. Rush, +#' "Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks", +#' http://arxiv.org/abs/1502.05698 +#' +#' - Sainbayar Sukhbaatar, Arthur Szlam, Jason Weston, Rob Fergus, +#' "End-To-End Memory Networks", http://arxiv.org/abs/1503.08895 +#' +#' Reaches 98.6% accuracy on task 'single_supporting_fact_10k' after 120 epochs. +#' Time per epoch: 3s on CPU (core i7). +#' + +library(keras) +library(readr) +library(stringr) +library(purrr) +library(tibble) +library(dplyr) + + +# Function definition ----------------------------------------------------- + +tokenize_words <- function(x){ + x <- x %>% + str_replace_all('([[:punct:]]+)', ' \\1') %>% + str_split(' ') %>% + unlist() + x[x != ""] +} + +parse_stories <- function(lines, only_supporting = FALSE){ + lines <- lines %>% + str_split(" ", n = 2) %>% + map_df(~tibble(nid = as.integer(.x[[1]]), line = .x[[2]])) + + lines <- lines %>% + mutate( + split = map(line, ~str_split(.x, "\t")[[1]]), + q = map_chr(split, ~.x[1]), + a = map_chr(split, ~.x[2]), + supporting = map(split, ~.x[3] %>% str_split(" ") %>% unlist() %>% as.integer()), + story_id = c(0, cumsum(nid[-nrow(.)] > nid[-1])) + ) %>% + select(-split) + + stories <- lines %>% + filter(is.na(a)) %>% + select(nid_story = nid, story_id, story = q) + + questions <- lines %>% + filter(!is.na(a)) %>% + select(-line) %>% + left_join(stories, by = "story_id") %>% + filter(nid_story < nid) + + if(only_supporting){ + questions <- questions %>% + filter(map2_lgl(nid_story, supporting, ~.x %in% .y)) + } + + questions %>% + group_by(story_id, nid, question = q, answer = a) %>% + summarise(story = paste(story, collapse = " ")) %>% + ungroup() %>% + mutate( + question = map(question, ~tokenize_words(.x)), + story = map(story, ~tokenize_words(.x)), + id = row_number() + ) %>% + select(id, question, answer, story) +} + +vectorize_stories <- function(data, vocab, story_maxlen, query_maxlen){ + + questions <- map(data$question, function(x){ + map_int(x, ~which(.x == vocab)) + }) + + stories <- map(data$story, function(x){ + map_int(x, ~which(.x == vocab)) + }) + + # "" represents padding + answers <- sapply(c("", vocab), function(x){ + as.integer(x == data$answer) + }) + + + list( + questions = pad_sequences(questions, maxlen = query_maxlen), + stories = pad_sequences(stories, maxlen = story_maxlen), + answers = answers + ) +} + + +# Parameters -------------------------------------------------------------- + +challenges <- list( + # QA1 with 10,000 samples + single_supporting_fact_10k = "%stasks_1-20_v1-2/en-10k/qa1_single-supporting-fact_%s.txt", + # QA2 with 10,000 samples + two_supporting_facts_10k = "%stasks_1-20_v1-2/en-10k/qa2_two-supporting-facts_%s.txt" +) + +challenge_type <- "single_supporting_fact_10k" +challenge <- challenges[[challenge_type]] +max_length <- 999999 + +# Data Preparation -------------------------------------------------------- + +# Download data +path <- get_file( + fname = "babi-tasks-v1-2.tar.gz", + origin = "https://s3.amazonaws.com/text-datasets/babi_tasks_1-20_v1-2.tar.gz" +) +untar(path, exdir = str_replace(path, fixed(".tar.gz"), "/")) +path <- str_replace(path, fixed(".tar.gz"), "/") + 
+# Reading training and test data +train <- read_lines(sprintf(challenge, path, "train")) %>% + parse_stories() %>% + filter(map_int(story, ~length(.x)) <= max_length) + +test <- read_lines(sprintf(challenge, path, "test")) %>% + parse_stories() %>% + filter(map_int(story, ~length(.x)) <= max_length) + +# extract the vocabulary +all_data <- bind_rows(train, test) +vocab <- c(unlist(all_data$question), all_data$answer, + unlist(all_data$story)) %>% + unique() %>% + sort() + +# Reserve 0 for masking via pad_sequences +vocab_size <- length(vocab) + 1 +story_maxlen <- map_int(all_data$story, ~length(.x)) %>% max() +query_maxlen <- map_int(all_data$question, ~length(.x)) %>% max() + +# vectorized versions of training and test sets +train_vec <- vectorize_stories(train, vocab, story_maxlen, query_maxlen) +test_vec <- vectorize_stories(test, vocab, story_maxlen, query_maxlen) + +# Defining the model ------------------------------------------------------ + +# placeholders +sequence <- layer_input(shape = c(story_maxlen)) +question <- layer_input(shape = c(query_maxlen)) + +# encoders +# embed the input sequence into a sequence of vectors +sequence_encoder_m <- keras_model_sequential() +sequence_encoder_m %>% + layer_embedding(input_dim = vocab_size, output_dim = 64) %>% + layer_dropout(rate = 0.3) +# output: (samples, story_maxlen, embedding_dim) + +# embed the input into a sequence of vectors of size query_maxlen +sequence_encoder_c <- keras_model_sequential() +sequence_encoder_c %>% + layer_embedding(input_dim = vocab_size, output = query_maxlen) %>% + layer_dropout(rate = 0.3) +# output: (samples, story_maxlen, query_maxlen) + +# embed the question into a sequence of vectors +question_encoder <- keras_model_sequential() +question_encoder %>% + layer_embedding(input_dim = vocab_size, output_dim = 64, + input_length = query_maxlen) %>% + layer_dropout(rate = 0.3) +# output: (samples, query_maxlen, embedding_dim) + +# encode input sequence and questions (which are indices) +# to sequences of dense vectors +sequence_encoded_m <- sequence_encoder_m(sequence) +sequence_encoded_c <- sequence_encoder_c(sequence) +question_encoded <- question_encoder(question) + +# compute a 'match' between the first input vector sequence +# and the question vector sequence +# shape: `(samples, story_maxlen, query_maxlen)` +match <- list(sequence_encoded_m, question_encoded) %>% + layer_dot(axes = c(2,2)) %>% + layer_activation("softmax") + +# add the match matrix with the second input vector sequence +response <- list(match, sequence_encoded_c) %>% + layer_add() %>% + layer_permute(c(2,1)) + +# concatenate the match matrix with the question vector sequence +answer <- list(response, question_encoded) %>% + layer_concatenate() %>% + # the original paper uses a matrix multiplication for this reduction step. + # we choose to use a RNN instead. + layer_lstm(32) %>% + # one regularization layer -- more would probably be needed. 
+ layer_dropout(rate = 0.3) %>% + layer_dense(vocab_size) %>% + # we output a probability distribution over the vocabulary + layer_activation("softmax") + +# build the final model +model <- keras_model(inputs = list(sequence, question), answer) +model %>% compile( + optimizer = "rmsprop", + loss = "categorical_crossentropy", + metrics = "accuracy" +) + + +# Training ---------------------------------------------------------------- + +model %>% fit( + x = list(train_vec$stories, train_vec$questions), + y = train_vec$answers, + batch_size = 32, + epochs = 120, + validation_data = list(list(test_vec$stories, test_vec$questions), test_vec$answers) +) + + + + + diff --git a/website/articles/examples/babi_memnn.html b/website/articles/examples/babi_memnn.html new file mode 100644 index 000000000..a81ccad46 --- /dev/null +++ b/website/articles/examples/babi_memnn.html @@ -0,0 +1,352 @@ + + + + + + + +babi_memnn • keras + + + + + + + +
+
+ + + +
+
+ + + + +
+ +

Trains a memory network on the bAbI dataset.

+

References:

+
  • Jason Weston, Antoine Bordes, Sumit Chopra, Tomas Mikolov, Alexander M. Rush, “Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks”, http://arxiv.org/abs/1502.05698
  • Sainbayar Sukhbaatar, Arthur Szlam, Jason Weston, Rob Fergus, “End-To-End Memory Networks”, http://arxiv.org/abs/1503.08895
+

Reaches 98.6% accuracy on task ‘single_supporting_fact_10k’ after 120 epochs. Time per epoch: 3s on CPU (core i7).

+
library(keras)
+library(readr)
+library(stringr)
+library(purrr)
+library(tibble)
+library(dplyr)
+
+
+# Function definition -----------------------------------------------------
+
+tokenize_words <- function(x){
+  x <- x %>% 
+    str_replace_all('([[:punct:]]+)', ' \\1') %>% 
+    str_split(' ') %>%
+    unlist()
+  x[x != ""]
+}
+
+parse_stories <- function(lines, only_supporting = FALSE){
+  lines <- lines %>% 
+    str_split(" ", n = 2) %>%
+    map_df(~tibble(nid = as.integer(.x[[1]]), line = .x[[2]]))
+  
+  lines <- lines %>%
+    mutate(
+      split = map(line, ~str_split(.x, "\t")[[1]]),
+      q = map_chr(split, ~.x[1]),
+      a = map_chr(split, ~.x[2]),
+      supporting = map(split, ~.x[3] %>% str_split(" ") %>% unlist() %>% as.integer()),
+      story_id = c(0, cumsum(nid[-nrow(.)] > nid[-1]))
+    ) %>%
+    select(-split)
+  
+  stories <- lines %>%
+    filter(is.na(a)) %>%
+    select(nid_story = nid, story_id, story = q)
+  
+  questions <- lines %>%
+    filter(!is.na(a)) %>%
+    select(-line) %>%
+    left_join(stories, by = "story_id") %>%
+    filter(nid_story < nid)
+  
+  if(only_supporting){
+    questions <- questions %>%
+      filter(map2_lgl(nid_story, supporting, ~.x %in% .y))
+  }
+  
+  questions %>%
+    group_by(story_id, nid, question = q, answer = a) %>%
+    summarise(story = paste(story, collapse = " ")) %>%
+    ungroup() %>% 
+    mutate(
+      question = map(question, ~tokenize_words(.x)),
+      story = map(story, ~tokenize_words(.x)),
+      id = row_number()
+    ) %>%
+    select(id, question, answer, story)
+}
+
+vectorize_stories <- function(data, vocab, story_maxlen, query_maxlen){
+  
+  questions <- map(data$question, function(x){
+    map_int(x, ~which(.x == vocab))
+  })
+  
+  stories <- map(data$story, function(x){
+    map_int(x, ~which(.x == vocab))
+  })
+  
+  # "" represents padding
+  answers <- sapply(c("", vocab), function(x){
+    as.integer(x == data$answer)
+  })
+  
+  
+  list(
+    questions = pad_sequences(questions, maxlen = query_maxlen),
+    stories   = pad_sequences(stories, maxlen = story_maxlen),
+    answers   = answers
+  )
+}
+
+
+# Parameters --------------------------------------------------------------
+
+challenges <- list(
+  # QA1 with 10,000 samples
+  single_supporting_fact_10k = "%stasks_1-20_v1-2/en-10k/qa1_single-supporting-fact_%s.txt",
+  # QA2 with 10,000 samples
+  two_supporting_facts_10k = "%stasks_1-20_v1-2/en-10k/qa2_two-supporting-facts_%s.txt"
+)
+
+challenge_type <- "single_supporting_fact_10k"
+challenge <- challenges[[challenge_type]]
+max_length <- 999999
+
+# Data Preparation --------------------------------------------------------
+
+# Download data
+path <- get_file(
+  fname = "babi-tasks-v1-2.tar.gz",
+  origin = "https://s3.amazonaws.com/text-datasets/babi_tasks_1-20_v1-2.tar.gz"
+)
+untar(path, exdir = str_replace(path, fixed(".tar.gz"), "/"))
+path <- str_replace(path, fixed(".tar.gz"), "/")
+
+# Reading training and test data
+train <- read_lines(sprintf(challenge, path, "train")) %>%
+  parse_stories() %>%
+  filter(map_int(story, ~length(.x)) <= max_length)
+
+test <- read_lines(sprintf(challenge, path, "test")) %>%
+  parse_stories() %>%
+  filter(map_int(story, ~length(.x)) <= max_length)
+
+# extract the vocabulary
+all_data <- bind_rows(train, test)
+vocab <- c(unlist(all_data$question), all_data$answer, 
+           unlist(all_data$story)) %>%
+  unique() %>%
+  sort()
+
+# Reserve 0 for masking via pad_sequences
+vocab_size <- length(vocab) + 1
+story_maxlen <- map_int(all_data$story, ~length(.x)) %>% max()
+query_maxlen <- map_int(all_data$question, ~length(.x)) %>% max()
+
+# vectorized versions of training and test sets
+train_vec <- vectorize_stories(train, vocab, story_maxlen, query_maxlen)
+test_vec <- vectorize_stories(test, vocab, story_maxlen, query_maxlen)
+
+# Defining the model ------------------------------------------------------
+
+# placeholders
+sequence <- layer_input(shape = c(story_maxlen))
+question <- layer_input(shape = c(query_maxlen))
+
+# encoders
+# embed the input sequence into a sequence of vectors
+sequence_encoder_m <- keras_model_sequential()
+sequence_encoder_m %>%
+  layer_embedding(input_dim = vocab_size, output_dim = 64) %>%
+  layer_dropout(rate = 0.3)
+# output: (samples, story_maxlen, embedding_dim)
+
+# embed the input into a sequence of vectors of size query_maxlen
+sequence_encoder_c <- keras_model_sequential()
+sequence_encoder_c %>%
+  layer_embedding(input_dim = vocab_size, output_dim = query_maxlen) %>%
+  layer_dropout(rate = 0.3)
+# output: (samples, story_maxlen, query_maxlen)
+
+# embed the question into a sequence of vectors
+question_encoder <- keras_model_sequential()
+question_encoder %>%
+  layer_embedding(input_dim = vocab_size, output_dim = 64, 
+                  input_length = query_maxlen) %>%
+  layer_dropout(rate = 0.3)
+# output: (samples, query_maxlen, embedding_dim)
+
+# encode input sequence and questions (which are indices)
+# to sequences of dense vectors
+sequence_encoded_m <- sequence_encoder_m(sequence)
+sequence_encoded_c <- sequence_encoder_c(sequence)
+question_encoded <- question_encoder(question)
+
+# compute a 'match' between the first input vector sequence
+# and the question vector sequence
+# shape: `(samples, story_maxlen, query_maxlen)`
+match <- list(sequence_encoded_m, question_encoded) %>%
+  layer_dot(axes = c(2,2)) %>%
+  layer_activation("softmax")
+
+# add the match matrix with the second input vector sequence
+response <- list(match, sequence_encoded_c) %>%
+  layer_add() %>%
+  layer_permute(c(2,1))
+
+# concatenate the match matrix with the question vector sequence
+answer <- list(response, question_encoded) %>%
+  layer_concatenate() %>%
+  # the original paper uses a matrix multiplication for this reduction step.
+  # we choose to use an RNN instead.
+  layer_lstm(32) %>%
+  # one regularization layer -- more would probably be needed.
+  layer_dropout(rate = 0.3) %>%
+  layer_dense(vocab_size) %>%
+  # we output a probability distribution over the vocabulary
+  layer_activation("softmax")
+
+# build the final model
+model <- keras_model(inputs = list(sequence, question), outputs = answer)
+model %>% compile(
+  optimizer = "rmsprop",
+  loss = "categorical_crossentropy",
+  metrics = "accuracy"
+)
+
+
+# Training ----------------------------------------------------------------
+
+model %>% fit(
+  x = list(train_vec$stories, train_vec$questions),
+  y = train_vec$answers,
+  batch_size = 32,
+  epochs = 120,
+  validation_data = list(list(test_vec$stories, test_vec$questions), test_vec$answers)
+)
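+
+# Evaluation ---------------------------------------------------------------
+
+# A minimal sketch (not part of the original example): score the trained
+# model on the held-out bAbI test set, reusing the vectorized data from above.
+model %>% evaluate(
+  x = list(test_vec$stories, test_vec$questions),
+  y = test_vec$answers,
+  batch_size = 32
+)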
+
+
+ + + +
+ + + +
+ + + diff --git a/website/articles/examples/babi_rnn.R b/website/articles/examples/babi_rnn.R new file mode 100644 index 000000000..39ee7807d --- /dev/null +++ b/website/articles/examples/babi_rnn.R @@ -0,0 +1,238 @@ +#' Trains two recurrent neural networks based upon a story and a question. +#' The resulting merged vector is then queried to answer a range of bAbI tasks. +#' +#' The results are comparable to those for an LSTM model provided in Weston et al.: +#' "Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks" +#' http://arxiv.org/abs/1502.05698 +#' +#' Task Number | FB LSTM Baseline | Keras QA +#' --- | --- | --- +#' QA1 - Single Supporting Fact | 50 | 100.0 +#' QA2 - Two Supporting Facts | 20 | 50.0 +#' QA3 - Three Supporting Facts | 20 | 20.5 +#' QA4 - Two Arg. Relations | 61 | 62.9 +#' QA5 - Three Arg. Relations | 70 | 61.9 +#' QA6 - yes/No Questions | 48 | 50.7 +#' QA7 - Counting | 49 | 78.9 +#' QA8 - Lists/Sets | 45 | 77.2 +#' QA9 - Simple Negation | 64 | 64.0 +#' QA10 - Indefinite Knowledge | 44 | 47.7 +#' QA11 - Basic Coreference | 72 | 74.9 +#' QA12 - Conjunction | 74 | 76.4 +#' QA13 - Compound Coreference | 94 | 94.4 +#' QA14 - Time Reasoning | 27 | 34.8 +#' QA15 - Basic Deduction | 21 | 32.4 +#' QA16 - Basic Induction | 23 | 50.6 +#' QA17 - Positional Reasoning | 51 | 49.1 +#' QA18 - Size Reasoning | 52 | 90.8 +#' QA19 - Path Finding | 8 | 9.0 +#' QA20 - Agent's Motivations | 91 | 90.7 +#' +#' For the resources related to the bAbI project, refer to: +#' https://research.facebook.com/researchers/1543934539189348 +#' +#' Notes: +#' +#' - With default word, sentence, and query vector sizes, the GRU model achieves: +#' - 100% test accuracy on QA1 in 20 epochs (2 seconds per epoch on CPU) +#' - 50% test accuracy on QA2 in 20 epochs (16 seconds per epoch on CPU) +#' In comparison, the Facebook paper achieves 50% and 20% for the LSTM baseline. +#' +#' - The task does not traditionally parse the question separately. This likely +#' improves accuracy and is a good example of merging two RNNs. +#' +#' - The word vector embeddings are not shared between the story and question RNNs. +#' +#' - See how the accuracy changes given 10,000 training samples (en-10k) instead +#' of only 1000. 1000 was used in order to be comparable to the original paper. +#' +#' - Experiment with GRU, LSTM, and JZS1-3 as they give subtly different results. +#' +#' - The length and noise (i.e. 'useless' story components) impact the ability for +#' LSTMs / GRUs to provide the correct answer. Given only the supporting facts, +#' these RNNs can achieve 100% accuracy on many tasks. Memory networks and neural +#' networks that use attentional processes can efficiently search through this +#' noise to find the relevant statements, improving performance substantially. +#' This becomes especially obvious on QA2 and QA3, both far longer than QA1. 
+#' + +library(keras) +library(readr) +library(stringr) +library(purrr) +library(tibble) +library(dplyr) + +# Function definition ----------------------------------------------------- + +tokenize_words <- function(x){ + x <- x %>% + str_replace_all('([[:punct:]]+)', ' \\1') %>% + str_split(' ') %>% + unlist() + x[x != ""] +} + +parse_stories <- function(lines, only_supporting = FALSE){ + lines <- lines %>% + str_split(" ", n = 2) %>% + map_df(~tibble(nid = as.integer(.x[[1]]), line = .x[[2]])) + + lines <- lines %>% + mutate( + split = map(line, ~str_split(.x, "\t")[[1]]), + q = map_chr(split, ~.x[1]), + a = map_chr(split, ~.x[2]), + supporting = map(split, ~.x[3] %>% str_split(" ") %>% unlist() %>% as.integer()), + story_id = c(0, cumsum(nid[-nrow(.)] > nid[-1])) + ) %>% + select(-split) + + stories <- lines %>% + filter(is.na(a)) %>% + select(nid_story = nid, story_id, story = q) + + questions <- lines %>% + filter(!is.na(a)) %>% + select(-line) %>% + left_join(stories, by = "story_id") %>% + filter(nid_story < nid) + + if(only_supporting){ + questions <- questions %>% + filter(map2_lgl(nid_story, supporting, ~.x %in% .y)) + } + + questions %>% + group_by(story_id, nid, question = q, answer = a) %>% + summarise(story = paste(story, collapse = " ")) %>% + ungroup() %>% + mutate( + question = map(question, ~tokenize_words(.x)), + story = map(story, ~tokenize_words(.x)), + id = row_number() + ) %>% + select(id, question, answer, story) +} + +vectorize_stories <- function(data, vocab, story_maxlen, query_maxlen){ + + questions <- map(data$question, function(x){ + map_int(x, ~which(.x == vocab)) + }) + + stories <- map(data$story, function(x){ + map_int(x, ~which(.x == vocab)) + }) + + # "" represents padding + answers <- sapply(c("", vocab), function(x){ + as.integer(x == data$answer) + }) + + + list( + questions = pad_sequences(questions, maxlen = query_maxlen), + stories = pad_sequences(stories, maxlen = story_maxlen), + answers = answers + ) +} + +# Parameters -------------------------------------------------------------- + +max_length <- 99999 +embed_hidden_size <- 50 +batch_size <- 32 +epochs <- 40 + +# Data Preparation -------------------------------------------------------- + +path <- get_file( + fname = "babi-tasks-v1-2.tar.gz", + origin = "https://s3.amazonaws.com/text-datasets/babi_tasks_1-20_v1-2.tar.gz" +) +untar(path, exdir = str_replace(path, fixed(".tar.gz"), "/")) +path <- str_replace(path, fixed(".tar.gz"), "/") + +# Default QA1 with 1000 samples +# challenge = '%stasks_1-20_v1-2/en/qa1_single-supporting-fact_%s.txt' +# QA1 with 10,000 samples +# challenge = '%stasks_1-20_v1-2/en-10k/qa1_single-supporting-fact_%s.txt' +# QA2 with 1000 samples +challenge <- "%stasks_1-20_v1-2/en/qa2_two-supporting-facts_%s.txt" +# QA2 with 10,000 samples +# challenge = '%stasks_1-20_v1-2/en-10k/qa2_two-supporting-facts_%s.txt' + +train <- read_lines(sprintf(challenge, path, "train")) %>% + parse_stories() %>% + filter(map_int(story, ~length(.x)) <= max_length) + +test <- read_lines(sprintf(challenge, path, "test")) %>% + parse_stories() %>% + filter(map_int(story, ~length(.x)) <= max_length) + +# extract the vocabulary +all_data <- bind_rows(train, test) +vocab <- c(unlist(all_data$question), all_data$answer, + unlist(all_data$story)) %>% + unique() %>% + sort() + +# Reserve 0 for masking via pad_sequences +vocab_size <- length(vocab) + 1 +story_maxlen <- map_int(all_data$story, ~length(.x)) %>% max() +query_maxlen <- map_int(all_data$question, ~length(.x)) %>% max() + +# vectorized versions 
of training and test sets +train_vec <- vectorize_stories(train, vocab, story_maxlen, query_maxlen) +test_vec <- vectorize_stories(test, vocab, story_maxlen, query_maxlen) + +# Defining the model ------------------------------------------------------ + +sentence <- layer_input(shape = c(story_maxlen), dtype = "int32") +encoded_sentence <- sentence %>% + layer_embedding(input_dim = vocab_size, output_dim = embed_hidden_size) %>% + layer_dropout(rate = 0.3) + +question <- layer_input(shape = c(query_maxlen), dtype = "int32") +encoded_question <- question %>% + layer_embedding(input_dim = vocab_size, output_dim = embed_hidden_size) %>% + layer_dropout(rate = 0.3) %>% + layer_lstm(units = embed_hidden_size) %>% + layer_repeat_vector(n = story_maxlen) + +merged <- list(encoded_sentence, encoded_question) %>% + layer_add() %>% + layer_lstm(units = embed_hidden_size) %>% + layer_dropout(rate = 0.3) + +preds <- merged %>% + layer_dense(units = vocab_size, activation = "softmax") + +model <- keras_model(inputs = list(sentence, question), outputs = preds) +model %>% compile( + optimizer = "adam", + loss = "categorical_crossentropy", + metrics = "accuracy" +) + +model + +# Training ---------------------------------------------------------------- + +model %>% fit( + x = list(train_vec$stories, train_vec$questions), + y = train_vec$answers, + batch_size = batch_size, + epochs = epochs, + validation_split=0.05 +) + +evaluation <- model %>% evaluate( + x = list(test_vec$stories, test_vec$questions), + y = test_vec$answers, + batch_size = batch_size +) + +evaluation + diff --git a/website/articles/examples/babi_rnn.html b/website/articles/examples/babi_rnn.html new file mode 100644 index 000000000..6f2fd4d9d --- /dev/null +++ b/website/articles/examples/babi_rnn.html @@ -0,0 +1,438 @@ + + + + + + + +babi_rnn • keras + + + + + + + +
+
+ + + +
+
+ + + + +
+ +

Trains two recurrent neural networks based upon a story and a question. The resulting merged vector is then queried to answer a range of bAbI tasks.

+

The results are comparable to those for an LSTM model provided in Weston et al.: “Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks” http://arxiv.org/abs/1502.05698

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Task Number | FB LSTM Baseline | Keras QA
--- | --- | ---
QA1 - Single Supporting Fact | 50 | 100.0
QA2 - Two Supporting Facts | 20 | 50.0
QA3 - Three Supporting Facts | 20 | 20.5
QA4 - Two Arg. Relations | 61 | 62.9
QA5 - Three Arg. Relations | 70 | 61.9
QA6 - yes/No Questions | 48 | 50.7
QA7 - Counting | 49 | 78.9
QA8 - Lists/Sets | 45 | 77.2
QA9 - Simple Negation | 64 | 64.0
QA10 - Indefinite Knowledge | 44 | 47.7
QA11 - Basic Coreference | 72 | 74.9
QA12 - Conjunction | 74 | 76.4
QA13 - Compound Coreference | 94 | 94.4
QA14 - Time Reasoning | 27 | 34.8
QA15 - Basic Deduction | 21 | 32.4
QA16 - Basic Induction | 23 | 50.6
QA17 - Positional Reasoning | 51 | 49.1
QA18 - Size Reasoning | 52 | 90.8
QA19 - Path Finding | 8 | 9.0
QA20 - Agent’s Motivations | 91 | 90.7
+

For the resources related to the bAbI project, refer to: https://research.facebook.com/researchers/1543934539189348

+

Notes:

+
  • With default word, sentence, and query vector sizes, the GRU model achieves:
    • 100% test accuracy on QA1 in 20 epochs (2 seconds per epoch on CPU)
    • 50% test accuracy on QA2 in 20 epochs (16 seconds per epoch on CPU)
    In comparison, the Facebook paper achieves 50% and 20% for the LSTM baseline.
  • The task does not traditionally parse the question separately. This likely improves accuracy and is a good example of merging two RNNs.
  • The word vector embeddings are not shared between the story and question RNNs.
  • See how the accuracy changes given 10,000 training samples (en-10k) instead of only 1000. 1000 was used in order to be comparable to the original paper.
  • Experiment with GRU, LSTM, and JZS1-3 as they give subtly different results.
  • The length and noise (i.e. ‘useless’ story components) impact the ability for LSTMs / GRUs to provide the correct answer. Given only the supporting facts, these RNNs can achieve 100% accuracy on many tasks. Memory networks and neural networks that use attentional processes can efficiently search through this noise to find the relevant statements, improving performance substantially. This becomes especially obvious on QA2 and QA3, both far longer than QA1.
library(keras)
+library(readr)
+library(stringr)
+library(purrr)
+library(tibble)
+library(dplyr)
+
+# Function definition -----------------------------------------------------
+
+tokenize_words <- function(x){
+  x <- x %>% 
+    str_replace_all('([[:punct:]]+)', ' \\1') %>% 
+    str_split(' ') %>%
+    unlist()
+  x[x != ""]
+}
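+
+# Illustration (not in the original script): punctuation is split off into
+# its own token, e.g.
+#   tokenize_words("John went to the hallway.")
+#   #> "John" "went" "to" "the" "hallway" "."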
+
+parse_stories <- function(lines, only_supporting = FALSE){
+  lines <- lines %>% 
+    str_split(" ", n = 2) %>%
+    map_df(~tibble(nid = as.integer(.x[[1]]), line = .x[[2]]))
+  
+  lines <- lines %>%
+    mutate(
+      split = map(line, ~str_split(.x, "\t")[[1]]),
+      q = map_chr(split, ~.x[1]),
+      a = map_chr(split, ~.x[2]),
+      supporting = map(split, ~.x[3] %>% str_split(" ") %>% unlist() %>% as.integer()),
+      story_id = c(0, cumsum(nid[-nrow(.)] > nid[-1]))
+    ) %>%
+    select(-split)
+  
+  stories <- lines %>%
+    filter(is.na(a)) %>%
+    select(nid_story = nid, story_id, story = q)
+  
+  questions <- lines %>%
+    filter(!is.na(a)) %>%
+    select(-line) %>%
+    left_join(stories, by = "story_id") %>%
+    filter(nid_story < nid)
+
+  if(only_supporting){
+    questions <- questions %>%
+      filter(map2_lgl(nid_story, supporting, ~.x %in% .y))
+  }
+    
+  questions %>%
+    group_by(story_id, nid, question = q, answer = a) %>%
+    summarise(story = paste(story, collapse = " ")) %>%
+    ungroup() %>% 
+    mutate(
+      question = map(question, ~tokenize_words(.x)),
+      story = map(story, ~tokenize_words(.x)),
+      id = row_number()
+    ) %>%
+    select(id, question, answer, story)
+}
+
+vectorize_stories <- function(data, vocab, story_maxlen, query_maxlen){
+  
+  questions <- map(data$question, function(x){
+    map_int(x, ~which(.x == vocab))
+  })
+  
+  stories <- map(data$story, function(x){
+    map_int(x, ~which(.x == vocab))
+  })
+  
+  # "" represents padding
+  answers <- sapply(c("", vocab), function(x){
+    as.integer(x == data$answer)
+  })
+  
+
+  list(
+    questions = pad_sequences(questions, maxlen = query_maxlen),
+    stories   = pad_sequences(stories, maxlen = story_maxlen),
+    answers   = answers
+  )
+}
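+
+# Note (not in the original script): pad_sequences() pads at the front with 0
+# by default, which is why index 0 is reserved for masking. For example,
+#   pad_sequences(list(c(1, 2), 3), maxlen = 3)
+# returns a matrix with rows c(0, 1, 2) and c(0, 0, 3).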
+
+# Parameters --------------------------------------------------------------
+
+max_length <- 99999
+embed_hidden_size <- 50
+batch_size <- 32
+epochs <- 40
+
+# Data Preparation --------------------------------------------------------
+
+path <- get_file(
+  fname = "babi-tasks-v1-2.tar.gz",
+  origin = "https://s3.amazonaws.com/text-datasets/babi_tasks_1-20_v1-2.tar.gz"
+)
+untar(path, exdir = str_replace(path, fixed(".tar.gz"), "/"))
+path <- str_replace(path, fixed(".tar.gz"), "/")
+
+# Default QA1 with 1000 samples
+# challenge = '%stasks_1-20_v1-2/en/qa1_single-supporting-fact_%s.txt'
+# QA1 with 10,000 samples
+# challenge = '%stasks_1-20_v1-2/en-10k/qa1_single-supporting-fact_%s.txt'
+# QA2 with 1000 samples
+challenge <- "%stasks_1-20_v1-2/en/qa2_two-supporting-facts_%s.txt"
+# QA2 with 10,000 samples
+# challenge = '%stasks_1-20_v1-2/en-10k/qa2_two-supporting-facts_%s.txt'
+
+train <- read_lines(sprintf(challenge, path, "train")) %>%
+  parse_stories() %>%
+  filter(map_int(story, ~length(.x)) <= max_length)
+
+test <- read_lines(sprintf(challenge, path, "test")) %>%
+  parse_stories() %>%
+  filter(map_int(story, ~length(.x)) <= max_length)
+
+# extract the vocabulary
+all_data <- bind_rows(train, test)
+vocab <- c(unlist(all_data$question), all_data$answer, 
+           unlist(all_data$story)) %>%
+  unique() %>%
+  sort()
+
+# Reserve 0 for masking via pad_sequences
+vocab_size <- length(vocab) + 1
+story_maxlen <- map_int(all_data$story, ~length(.x)) %>% max()
+query_maxlen <- map_int(all_data$question, ~length(.x)) %>% max()
+
+# vectorized versions of training and test sets
+train_vec <- vectorize_stories(train, vocab, story_maxlen, query_maxlen)
+test_vec <- vectorize_stories(test, vocab, story_maxlen, query_maxlen)
+
+# Defining the model ------------------------------------------------------
+
+sentence <- layer_input(shape = c(story_maxlen), dtype = "int32")
+encoded_sentence <- sentence %>% 
+  layer_embedding(input_dim = vocab_size, output_dim = embed_hidden_size) %>%
+  layer_dropout(rate = 0.3)
+
+question <- layer_input(shape = c(query_maxlen), dtype = "int32")
+encoded_question <- question %>%
+  layer_embedding(input_dim = vocab_size, output_dim = embed_hidden_size) %>%
+  layer_dropout(rate = 0.3) %>%
+  layer_lstm(units = embed_hidden_size) %>%
+  layer_repeat_vector(n = story_maxlen)
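+# output: (samples, story_maxlen, embed_hidden_size)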
+
+merged <- list(encoded_sentence, encoded_question) %>%
+  layer_add() %>%
+  layer_lstm(units = embed_hidden_size) %>%
+  layer_dropout(rate = 0.3)
+
+preds <- merged %>%
+  layer_dense(units = vocab_size, activation = "softmax")
+
+model <- keras_model(inputs = list(sentence, question), outputs = preds)
+model %>% compile(
+  optimizer = "adam",
+  loss = "categorical_crossentropy",
+  metrics = "accuracy"
+)
+
+model
+
+# Training ----------------------------------------------------------------
+
+model %>% fit(
+  x = list(train_vec$stories, train_vec$questions),
+  y = train_vec$answers,
+  batch_size = batch_size,
+  epochs = epochs,
+  validation_split = 0.05
+)
+
+evaluation <- model %>% evaluate(
+  x = list(test_vec$stories, test_vec$questions),
+  y = test_vec$answers,
+  batch_size = batch_size
+)
+
+evaluation
+
+
+ + + +
+ + + +
+ + + diff --git a/website/articles/examples/cifar10_cnn.R b/website/articles/examples/cifar10_cnn.R new file mode 100644 index 000000000..d2210c11c --- /dev/null +++ b/website/articles/examples/cifar10_cnn.R @@ -0,0 +1,95 @@ +#' Train a simple deep CNN on the CIFAR10 small images dataset. +#' +#' It gets down to 0.65 test logloss in 25 epochs, and down to 0.55 after 50 epochs. +#' (it's still underfitting at that point, though). + +library(keras) + +# Parameters -------------------------------------------------------------- + +batch_size <- 32 +epochs <- 200 +data_augmentation <- TRUE + + +# Data Preparation -------------------------------------------------------- + +# see ?dataset_cifar10 for more info +cifar10 <- dataset_cifar10() + +x_train <- cifar10$train$x/255 +x_test <- cifar10$test$x/255 +y_train <- to_categorical(cifar10$train$y, num_classes = 10) +y_test <- to_categorical(cifar10$test$y, num_classes = 10) + +# Defining the model ------------------------------------------------------ + +model <- keras_model_sequential() + +model %>% + layer_conv_2d( + filter = 32, kernel_size = c(3,3), padding = "same", + input_shape = c(32, 32, 3) + ) %>% + layer_activation("relu") %>% + layer_conv_2d(filter = 32, kernel_size = c(3,3)) %>% + layer_activation("relu") %>% + layer_max_pooling_2d(pool_size = c(2,2)) %>% + layer_dropout(0.25) %>% + + layer_conv_2d(filter = 32, kernel_size = c(3,3), padding = "same") %>% + layer_activation("relu") %>% + layer_conv_2d(filter = 32, kernel_size = c(3,3)) %>% + layer_activation("relu") %>% + layer_max_pooling_2d(pool_size = c(2,2)) %>% + layer_dropout(0.25) %>% + + layer_flatten() %>% + layer_dense(512) %>% + layer_activation("relu") %>% + layer_dropout(0.5) %>% + layer_dense(10) %>% + layer_activation("softmax") + +opt <- optimizer_rmsprop(lr = 0.0001, decay = 1e-6) + +model %>% compile( + loss = "categorical_crossentropy", + optimizer = opt, + metrics = "accuracy" +) + + +# Training ---------------------------------------------------------------- + +if(!data_augmentation){ + + model %>% fit( + x_train, y_train, + batch_size = batch_size, + epochs = epochs, + validation_data = list(x_test, y_test), + shuffle = TRUE + ) + +} else { + + datagen <- image_data_generator( + featurewise_center = TRUE, + featurewise_std_normalization = TRUE, + rotation_range = 20, + width_shift_range = 0.2, + height_shift_range = 0.2, + horizontal_flip = TRUE + ) + + datagen %>% fit_image_data_generator(x_train) + + model %>% fit_generator( + flow_images_from_data(x_train, y_train, datagen, batch_size = batch_size), + steps_per_epoch = as.integer(50000/batch_size), + epochs = epochs, + validation_data = list(x_test, y_test) + ) + +} diff --git a/website/articles/examples/cifar10_cnn.html b/website/articles/examples/cifar10_cnn.html new file mode 100644 index 000000000..7d86f2f8b --- /dev/null +++ b/website/articles/examples/cifar10_cnn.html @@ -0,0 +1,228 @@ + + + + + + + +cifar10_cnn • keras + + + + + + + +
+
+ + + +
+
+ + + + +
+ +

Train a simple deep CNN on the CIFAR10 small images dataset.

+

It gets down to 0.65 test logloss in 25 epochs, and down to 0.55 after 50 epochs (though it’s still underfitting at that point).

+
library(keras)
+
+# Parameters --------------------------------------------------------------
+
+batch_size <- 32
+epochs <- 200
+data_augmentation <- TRUE
+
+
+# Data Preparation --------------------------------------------------------
+
+# see ?dataset_cifar10 for more info
+cifar10 <- dataset_cifar10()
+
+x_train <- cifar10$train$x/255
+x_test <- cifar10$test$x/255
+y_train <- to_categorical(cifar10$train$y, num_classes = 10)
+y_test <- to_categorical(cifar10$test$y, num_classes = 10)
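+# to_categorical() one-hot encodes the integer labels, e.g. class 3 becomes
+# c(0, 0, 0, 1, 0, 0, 0, 0, 0, 0)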
+
+# Defining the model ------------------------------------------------------
+
+model <- keras_model_sequential()
+
+model %>%
+  layer_conv_2d(
+    filters = 32, kernel_size = c(3,3), padding = "same", 
+    input_shape = c(32, 32, 3)
+  ) %>%
+  layer_activation("relu") %>%
+  layer_conv_2d(filters = 32, kernel_size = c(3,3)) %>%
+  layer_activation("relu") %>%
+  layer_max_pooling_2d(pool_size = c(2,2)) %>%
+  layer_dropout(0.25) %>%
+  
+  layer_conv_2d(filters = 32, kernel_size = c(3,3), padding = "same") %>%
+  layer_activation("relu") %>%
+  layer_conv_2d(filters = 32, kernel_size = c(3,3)) %>%
+  layer_activation("relu") %>%
+  layer_max_pooling_2d(pool_size = c(2,2)) %>%
+  layer_dropout(0.25) %>%
+  
+  layer_flatten() %>%
+  layer_dense(512) %>%
+  layer_activation("relu") %>%
+  layer_dropout(0.5) %>%
+  layer_dense(10) %>%
+  layer_activation("softmax")
+
+opt <- optimizer_rmsprop(lr = 0.0001, decay = 1e-6)
+
+model %>% compile(
+  loss = "categorical_crossentropy",
+  optimizer = opt,
+  metrics = "accuracy"
+)
+
+
+# Training ----------------------------------------------------------------
+
+if(!data_augmentation){
+  
+  model %>% fit(
+    x_train, y_train,
+    batch_size = batch_size,
+    epochs = epochs,
+    validation_data = list(x_test, y_test),
+    shuffle = TRUE
+  )
+  
+} else {
+  
+  datagen <- image_data_generator(
+    featurewise_center = TRUE,
+    featurewise_std_normalization = TRUE,
+    rotation_range = 20,
+    width_shift_range = 0.2,
+    height_shift_range = 0.2,
+    horizontal_flip = TRUE
+  )
+  
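+  # featurewise_center and featurewise_std_normalization require statistics
+  # computed over the training data, so the generator must be fit first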
+  datagen %>% fit_image_data_generator(x_train)
+  
+  model %>% fit_generator(
+    flow_images_from_data(x_train, y_train, datagen, batch_size = batch_size),
+    steps_per_epoch = as.integer(50000/batch_size), 
+    epochs = epochs, 
+    validation_data = list(x_test, y_test)
+  )
+  
+}
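+
+# Evaluation (a minimal sketch, not part of the original example). Note that
+# when data_augmentation is TRUE the generator's featurewise normalization is
+# not applied to x_test here, so treat this as a rough check only.
+model %>% evaluate(x_test, y_test, batch_size = batch_size)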
+
+
+ + + +
+ + + +
+ + + diff --git a/website/articles/examples/conv_filter_visualization.R b/website/articles/examples/conv_filter_visualization.R new file mode 100644 index 000000000..0ab13681d --- /dev/null +++ b/website/articles/examples/conv_filter_visualization.R @@ -0,0 +1 @@ +library(keras) diff --git a/website/articles/examples/conv_filter_visualization.html b/website/articles/examples/conv_filter_visualization.html new file mode 100644 index 000000000..37489a8d5 --- /dev/null +++ b/website/articles/examples/conv_filter_visualization.html @@ -0,0 +1,137 @@ + + + + + + + +conv_filter_visualization • keras + + + + + + + +
+
+ + + +
+
+ + + + + +
+ + + +
+ + + +
+ + + diff --git a/website/articles/examples/conv_filter_visualization.py b/website/articles/examples/conv_filter_visualization.py new file mode 100755 index 000000000..b85b86797 --- /dev/null +++ b/website/articles/examples/conv_filter_visualization.py @@ -0,0 +1,135 @@ +'''Visualization of the filters of VGG16, via gradient ascent in input space. + +This script can run on CPU in a few minutes (with the TensorFlow backend). + +Results example: http://i.imgur.com/4nj4KjN.jpg +''' +from __future__ import print_function + +from scipy.misc import imsave +import numpy as np +import time +from keras.applications import vgg16 +from keras import backend as K + +# dimensions of the generated pictures for each filter. +img_width = 128 +img_height = 128 + +# the name of the layer we want to visualize +# (see model definition at keras/applications/vgg16.py) +layer_name = 'block5_conv1' + +# util function to convert a tensor into a valid image + + +def deprocess_image(x): + # normalize tensor: center on 0., ensure std is 0.1 + x -= x.mean() + x /= (x.std() + 1e-5) + x *= 0.1 + + # clip to [0, 1] + x += 0.5 + x = np.clip(x, 0, 1) + + # convert to RGB array + x *= 255 + if K.image_data_format() == 'channels_first': + x = x.transpose((1, 2, 0)) + x = np.clip(x, 0, 255).astype('uint8') + return x + +# build the VGG16 network with ImageNet weights +model = vgg16.VGG16(weights='imagenet', include_top=False) +print('Model loaded.') + +model.summary() + +# this is the placeholder for the input images +input_img = model.input + +# get the symbolic outputs of each "key" layer (we gave them unique names). +layer_dict = dict([(layer.name, layer) for layer in model.layers[1:]]) + + +def normalize(x): + # utility function to normalize a tensor by its L2 norm + return x / (K.sqrt(K.mean(K.square(x))) + 1e-5) + + +kept_filters = [] +for filter_index in range(0, 200): + # we only scan through the first 200 filters, + # but there are actually 512 of them + print('Processing filter %d' % filter_index) + start_time = time.time() + + # we build a loss function that maximizes the activation + # of the nth filter of the layer considered + layer_output = layer_dict[layer_name].output + if K.image_data_format() == 'channels_first': + loss = K.mean(layer_output[:, filter_index, :, :]) + else: + loss = K.mean(layer_output[:, :, :, filter_index]) + + # we compute the gradient of the input picture wrt this loss + grads = K.gradients(loss, input_img)[0] + + # normalization trick: we normalize the gradient + grads = normalize(grads) + + # this function returns the loss and grads given the input picture + iterate = K.function([input_img], [loss, grads]) + + # step size for gradient ascent + step = 1. 
+ + # we start from a gray image with some random noise + if K.image_data_format() == 'channels_first': + input_img_data = np.random.random((1, 3, img_width, img_height)) + else: + input_img_data = np.random.random((1, img_width, img_height, 3)) + input_img_data = (input_img_data - 0.5) * 20 + 128 + + # we run gradient ascent for 20 steps + for i in range(20): + loss_value, grads_value = iterate([input_img_data]) + input_img_data += grads_value * step + + print('Current loss value:', loss_value) + if loss_value <= 0.: + # some filters get stuck to 0, we can skip them + break + + # decode the resulting input image + if loss_value > 0: + img = deprocess_image(input_img_data[0]) + kept_filters.append((img, loss_value)) + end_time = time.time() + print('Filter %d processed in %ds' % (filter_index, end_time - start_time)) + +# we will stich the best 64 filters on a 8 x 8 grid. +n = 8 + +# the filters that have the highest loss are assumed to be better-looking. +# we will only keep the top 64 filters. +kept_filters.sort(key=lambda x: x[1], reverse=True) +kept_filters = kept_filters[:n * n] + +# build a black picture with enough space for +# our 8 x 8 filters of size 128 x 128, with a 5px margin in between +margin = 5 +width = n * img_width + (n - 1) * margin +height = n * img_height + (n - 1) * margin +stitched_filters = np.zeros((width, height, 3)) + +# fill the picture with our saved filters +for i in range(n): + for j in range(n): + img, loss = kept_filters[i * n + j] + stitched_filters[(img_width + margin) * i: (img_width + margin) * i + img_width, + (img_height + margin) * j: (img_height + margin) * j + img_height, :] = img + +# save the result to disk +imsave('stitched_filters_%dx%d.png' % (n, n), stitched_filters) diff --git a/website/articles/examples/conv_lstm.R b/website/articles/examples/conv_lstm.R new file mode 100644 index 000000000..dd15dc9a2 --- /dev/null +++ b/website/articles/examples/conv_lstm.R @@ -0,0 +1,187 @@ +# This script demonstrates the use of a convolutional LSTM network. +# This network is used to predict the next frame of an artificially +# generated movie which contains moving squares. +library(keras) +library(abind) +library(raster) + +# Function Definition ----------------------------------------------------- + +generate_movies <- function(n_samples = 1200, n_frames = 15){ + + rows <- 80 + cols <- 80 + + noisy_movies <- array(0, dim = c(n_samples, n_frames, rows, cols)) + shifted_movies <- array(0, dim = c(n_samples, n_frames, rows, cols)) + + n <- sample(3:8, 1) + + for(s in 1:n_samples){ + for(i in 1:n){ + # Initial position + xstart <- sample(20:60, 1) + ystart <- sample(20:60, 1) + + # Direction of motion + directionx <- sample(-1:1, 1) + directiony <- sample(-1:1, 1) + + # Size of the square + w <- sample(2:3, 1) + + x_shift <- xstart + directionx*(0:(n_frames)) + y_shift <- ystart + directiony*(0:(n_frames)) + + for(t in 1:n_frames){ + square_x <- (x_shift[t] - w):(x_shift[t] + w) + square_y <- (y_shift[t] - w):(y_shift[t] + w) + + noisy_movies[s, t, square_x, square_y] <- + noisy_movies[s, t, square_x, square_y] + 1 + + # Make it more robust by adding noise. + # The idea is that if during inference, + # the value of the pixel is not exactly one, + # we need to train the network to be robust and still + # consider it as a pixel belonging to a square. 
+ if(runif(1) > 0.5){ + noise_f <- sample(c(-1, 1), 1) + + square_x_n <- (x_shift[t] - w - 1):(x_shift[t] + w + 1) + square_y_n <- (y_shift[t] - w - 1):(y_shift[t] + w + 1) + + noisy_movies[s, t, square_x_n, square_y_n] <- + noisy_movies[s, t, square_x_n, square_y_n] + noise_f*0.1 + + } + + # Shift the ground truth by 1 + square_x_s <- (x_shift[t+1] - w):(x_shift[t+1] + w) + square_y_s <- (y_shift[t+1] - w):(y_shift[t+1] + w) + + shifted_movies[s, t, square_x_s, square_y_s] <- + shifted_movies[s, t, square_x_s, square_y_s] + 1 + } + } + } + + # Cut to a 40x40 window + noisy_movies <- noisy_movies[,,21:60, 21:60] + shifted_movies = shifted_movies[,,21:60, 21:60] + + noisy_movies[noisy_movies > 1] <- 1 + shifted_movies[shifted_movies > 1] <- 1 + + # add channel dimension + dim(noisy_movies) <- c(dim(noisy_movies), 1) + dim(shifted_movies) <- c(dim(shifted_movies), 1) + + list( + noisy_movies = noisy_movies, + shifted_movies = shifted_movies + ) +} + + +# Data Preparation -------------------------------------------------------- + +# Artificial data generation: +# Generate movies with 3 to 7 moving squares inside. +# The squares are of shape 1x1 or 2x2 pixels, +# which move linearly over time. +# For convenience we first create movies with bigger width and height (80x80) +# and at the end we select a 40x40 window. +movies <- generate_movies(n_samples = 1000, n_frames = 15) +more_movies <- generate_movies(n_samples = 200, n_frames = 15) + + +# Model definition -------------------------------------------------------- + +model <- keras_model_sequential() + +model %>% + layer_conv_lstm_2d( + input_shape = list(NULL,40,40,1), + filters = 40, kernel_size = c(3,3), + padding = "same", + return_sequences = TRUE + ) %>% + layer_batch_normalization() %>% + + layer_conv_lstm_2d( + filters = 40, kernel_size = c(3,3), + padding = "same", return_sequences = TRUE + ) %>% + layer_batch_normalization() %>% + + layer_conv_lstm_2d( + filters = 40, kernel_size = c(3,3), + padding = "same", return_sequences = TRUE + ) %>% + layer_batch_normalization() %>% + + layer_conv_lstm_2d( + filters = 40, kernel_size = c(3,3), + padding = "same", return_sequences = TRUE + ) %>% + layer_batch_normalization() %>% + + layer_conv_3d( + filters = 1, kernel_size = c(3,3,3), + activation = "sigmoid", + padding = "same", data_format ="channels_last" + ) + +model %>% compile( + loss = "binary_crossentropy", + optimizer = "adadelta" +) + +model + + +# Training ---------------------------------------------------------------- + +model %>% fit( + movies$noisy_movies, + movies$shifted_movies, + batch_size = 10, + epochs = 30, + validation_split = 0.05 +) + +# Visualization ---------------------------------------------------------------- +# Testing the network on one movie +# feed it with the first 7 positions and then +# predict the new positions + +which <- 100 #Example to visualize on + +track <- more_movies$noisy_movies[which,1:8,,,1] +track <- array(track, c(1,8,40,40,1)) +for (k in 1:15){ +if (k<8){ + png(paste0(k,'_animate.png')) + par(mfrow=c(1,2),bg = 'white') + (more_movies$noisy_movies[which,k,,,1]) %>% raster() %>% plot() %>% title (main=paste0('Ground_',k)) + (more_movies$noisy_movies[which,k,,,1]) %>% raster() %>% plot() %>% title (main=paste0('Ground_',k)) + dev.off() +} else { + # And then compare the predictions + # to the ground truth + png(paste0(k,'_animate.png')) + par(mfrow=c(1,2),bg = 'white') + (more_movies$noisy_movies[which,k,,,1]) %>% raster() %>% plot() %>% title (main=paste0('Ground_',k)) + + new_pos <- model 
%>% predict(track) #Make Prediction + new_pos_loc <- new_pos[1,k,1:40,1:40,1] #Slice the last row + new_pos_loc %>% raster() %>% plot() %>% title (main=paste0('Pred_',k)) + + new_pos <- array(new_pos_loc, c(1,1, 40,40,1)) #Reshape it + track <- abind(track,new_pos,along = 2) #Bind it to the earlier data + dev.off() +} +} +# you can also create a gif by running +system("convert -delay 40 *.png animation.gif") diff --git a/website/articles/examples/conv_lstm.html b/website/articles/examples/conv_lstm.html new file mode 100644 index 000000000..1f320b353 --- /dev/null +++ b/website/articles/examples/conv_lstm.html @@ -0,0 +1,323 @@ + + + + + + + +conv_lstm • keras + + + + + + + +
+
+ + + +
+
+ + + + +
+ +
# This script demonstrates the use of a convolutional LSTM network.
+# This network is used to predict the next frame of an artificially
+# generated movie which contains moving squares.
+library(keras)
+library(abind)
+library(raster)
+
+# Function Definition -----------------------------------------------------
+
+generate_movies <- function(n_samples = 1200, n_frames = 15){
+  
+  rows <- 80
+  cols <- 80
+  
+  noisy_movies <- array(0, dim = c(n_samples, n_frames, rows, cols))
+  shifted_movies <- array(0, dim = c(n_samples, n_frames, rows, cols))
+  
+  for(s in 1:n_samples){
+    # random number of moving squares (3 to 7) in this movie
+    n <- sample(3:7, 1)
+    for(i in 1:n){
+      # Initial position
+      xstart <- sample(20:60, 1)
+      ystart <- sample(20:60, 1)
+      
+      # Direction of motion
+      directionx <- sample(-1:1, 1)
+      directiony <- sample(-1:1, 1)
+      
+      # Size of the square
+      w <- sample(2:3, 1)
+      
+      x_shift <- xstart + directionx*(0:(n_frames))
+      y_shift <- ystart + directiony*(0:(n_frames))
+      
+      for(t in 1:n_frames){
+        square_x <- (x_shift[t] - w):(x_shift[t] + w)
+        square_y <- (y_shift[t] - w):(y_shift[t] + w)
+        
+        noisy_movies[s, t, square_x, square_y] <- 
+          noisy_movies[s, t, square_x, square_y] + 1
+        
+        # Make it more robust by adding noise.
+        # The idea is that if during inference,
+        # the value of the pixel is not exactly one,
+        # we need to train the network to be robust and still
+        # consider it as a pixel belonging to a square.
+        if(runif(1) > 0.5){
+          noise_f <- sample(c(-1, 1), 1)
+          
+          square_x_n <- (x_shift[t] - w - 1):(x_shift[t] + w + 1)
+          square_y_n <- (y_shift[t] - w - 1):(y_shift[t] + w + 1)
+          
+          noisy_movies[s, t, square_x_n, square_y_n] <- 
+            noisy_movies[s, t, square_x_n, square_y_n] + noise_f*0.1
+          
+        }
+        
+        # Shift the ground truth by 1
+        square_x_s <- (x_shift[t+1] - w):(x_shift[t+1] + w)
+        square_y_s <- (y_shift[t+1] - w):(y_shift[t+1] + w)
+        
+        shifted_movies[s, t, square_x_s, square_y_s] <- 
+          shifted_movies[s, t, square_x_s, square_y_s] + 1
+      }
+    }  
+  }
+  
+  # Cut to a 40x40 window
+  noisy_movies <- noisy_movies[,,21:60, 21:60]
+  shifted_movies = shifted_movies[,,21:60, 21:60]
+  
+  noisy_movies[noisy_movies > 1] <- 1
+  shifted_movies[shifted_movies > 1] <- 1
+
+  # add channel dimension
+  dim(noisy_movies) <- c(dim(noisy_movies), 1)
+  dim(shifted_movies) <- c(dim(shifted_movies), 1)
+  
+  list(
+    noisy_movies = noisy_movies,
+    shifted_movies = shifted_movies
+  )
+}
+
+
+# Data Preparation --------------------------------------------------------
+
+# Artificial data generation:
+# Generate movies with 3 to 7 moving squares inside.
+# The squares are 5x5 or 7x7 pixels in size (half-width of 2 or 3 pixels),
+# which move linearly over time.
+# For convenience we first create movies with bigger width and height (80x80)
+# and at the end we select a 40x40 window.
+movies <- generate_movies(n_samples = 1000, n_frames = 15)
+more_movies <- generate_movies(n_samples = 200, n_frames = 15)
+
+
+# Model definition --------------------------------------------------------
+
+model <- keras_model_sequential()
+
+model %>%
+  layer_conv_lstm_2d(
+    input_shape = list(NULL,40,40,1), 
+    filters = 40, kernel_size = c(3,3),
+    padding = "same", 
+    return_sequences = TRUE
+  ) %>%
+  layer_batch_normalization() %>%
+  
+  layer_conv_lstm_2d(
+    filters = 40, kernel_size = c(3,3),
+    padding = "same", return_sequences = TRUE
+  ) %>%
+  layer_batch_normalization() %>%
+  
+  layer_conv_lstm_2d(
+    filters = 40, kernel_size = c(3,3),
+    padding = "same", return_sequences = TRUE
+  ) %>%
+  layer_batch_normalization() %>%
+  
+  layer_conv_lstm_2d(
+    filters = 40, kernel_size = c(3,3),
+    padding = "same", return_sequences = TRUE
+  ) %>%
+  layer_batch_normalization() %>%
+  
+  layer_conv_3d(
+    filters = 1, kernel_size = c(3,3,3),
+    activation = "sigmoid", 
+    padding = "same", data_format ="channels_last"
+  )
+
+model %>% compile(
+  loss = "binary_crossentropy", 
+  optimizer = "adadelta"
+)
+
+model
+
+
+# Training ----------------------------------------------------------------
+
+model %>% fit(
+  movies$noisy_movies,
+  movies$shifted_movies,
+  batch_size = 10,
+  epochs = 30, 
+  validation_split = 0.05
+)
+
+# Visualization  ----------------------------------------------------------------
+# Testing the network on one movie
+# feed it with the first 7 positions and then
+# predict the new positions
+
+which <- 100  # index of the example movie to visualize
+
+track <- more_movies$noisy_movies[which,1:8,,,1]
+track <- array(track, c(1,8,40,40,1))
+for (k in 1:15){
+if (k<8){
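+  # the first 7 frames are all observed data, so show the same ground-truth
+  # frame in both panels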
+  png(paste0(k,'_animate.png'))
+  par(mfrow=c(1,2),bg = 'white')
+  (more_movies$noisy_movies[which,k,,,1])  %>% raster() %>% plot() %>% title (main=paste0('Ground_',k)) 
+  (more_movies$noisy_movies[which,k,,,1])  %>% raster() %>% plot() %>% title (main=paste0('Ground_',k)) 
+  dev.off()
+}  else {
+  # And then compare the predictions
+  # to the ground truth
+  png(paste0(k,'_animate.png'))
+  par(mfrow=c(1,2),bg = 'white')
+  (more_movies$noisy_movies[which,k,,,1])  %>% raster() %>% plot() %>% title (main=paste0('Ground_',k))
+   
+  new_pos <- model %>% predict(track)  # predict the full sequence from the track so far
+  new_pos_loc <- new_pos[1,k,1:40,1:40,1]  # take the latest predicted frame
+  new_pos_loc  %>% raster() %>% plot() %>% title (main=paste0('Pred_',k))
+  
+  new_pos <- array(new_pos_loc, c(1,1, 40,40,1)) #Reshape it
+  track <- abind(track,new_pos,along = 2)  #Bind it to the earlier data
+  dev.off()
+}
+}  
+# you can also create a gif by running (requires ImageMagick's convert on the PATH)
+system("convert -delay 40 *.png animation.gif")
+
+
+ + + +
+ + + +
+ + + diff --git a/website/articles/examples/deep_dream.R b/website/articles/examples/deep_dream.R new file mode 100644 index 000000000..c89108220 --- /dev/null +++ b/website/articles/examples/deep_dream.R @@ -0,0 +1,205 @@ +#' Deep Dreaming in Keras. +#' +#' It is preferable to run this script on GPU, for speed. +#' +#' Example results: http://i.imgur.com/FX6ROg9.jpg +#' + +library(keras) +library(tensorflow) +library(purrr) +library(R6) +K <- backend() + +# Function Definitions ---------------------------------------------------- + +preprocess_image <- function(image_path, height, width){ + image_load(image_path, target_size = c(height, width)) %>% + image_to_array() %>% + array(dim = c(1, dim(.))) %>% + imagenet_preprocess_input() +} + +deprocess_image <- function(x){ + x <- x[1,,,] + # Remove zero-center by mean pixel + x[,,1] <- x[,,1] + 103.939 + x[,,2] <- x[,,2] + 116.779 + x[,,3] <- x[,,3] + 123.68 + # 'BGR'->'RGB' + x <- x[,,c(3,2,1)] + # clip to interval 0, 255 + x[x > 255] <- 255 + x[x < 0] <- 0 + x[] <- as.integer(x)/255 + x +} + +# calculates the total variation loss +# https://en.wikipedia.org/wiki/Total_variation_denoising +total_variation_loss <- function(x, h, w){ + + y_ij <- x[,0:(h - 2L), 0:(w - 2L),] + y_i1j <- x[,1:(h - 1L), 0:(w - 2L),] + y_ij1 <- x[,0:(h - 2L), 1:(w - 1L),] + + a <- K$square(y_ij - y_i1j) + b <- K$square(y_ij - y_ij1) + K$sum(K$pow(a + b, 1.25)) +} + + +# Parameters -------------------------------------------------------- + +# some settings we found interesting +saved_settings = list( + bad_trip = list( + features = list( + block4_conv1 = 0.05, + block4_conv2 = 0.01, + block4_conv3 = 0.01 + ), + continuity = 0.1, + dream_l2 = 0.8, + jitter = 5 + ), + dreamy = list( + features = list( + block5_conv1 = 0.05, + block5_conv2 = 0.02 + ), + continuity = 0.1, + dream_l2 = 0.02, + jitter = 0 + ) +) + +# the settings we will use in this experiment +img_height <- 600L +img_width <- 600L +img_size <- c(img_height, img_width, 3) +settings <- saved_settings$dreamy +image <- preprocess_image("deep_dream.jpg", img_height, img_width) + +# Model definition -------------------------------------------------------- + +# this will contain our generated image +dream <- layer_input(batch_shape = c(1, img_size)) + +# build the VGG16 network with our placeholder +# the model will be loaded with pre-trained ImageNet weights +model <- application_vgg16(input_tensor = dream, weights = "imagenet", + include_top = FALSE) + + +# get the symbolic outputs of each "key" layer (we gave them unique names). 
+layer_dict <- model$layers +names(layer_dict) <- map_chr(layer_dict ,~.x$name) + +# define the loss +loss <- tf$Variable(0.0) +for(layer_name in names(settings$features)){ + # add the L2 norm of the features of a layer to the loss + coeff <- settings$features[[layer_name]] + x <- layer_dict[[layer_name]]$output + out_shape <- layer_dict[[layer_name]]$output_shape %>% unlist() + # we avoid border artifacts by only involving non-border pixels in the loss + loss <- loss - + coeff*K$sum(K$square(x[,3:(out_shape[2] - 2), 3:(out_shape[3] - 2),])) / + prod(out_shape[-1]) +} + +# add continuity loss (gives image local coherence, can result in an artful blur) +loss <- loss + settings$continuity* + total_variation_loss(x = dream, img_height, img_width)/ + prod(img_size) +# add image L2 norm to loss (prevents pixels from taking very high values, makes image darker) +loss <- loss + settings$dream_l2*K$sum(K$square(dream))/prod(img_size) + +# feel free to further modify the loss as you see fit, to achieve new effects... + +# compute the gradients of the dream wrt the loss +grads <- K$gradients(loss, dream)[[1]] + +f_outputs <- K$`function`(list(dream), list(loss,grads)) + +eval_loss_and_grads <- function(image){ + dim(image) <- c(1, img_size) + outs <- f_outputs(list(image)) + list( + loss_value = outs[[1]], + grad_values = as.numeric(outs[[2]]) + ) +} + +# Loss and gradients evaluator. +# +# This Evaluator class makes it possible +# to compute loss and gradients in one pass +# while retrieving them via two separate functions, +# "loss" and "grads". This is done because scipy.optimize +# requires separate functions for loss and gradients, +# but computing them separately would be inefficient. +Evaluator <- R6Class( + "Evaluator", + public = list( + + loss_value = NULL, + grad_values = NULL, + + initialize = function() { + self$loss_value <- NULL + self$grad_values <- NULL + }, + + loss = function(x){ + loss_and_grad <- eval_loss_and_grads(x) + self$loss_value <- loss_and_grad$loss_value + self$grad_values <- loss_and_grad$grad_values + self$loss_value + }, + + grads = function(x){ + grad_values <- self$grad_values + self$loss_value <- NULL + self$grad_values <- NULL + grad_values + } + + ) +) + +evaluator <- Evaluator$new() + +# Run optimization (L-BFGS) over the pixels of the generated image +# so as to minimize the loss +for(i in 1:5){ + + # add random jitter to initial image + random_jitter <- settings$jitter*2*(runif(prod(img_size)) - 0.5) %>% + array(dim = c(1, img_size)) + image <- image + random_jitter + + # Run L-BFGS + opt <- optim( + as.numeric(image), fn = evaluator$loss, gr = evaluator$grads, + method = "L-BFGS-B", + control = list(maxit = 2) + ) + + # Print loss value + print(opt$value) + + # decode the image + image <- opt$par + dim(image) <- c(1, img_size) + image <- image - random_jitter + + # plot + im <- deprocess_image(image) + plot(as.raster(im)) + +} + + + diff --git a/website/articles/examples/deep_dream.html b/website/articles/examples/deep_dream.html new file mode 100644 index 000000000..198ff4c44 --- /dev/null +++ b/website/articles/examples/deep_dream.html @@ -0,0 +1,334 @@ + + + + + + + +deep_dream • keras + + + + + + + +
+
+ + + +
+
+ + + + +
+ +

Deep Dreaming in Keras.

+

It is preferable to run this script on GPU, for speed.

+

Example results: http://i.imgur.com/FX6ROg9.jpg

+
library(keras)
+library(tensorflow)
+library(purrr)
+library(R6)
+K <- backend()
+
+# Function Definitions ----------------------------------------------------
+
+preprocess_image <- function(image_path, height, width){
+  image_load(image_path, target_size = c(height, width)) %>%
+    image_to_array() %>%
+    array(dim = c(1, dim(.))) %>%
+    imagenet_preprocess_input()
+}
+
+deprocess_image <- function(x){
+  x <- x[1,,,]
+  # Remove zero-center by mean pixel
+  x[,,1] <- x[,,1] + 103.939
+  x[,,2] <- x[,,2] + 116.779
+  x[,,3] <- x[,,3] + 123.68
+  # 'BGR'->'RGB'
+  x <- x[,,c(3,2,1)]
+  # clip to interval 0, 255
+  x[x > 255] <- 255
+  x[x < 0] <- 0
+  x[] <- as.integer(x)/255
+  x
+}
+
+# calculates the total variation loss
+# https://en.wikipedia.org/wiki/Total_variation_denoising
+total_variation_loss <- function(x, h, w){
+  
+  y_ij  <- x[,0:(h - 2L), 0:(w - 2L),]
+  y_i1j <- x[,1:(h - 1L), 0:(w - 2L),]
+  y_ij1 <- x[,0:(h - 2L), 1:(w - 1L),]
+  
+  a <- K$square(y_ij - y_i1j)
+  b <- K$square(y_ij - y_ij1)
+  K$sum(K$pow(a + b, 1.25))
+}
+
+
+# Parameters --------------------------------------------------------
+
+# some settings we found interesting
+saved_settings <- list(
+  bad_trip = list(
+    features = list(
+      block4_conv1 = 0.05,
+      block4_conv2 = 0.01,
+      block4_conv3 = 0.01
+    ),
+    continuity = 0.1,
+    dream_l2 = 0.8,
+    jitter =  5
+  ),
+  dreamy = list(
+    features = list(
+      block5_conv1 = 0.05,
+      block5_conv2 = 0.02
+    ),
+    continuity = 0.1,
+    dream_l2 = 0.02,
+    jitter = 0
+  )
+)
+
+# the settings we will use in this experiment
+img_height <- 600L
+img_width <- 600L
+img_size <- c(img_height, img_width, 3)
+settings <- saved_settings$dreamy
+image <- preprocess_image("deep_dream.jpg", img_height, img_width)
+
+# Model definition --------------------------------------------------------
+
+# this will contain our generated image
+dream <- layer_input(batch_shape = c(1, img_size))
+
+# build the VGG16 network with our placeholder
+# the model will be loaded with pre-trained ImageNet weights
+model <- application_vgg16(input_tensor = dream, weights = "imagenet",
+                           include_top = FALSE)
+
+
+# get the symbolic outputs of each "key" layer (we gave them unique names).
+layer_dict <- model$layers
+names(layer_dict) <- map_chr(layer_dict ,~.x$name)
+
+# define the loss
+loss <- tf$Variable(0.0)
+for(layer_name in names(settings$features)){
+  # add the L2 norm of the features of a layer to the loss
+  coeff <- settings$features[[layer_name]]
+  x <- layer_dict[[layer_name]]$output
+  out_shape <- layer_dict[[layer_name]]$output_shape %>% unlist()
+  # we avoid border artifacts by only involving non-border pixels in the loss
+  loss <- loss - 
+    coeff*K$sum(K$square(x[,3:(out_shape[2] - 2), 3:(out_shape[3] - 2),])) / 
+    prod(out_shape[-1])
+}
+
+# add continuity loss (gives image local coherence, can result in an artful blur)
+loss <- loss + settings$continuity*
+  total_variation_loss(x = dream, img_height, img_width)/
+  prod(img_size)
+# add image L2 norm to loss (prevents pixels from taking very high values, makes image darker)
+loss <- loss + settings$dream_l2*K$sum(K$square(dream))/prod(img_size)
+
+# feel free to further modify the loss as you see fit, to achieve new effects...
+
+# compute the gradients of the loss wrt the dream image
+grads <- K$gradients(loss, dream)[[1]] 
+
+f_outputs <- K$`function`(list(dream), list(loss,grads))
+
+eval_loss_and_grads <- function(image){
+  dim(image) <- c(1, img_size)
+  outs <- f_outputs(list(image))
+  list(
+    loss_value = outs[[1]],
+    grad_values = as.numeric(outs[[2]])
+  )
+}
+
+# Loss and gradients evaluator.
+# 
+# This Evaluator class makes it possible
+# to compute loss and gradients in one pass
+# while retrieving them via two separate functions,
+# "loss" and "grads". This is done because scipy.optimize
+# requires separate functions for loss and gradients,
+# but computing them separately would be inefficient.
+Evaluator <- R6Class(
+  "Evaluator",
+  public = list(
+    
+    loss_value = NULL,
+    grad_values = NULL,
+    
+    initialize = function() {
+      self$loss_value <- NULL
+      self$grad_values <- NULL
+    },
+    
+    loss = function(x){
+      loss_and_grad <- eval_loss_and_grads(x)
+      self$loss_value <- loss_and_grad$loss_value
+      self$grad_values <- loss_and_grad$grad_values
+      self$loss_value
+    },
+    
+    grads = function(x){
+      grad_values <- self$grad_values
+      self$loss_value <- NULL
+      self$grad_values <- NULL
+      grad_values
+    }
+      
+  )
+)
+
+evaluator <- Evaluator$new()
+
+# Run optimization (L-BFGS) over the pixels of the generated image
+# so as to minimize the loss
+for(i in 1:5){
+  
+  # add random jitter to initial image
+  random_jitter <- settings$jitter*2*(runif(prod(img_size)) - 0.5) %>%
+    array(dim = c(1, img_size))
+  image <- image + random_jitter
+
+  # Run L-BFGS
+  opt <- optim(
+    as.numeric(image), fn = evaluator$loss, gr = evaluator$grads, 
+    method = "L-BFGS-B",
+    control = list(maxit = 2)
+    )
+  
+  # Print loss value
+  print(opt$value)
+  
+  # decode the image
+  image <- opt$par
+  dim(image) <- c(1, img_size)
+  image <- image - random_jitter
+
+  # plot
+  im <- deprocess_image(image)
+  plot(as.raster(im))
+  
+}
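+
+# Saving the result -- a minimal sketch, not part of the original example:
+# write the final deprocessed image to disk with base R graphics.
+png("deep_dream_result.png", width = img_width, height = img_height)
+par(mar = c(0, 0, 0, 0))
+plot(as.raster(im))
+dev.off()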
+
+
+ + + +
+ + + +
+ + + diff --git a/website/articles/examples/deep_dream.jpg b/website/articles/examples/deep_dream.jpg new file mode 100644 index 000000000..0f39f13f0 Binary files /dev/null and b/website/articles/examples/deep_dream.jpg differ diff --git a/website/articles/examples/image_ocr.R b/website/articles/examples/image_ocr.R new file mode 100644 index 000000000..0ab13681d --- /dev/null +++ b/website/articles/examples/image_ocr.R @@ -0,0 +1 @@ +library(keras) diff --git a/website/articles/examples/image_ocr.html b/website/articles/examples/image_ocr.html new file mode 100644 index 000000000..9bd76ccb3 --- /dev/null +++ b/website/articles/examples/image_ocr.html @@ -0,0 +1,137 @@ + + + + + + + +image_ocr • keras + + + + + + + +
+
+ + + +
+
+ + + + + +
+ + + +
+ + + +
+ + + diff --git a/website/articles/examples/image_ocr.py b/website/articles/examples/image_ocr.py new file mode 100755 index 000000000..04d13ee14 --- /dev/null +++ b/website/articles/examples/image_ocr.py @@ -0,0 +1,491 @@ +'''This example uses a convolutional stack followed by a recurrent stack +and a CTC logloss function to perform optical character recognition +of generated text images. I have no evidence of whether it actually +learns general shapes of text, or just is able to recognize all +the different fonts thrown at it...the purpose is more to demonstrate CTC +inside of Keras. Note that the font list may need to be updated +for the particular OS in use. + +This starts off with 4 letter words. For the first 12 epochs, the +difficulty is gradually increased using the TextImageGenerator class +which is both a generator class for test/train data and a Keras +callback class. After 20 epochs, longer sequences are thrown at it +by recompiling the model to handle a wider image and rebuilding +the word list to include two words separated by a space. + +The table below shows normalized edit distance values. Theano uses +a slightly different CTC implementation, hence the different results. + + Norm. ED +Epoch | TF | TH +------------------------ + 10 0.027 0.064 + 15 0.038 0.035 + 20 0.043 0.045 + 25 0.014 0.019 + +This requires cairo and editdistance packages: +pip install cairocffi +pip install editdistance + +Created by Mike Henry +https://github.com/mbhenry/ +''' +import os +import itertools +import re +import datetime +import cairocffi as cairo +import editdistance +import numpy as np +from scipy import ndimage +import pylab +from keras import backend as K +from keras.layers.convolutional import Conv2D, MaxPooling2D +from keras.layers import Input, Dense, Activation +from keras.layers import Reshape, Lambda +from keras.layers.merge import add, concatenate +from keras.models import Model +from keras.layers.recurrent import GRU +from keras.optimizers import SGD +from keras.utils.data_utils import get_file +from keras.preprocessing import image +import keras.callbacks + + +OUTPUT_DIR = 'image_ocr' + +np.random.seed(55) + + +# this creates larger "blotches" of noise which look +# more realistic than just adding gaussian noise +# assumes greyscale with pixels ranging from 0 to 1 + +def speckle(img): + severity = np.random.uniform(0, 0.6) + blur = ndimage.gaussian_filter(np.random.randn(*img.shape) * severity, 1) + img_speck = (img + blur) + img_speck[img_speck > 1] = 1 + img_speck[img_speck <= 0] = 0 + return img_speck + + +# paints the string in a random location the bounding box +# also uses a random font, a slight random rotation, +# and a random amount of speckle noise + +def paint_text(text, w, h, rotate=False, ud=False, multi_fonts=False): + surface = cairo.ImageSurface(cairo.FORMAT_RGB24, w, h) + with cairo.Context(surface) as context: + context.set_source_rgb(1, 1, 1) # White + context.paint() + # this font list works in Centos 7 + if multi_fonts: + fonts = ['Century Schoolbook', 'Courier', 'STIX', 'URW Chancery L', 'FreeMono'] + context.select_font_face(np.random.choice(fonts), cairo.FONT_SLANT_NORMAL, + np.random.choice([cairo.FONT_WEIGHT_BOLD, cairo.FONT_WEIGHT_NORMAL])) + else: + context.select_font_face('Courier', cairo.FONT_SLANT_NORMAL, cairo.FONT_WEIGHT_BOLD) + context.set_font_size(25) + box = context.text_extents(text) + border_w_h = (4, 4) + if box[2] > (w - 2 * border_w_h[1]) or box[3] > (h - 2 * border_w_h[0]): + raise IOError('Could not fit string into image. 
Max char count is too large for given image width.') + + # teach the RNN translational invariance by + # fitting text box randomly on canvas, with some room to rotate + max_shift_x = w - box[2] - border_w_h[0] + max_shift_y = h - box[3] - border_w_h[1] + top_left_x = np.random.randint(0, int(max_shift_x)) + if ud: + top_left_y = np.random.randint(0, int(max_shift_y)) + else: + top_left_y = h // 2 + context.move_to(top_left_x - int(box[0]), top_left_y - int(box[1])) + context.set_source_rgb(0, 0, 0) + context.show_text(text) + + buf = surface.get_data() + a = np.frombuffer(buf, np.uint8) + a.shape = (h, w, 4) + a = a[:, :, 0] # grab single channel + a = a.astype(np.float32) / 255 + a = np.expand_dims(a, 0) + if rotate: + a = image.random_rotation(a, 3 * (w - top_left_x) / w + 1) + a = speckle(a) + + return a + + +def shuffle_mats_or_lists(matrix_list, stop_ind=None): + ret = [] + assert all([len(i) == len(matrix_list[0]) for i in matrix_list]) + len_val = len(matrix_list[0]) + if stop_ind is None: + stop_ind = len_val + assert stop_ind <= len_val + + a = list(range(stop_ind)) + np.random.shuffle(a) + a += list(range(stop_ind, len_val)) + for mat in matrix_list: + if isinstance(mat, np.ndarray): + ret.append(mat[a]) + elif isinstance(mat, list): + ret.append([mat[i] for i in a]) + else: + raise TypeError('shuffle_mats_or_lists only supports ' + 'numpy.array and list objects') + return ret + + +def text_to_labels(text, num_classes): + ret = [] + for char in text: + if char >= 'a' and char <= 'z': + ret.append(ord(char) - ord('a')) + elif char == ' ': + ret.append(26) + return ret + + +# only a-z and space..probably not to difficult +# to expand to uppercase and symbols + +def is_valid_str(in_str): + search = re.compile(r'[^a-z\ ]').search + return not bool(search(in_str)) + + +# Uses generator functions to supply train/test with +# data. 
Image renderings are text are created on the fly +# each time with random perturbations + +class TextImageGenerator(keras.callbacks.Callback): + + def __init__(self, monogram_file, bigram_file, minibatch_size, + img_w, img_h, downsample_factor, val_split, + absolute_max_string_len=16): + + self.minibatch_size = minibatch_size + self.img_w = img_w + self.img_h = img_h + self.monogram_file = monogram_file + self.bigram_file = bigram_file + self.downsample_factor = downsample_factor + self.val_split = val_split + self.blank_label = self.get_output_size() - 1 + self.absolute_max_string_len = absolute_max_string_len + + def get_output_size(self): + return 28 + + # num_words can be independent of the epoch size due to the use of generators + # as max_string_len grows, num_words can grow + def build_word_list(self, num_words, max_string_len=None, mono_fraction=0.5): + assert max_string_len <= self.absolute_max_string_len + assert num_words % self.minibatch_size == 0 + assert (self.val_split * num_words) % self.minibatch_size == 0 + self.num_words = num_words + self.string_list = [''] * self.num_words + tmp_string_list = [] + self.max_string_len = max_string_len + self.Y_data = np.ones([self.num_words, self.absolute_max_string_len]) * -1 + self.X_text = [] + self.Y_len = [0] * self.num_words + + # monogram file is sorted by frequency in english speech + with open(self.monogram_file, 'rt') as f: + for line in f: + if len(tmp_string_list) == int(self.num_words * mono_fraction): + break + word = line.rstrip() + if max_string_len == -1 or max_string_len is None or len(word) <= max_string_len: + tmp_string_list.append(word) + + # bigram file contains common word pairings in english speech + with open(self.bigram_file, 'rt') as f: + lines = f.readlines() + for line in lines: + if len(tmp_string_list) == self.num_words: + break + columns = line.lower().split() + word = columns[0] + ' ' + columns[1] + if is_valid_str(word) and \ + (max_string_len == -1 or max_string_len is None or len(word) <= max_string_len): + tmp_string_list.append(word) + if len(tmp_string_list) != self.num_words: + raise IOError('Could not pull enough words from supplied monogram and bigram files. ') + # interlace to mix up the easy and hard words + self.string_list[::2] = tmp_string_list[:self.num_words // 2] + self.string_list[1::2] = tmp_string_list[self.num_words // 2:] + + for i, word in enumerate(self.string_list): + self.Y_len[i] = len(word) + self.Y_data[i, 0:len(word)] = text_to_labels(word, self.get_output_size()) + self.X_text.append(word) + self.Y_len = np.expand_dims(np.array(self.Y_len), 1) + + self.cur_val_index = self.val_split + self.cur_train_index = 0 + + # each time an image is requested from train/val/test, a new random + # painting of the text is performed + def get_batch(self, index, size, train): + # width and height are backwards from typical Keras convention + # because width is the time dimension when it gets fed into the RNN + if K.image_data_format() == 'channels_first': + X_data = np.ones([size, 1, self.img_w, self.img_h]) + else: + X_data = np.ones([size, self.img_w, self.img_h, 1]) + + labels = np.ones([size, self.absolute_max_string_len]) + input_length = np.zeros([size, 1]) + label_length = np.zeros([size, 1]) + source_str = [] + for i in range(0, size): + # Mix in some blank inputs. 
This seems to be important for + # achieving translational invariance + if train and i > size - 4: + if K.image_data_format() == 'channels_first': + X_data[i, 0, 0:self.img_w, :] = self.paint_func('')[0, :, :].T + else: + X_data[i, 0:self.img_w, :, 0] = self.paint_func('',)[0, :, :].T + labels[i, 0] = self.blank_label + input_length[i] = self.img_w // self.downsample_factor - 2 + label_length[i] = 1 + source_str.append('') + else: + if K.image_data_format() == 'channels_first': + X_data[i, 0, 0:self.img_w, :] = self.paint_func(self.X_text[index + i])[0, :, :].T + else: + X_data[i, 0:self.img_w, :, 0] = self.paint_func(self.X_text[index + i])[0, :, :].T + labels[i, :] = self.Y_data[index + i] + input_length[i] = self.img_w // self.downsample_factor - 2 + label_length[i] = self.Y_len[index + i] + source_str.append(self.X_text[index + i]) + inputs = {'the_input': X_data, + 'the_labels': labels, + 'input_length': input_length, + 'label_length': label_length, + 'source_str': source_str # used for visualization only + } + outputs = {'ctc': np.zeros([size])} # dummy data for dummy loss function + return (inputs, outputs) + + def next_train(self): + while 1: + ret = self.get_batch(self.cur_train_index, self.minibatch_size, train=True) + self.cur_train_index += self.minibatch_size + if self.cur_train_index >= self.val_split: + self.cur_train_index = self.cur_train_index % 32 + (self.X_text, self.Y_data, self.Y_len) = shuffle_mats_or_lists( + [self.X_text, self.Y_data, self.Y_len], self.val_split) + yield ret + + def next_val(self): + while 1: + ret = self.get_batch(self.cur_val_index, self.minibatch_size, train=False) + self.cur_val_index += self.minibatch_size + if self.cur_val_index >= self.num_words: + self.cur_val_index = self.val_split + self.cur_val_index % 32 + yield ret + + def on_train_begin(self, logs={}): + self.build_word_list(16000, 4, 1) + self.paint_func = lambda text: paint_text(text, self.img_w, self.img_h, + rotate=False, ud=False, multi_fonts=False) + + def on_epoch_begin(self, epoch, logs={}): + # rebind the paint function to implement curriculum learning + if epoch >= 3 and epoch < 6: + self.paint_func = lambda text: paint_text(text, self.img_w, self.img_h, + rotate=False, ud=True, multi_fonts=False) + elif epoch >= 6 and epoch < 9: + self.paint_func = lambda text: paint_text(text, self.img_w, self.img_h, + rotate=False, ud=True, multi_fonts=True) + elif epoch >= 9: + self.paint_func = lambda text: paint_text(text, self.img_w, self.img_h, + rotate=True, ud=True, multi_fonts=True) + if epoch >= 21 and self.max_string_len < 12: + self.build_word_list(32000, 12, 0.5) + + +# the actual loss calc occurs here despite it not being +# an internal Keras loss function + +def ctc_lambda_func(args): + y_pred, labels, input_length, label_length = args + # the 2 is critical here since the first couple outputs of the RNN + # tend to be garbage: + y_pred = y_pred[:, 2:, :] + return K.ctc_batch_cost(labels, y_pred, input_length, label_length) + + +# For a real OCR application, this should be beam search with a dictionary +# and language model. For this example, best path is sufficient. 
+ +def decode_batch(test_func, word_batch): + out = test_func([word_batch])[0] + ret = [] + for j in range(out.shape[0]): + out_best = list(np.argmax(out[j, 2:], 1)) + out_best = [k for k, g in itertools.groupby(out_best)] + # 26 is space, 27 is CTC blank char + outstr = '' + for c in out_best: + if c >= 0 and c < 26: + outstr += chr(c + ord('a')) + elif c == 26: + outstr += ' ' + ret.append(outstr) + return ret + + +class VizCallback(keras.callbacks.Callback): + + def __init__(self, run_name, test_func, text_img_gen, num_display_words=6): + self.test_func = test_func + self.output_dir = os.path.join( + OUTPUT_DIR, run_name) + self.text_img_gen = text_img_gen + self.num_display_words = num_display_words + if not os.path.exists(self.output_dir): + os.makedirs(self.output_dir) + + def show_edit_distance(self, num): + num_left = num + mean_norm_ed = 0.0 + mean_ed = 0.0 + while num_left > 0: + word_batch = next(self.text_img_gen)[0] + num_proc = min(word_batch['the_input'].shape[0], num_left) + decoded_res = decode_batch(self.test_func, word_batch['the_input'][0:num_proc]) + for j in range(0, num_proc): + edit_dist = editdistance.eval(decoded_res[j], word_batch['source_str'][j]) + mean_ed += float(edit_dist) + mean_norm_ed += float(edit_dist) / len(word_batch['source_str'][j]) + num_left -= num_proc + mean_norm_ed = mean_norm_ed / num + mean_ed = mean_ed / num + print('\nOut of %d samples: Mean edit distance: %.3f Mean normalized edit distance: %0.3f' + % (num, mean_ed, mean_norm_ed)) + + def on_epoch_end(self, epoch, logs={}): + self.model.save_weights(os.path.join(self.output_dir, 'weights%02d.h5' % (epoch))) + self.show_edit_distance(256) + word_batch = next(self.text_img_gen)[0] + res = decode_batch(self.test_func, word_batch['the_input'][0:self.num_display_words]) + if word_batch['the_input'][0].shape[0] < 256: + cols = 2 + else: + cols = 1 + for i in range(self.num_display_words): + pylab.subplot(self.num_display_words // cols, cols, i + 1) + if K.image_data_format() == 'channels_first': + the_input = word_batch['the_input'][i, 0, :, :] + else: + the_input = word_batch['the_input'][i, :, :, 0] + pylab.imshow(the_input.T, cmap='Greys_r') + pylab.xlabel('Truth = \'%s\'\nDecoded = \'%s\'' % (word_batch['source_str'][i], res[i])) + fig = pylab.gcf() + fig.set_size_inches(10, 13) + pylab.savefig(os.path.join(self.output_dir, 'e%02d.png' % (epoch))) + pylab.close() + + +def train(run_name, start_epoch, stop_epoch, img_w): + # Input Parameters + img_h = 64 + words_per_epoch = 16000 + val_split = 0.2 + val_words = int(words_per_epoch * (val_split)) + + # Network parameters + conv_filters = 16 + kernel_size = (3, 3) + pool_size = 2 + time_dense_size = 32 + rnn_size = 512 + + if K.image_data_format() == 'channels_first': + input_shape = (1, img_w, img_h) + else: + input_shape = (img_w, img_h, 1) + + fdir = os.path.dirname(get_file('wordlists.tgz', + origin='http://www.mythic-ai.com/datasets/wordlists.tgz', untar=True)) + + img_gen = TextImageGenerator(monogram_file=os.path.join(fdir, 'wordlist_mono_clean.txt'), + bigram_file=os.path.join(fdir, 'wordlist_bi_clean.txt'), + minibatch_size=32, + img_w=img_w, + img_h=img_h, + downsample_factor=(pool_size ** 2), + val_split=words_per_epoch - val_words + ) + act = 'relu' + input_data = Input(name='the_input', shape=input_shape, dtype='float32') + inner = Conv2D(conv_filters, kernel_size, padding='same', + activation=act, kernel_initializer='he_normal', + name='conv1')(input_data) + inner = MaxPooling2D(pool_size=(pool_size, pool_size), 
name='max1')(inner) + inner = Conv2D(conv_filters, kernel_size, padding='same', + activation=act, kernel_initializer='he_normal', + name='conv2')(inner) + inner = MaxPooling2D(pool_size=(pool_size, pool_size), name='max2')(inner) + + conv_to_rnn_dims = (img_w // (pool_size ** 2), (img_h // (pool_size ** 2)) * conv_filters) + inner = Reshape(target_shape=conv_to_rnn_dims, name='reshape')(inner) + + # cuts down input size going into RNN: + inner = Dense(time_dense_size, activation=act, name='dense1')(inner) + + # Two layers of bidirecitonal GRUs + # GRU seems to work as well, if not better than LSTM: + gru_1 = GRU(rnn_size, return_sequences=True, kernel_initializer='he_normal', name='gru1')(inner) + gru_1b = GRU(rnn_size, return_sequences=True, go_backwards=True, kernel_initializer='he_normal', name='gru1_b')(inner) + gru1_merged = add([gru_1, gru_1b]) + gru_2 = GRU(rnn_size, return_sequences=True, kernel_initializer='he_normal', name='gru2')(gru1_merged) + gru_2b = GRU(rnn_size, return_sequences=True, go_backwards=True, kernel_initializer='he_normal', name='gru2_b')(gru1_merged) + + # transforms RNN output to character activations: + inner = Dense(img_gen.get_output_size(), kernel_initializer='he_normal', + name='dense2')(concatenate([gru_2, gru_2b])) + y_pred = Activation('softmax', name='softmax')(inner) + Model(inputs=input_data, outputs=y_pred).summary() + + labels = Input(name='the_labels', shape=[img_gen.absolute_max_string_len], dtype='float32') + input_length = Input(name='input_length', shape=[1], dtype='int64') + label_length = Input(name='label_length', shape=[1], dtype='int64') + # Keras doesn't currently support loss funcs with extra parameters + # so CTC loss is implemented in a lambda layer + loss_out = Lambda(ctc_lambda_func, output_shape=(1,), name='ctc')([y_pred, labels, input_length, label_length]) + + # clipnorm seems to speeds up convergence + sgd = SGD(lr=0.02, decay=1e-6, momentum=0.9, nesterov=True, clipnorm=5) + + model = Model(inputs=[input_data, labels, input_length, label_length], outputs=loss_out) + + # the loss calc occurs elsewhere, so use a dummy lambda func for the loss + model.compile(loss={'ctc': lambda y_true, y_pred: y_pred}, optimizer=sgd) + if start_epoch > 0: + weight_file = os.path.join(OUTPUT_DIR, os.path.join(run_name, 'weights%02d.h5' % (start_epoch - 1))) + model.load_weights(weight_file) + # captures output of softmax so we can decode the output during visualization + test_func = K.function([input_data], [y_pred]) + + viz_cb = VizCallback(run_name, test_func, img_gen.next_val()) + + model.fit_generator(generator=img_gen.next_train(), steps_per_epoch=(words_per_epoch - val_words), + epochs=stop_epoch, validation_data=img_gen.next_val(), validation_steps=val_words, + callbacks=[viz_cb, img_gen], initial_epoch=start_epoch) + + +if __name__ == '__main__': + run_name = datetime.datetime.now().strftime('%Y:%m:%d:%H:%M:%S') + train(run_name, 0, 20, 128) + # increase to wider images and start at epoch 20. The learned weights are reloaded + train(run_name, 20, 25, 512) diff --git a/website/articles/examples/imdb_bidirectional_lstm.R b/website/articles/examples/imdb_bidirectional_lstm.R new file mode 100644 index 000000000..9090eff45 --- /dev/null +++ b/website/articles/examples/imdb_bidirectional_lstm.R @@ -0,0 +1,54 @@ +#' Train a Bidirectional LSTM on the IMDB sentiment classification task. +#' +#' Output after 4 epochs on CPU: ~0.8146 +#' Time per epoch on CPU (Core i7): ~150s. 
+ +library(keras) + +max_features <- 20000 + +# cut texts after this number of words +# (among top max_features most common words) +maxlen <- 100 + +batch_size <- 32 + +cat('Loading data...\n') +imdb <- dataset_imdb(num_words = max_features) +x_train <- imdb$train$x +y_train <- imdb$train$y +x_test <- imdb$test$x +y_test <- imdb$test$y + +cat(length(x_train), 'train sequences\n') +cat(length(x_test), 'test sequences\n') + +cat('Pad sequences (samples x time)\n') +x_train <- pad_sequences(x_train, maxlen = maxlen) +x_test <- pad_sequences(x_test, maxlen = maxlen) +cat('x_train shape:', dim(x_train), '\n') +cat('x_test shape:', dim(x_test), '\n') + +model <- keras_model_sequential() +model %>% + layer_embedding(input_dim = max_features, output_dim = 128, input_length = maxlen) %>% + bidirectional(layer_lstm(units = 64)) %>% + layer_dropout(rate = 0.5) %>% + layer_dense(units = 1, activation = 'sigmoid') + +# try using different optimizers and different optimizer configs +model %>% compile( + loss = 'binary_crossentropy', + optimizer = 'adam', + metrics = c('accuracy') +) + +cat('Train...\n') +model %>% fit( + x_train, y_train, + batch_size = batch_size, + epochs = 4, + validation_data = list(x_test, y_test) +) + + diff --git a/website/articles/examples/imdb_bidirectional_lstm.html b/website/articles/examples/imdb_bidirectional_lstm.html new file mode 100644 index 000000000..72ed07963 --- /dev/null +++ b/website/articles/examples/imdb_bidirectional_lstm.html @@ -0,0 +1,185 @@ + + + + + + + +imdb_bidirectional_lstm • keras + + + + + + + +
+
+ + + +
+
+ + + + +
+ +

Train a Bidirectional LSTM on the IMDB sentiment classification task.

+

Output after 4 epochs on CPU: ~0.8146. Time per epoch on CPU (Core i7): ~150s.

+
library(keras)
+
+max_features <- 20000
+
+# cut texts after this number of words
+# (among top max_features most common words)
+maxlen <- 100
+
+batch_size <- 32
+
+cat('Loading data...\n')
+imdb <- dataset_imdb(num_words = max_features)
+x_train <- imdb$train$x
+y_train <- imdb$train$y
+x_test <- imdb$test$x
+y_test <- imdb$test$y
+
+cat(length(x_train), 'train sequences\n')
+cat(length(x_test), 'test sequences\n')
+
+cat('Pad sequences (samples x time)\n')
+x_train <- pad_sequences(x_train, maxlen = maxlen)
+x_test <- pad_sequences(x_test, maxlen = maxlen)
+cat('x_train shape:', dim(x_train), '\n')
+cat('x_test shape:', dim(x_test), '\n')
+
+model <- keras_model_sequential()
+model %>%
+  layer_embedding(input_dim = max_features, output_dim = 128, input_length = maxlen) %>% 
+  bidirectional(layer_lstm(units = 64)) %>% 
+  layer_dropout(rate = 0.5) %>% 
+  layer_dense(units = 1, activation = 'sigmoid')
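+
+# A hedged aside, not part of the original script: bidirectional() runs the
+# wrapped LSTM over the sequence in both directions and, by default,
+# concatenates the two 64-unit outputs into a single 128-unit representation.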
+
+# try using different optimizers and different optimizer configs
+model %>% compile(
+  loss = 'binary_crossentropy',
+  optimizer = 'adam',
+  metrics = c('accuracy')
+)
+
+cat('Train...\n')
+model %>% fit(
+  x_train, y_train,
+  batch_size = batch_size,
+  epochs = 4,
+  validation_data = list(x_test, y_test)
+)
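+
+# A minimal sketch, not in the original script: to report test performance
+# explicitly, as the other IMDB examples in this collection do, one could run:
+scores <- model %>% evaluate(x_test, y_test, batch_size = batch_size)
+cat('Test score:', scores[[1]], '\n')
+cat('Test accuracy:', scores[[2]], '\n')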
+
+
+ + + +
+ + + +
+ + + diff --git a/website/articles/examples/imdb_cnn.R b/website/articles/examples/imdb_cnn.R new file mode 100644 index 000000000..81984f976 --- /dev/null +++ b/website/articles/examples/imdb_cnn.R @@ -0,0 +1,93 @@ +#' This example demonstrates the use of Convolution1D for text classification. +#' +#' Gets to 0.89 test accuracy after 2 epochs. +#' 90s/epoch on Intel i5 2.4Ghz CPU. +#' 10s/epoch on Tesla K40 GPU. +#' + +library(keras) + +# set parameters: +max_features <- 5000 +maxlen <- 400 +batch_size <- 32 +embedding_dims <- 50 +filters <- 250 +kernel_size <- 3 +hidden_dims <- 250 +epochs <- 2 + + +# Data Preparation -------------------------------------------------------- + +# Keras load all data into a list with the following structure: +# List of 2 +# $ train:List of 2 +# ..$ x:List of 25000 +# .. .. [list output truncated] +# .. ..- attr(*, "dim")= int 25000 +# ..$ y: num [1:25000(1d)] 1 0 0 1 0 0 1 0 1 0 ... +# $ test :List of 2 +# ..$ x:List of 25000 +# .. .. [list output truncated] +# .. ..- attr(*, "dim")= int 25000 +# ..$ y: num [1:25000(1d)] 1 1 1 1 1 0 0 0 1 1 ... +# +# The x data includes integer sequences, each integer is a word. +# The y data includes a set of integer labels (0 or 1). +# The num_words argument indicates that only the max_fetures most frequent +# words will be integerized. All other will be ignored. +# See help(dataset_imdb) +imdb <- dataset_imdb(num_words = max_features) + +# pad the sequences, so they have all the same lenght +# this will conver our dataset into a matrix: each line is a review +# and each column a word on the sequence. +# we pad the sequences with 0 to the left. +x_train <- imdb$train$x %>% + pad_sequences(maxlen = maxlen) + +x_test <- imdb$test$x %>% + pad_sequences(maxlen = maxlen) + +# Defining the model ------------------------------------------------------ + +model <- keras_model_sequential() + +model %>% + # we start off with an efficient embedding layer which maps + # our vocab indices into embedding_dims dimensions + layer_embedding(max_features, embedding_dims, input_length = maxlen) %>% + layer_dropout(0.2) %>% + # we add a Convolution1D, which will learn filters + # word group filters of size filter_length: + layer_conv_1d( + filters, kernel_size, + padding = "valid", activation = "relu", strides = 1 + ) %>% + # we use max pooling: + layer_global_max_pooling_1d() %>% + # We add a vanilla hidden layer: + layer_dense(hidden_dims) %>% + layer_dropout(0.2) %>% + layer_activation("relu") %>% + # We project onto a single unit output layer, and squash it with a sigmoid: + layer_dense(1) %>% + layer_activation("sigmoid") + + +model %>% compile( + loss = "binary_crossentropy", + optimizer = "adam", + metrics = "accuracy" +) + +# Training ---------------------------------------------------------------- + +model %>% + fit( + x_train, imdb$train$y, + batch_size = batch_size, + epochs = epochs, + validation_data = list(x_test, imdb$test$y) + ) diff --git a/website/articles/examples/imdb_cnn.html b/website/articles/examples/imdb_cnn.html new file mode 100644 index 000000000..a7bf34ab4 --- /dev/null +++ b/website/articles/examples/imdb_cnn.html @@ -0,0 +1,224 @@ + + + + + + + +imdb_cnn • keras + + + + + + + +
+
+ + + +
+
+ + + + +
+ +

This example demonstrates the use of Convolution1D for text classification.

+

Gets to 0.89 test accuracy after 2 epochs. 90s/epoch on Intel i5 2.4GHz CPU. 10s/epoch on Tesla K40 GPU.

+
library(keras)
+
+# set parameters:
+max_features <- 5000
+maxlen <- 400
+batch_size <- 32
+embedding_dims <- 50
+filters <- 250
+kernel_size <- 3
+hidden_dims <- 250
+epochs <- 2
+
+
+# Data Preparation --------------------------------------------------------
+
+# Keras loads all of the data into a list with the following structure:
+# List of 2
+# $ train:List of 2
+# ..$ x:List of 25000
+# .. .. [list output truncated]
+# .. ..- attr(*, "dim")= int 25000
+# ..$ y: num [1:25000(1d)] 1 0 0 1 0 0 1 0 1 0 ...
+# $ test :List of 2
+# ..$ x:List of 25000
+# .. .. [list output truncated]
+# .. ..- attr(*, "dim")= int 25000
+# ..$ y: num [1:25000(1d)] 1 1 1 1 1 0 0 0 1 1 ...
+#
+# The x data includes integer sequences; each integer represents a word.
+# The y data includes a set of integer labels (0 or 1).
+# The num_words argument indicates that only the max_features most frequent
+# words will be integer-encoded. All others will be ignored.
+# See help(dataset_imdb)
+imdb <- dataset_imdb(num_words = max_features)
+
+# pad the sequences so they all have the same length
+# this will convert our dataset into a matrix: each row is a review
+# and each column a word in the sequence.
+# we pad the sequences with 0 on the left.
+x_train <- imdb$train$x %>%
+  pad_sequences(maxlen = maxlen)
+
+x_test <- imdb$test$x %>%
+  pad_sequences(maxlen = maxlen)
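+
+# An illustrative sketch, not part of the original example: pad_sequences()
+# pads on the left with 0 by default and truncates sequences longer than maxlen.
+pad_sequences(list(c(1, 2, 3)), maxlen = 5)  # a 1 x 5 matrix: 0 0 1 2 3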
+
+# Defining the model ------------------------------------------------------
+
+model <- keras_model_sequential()
+
+model %>% 
+  # we start off with an efficient embedding layer which maps
+  # our vocab indices into embedding_dims dimensions
+  layer_embedding(max_features, embedding_dims, input_length = maxlen) %>%
+  layer_dropout(0.2) %>%
+  # we add a Convolution1D, which will learn word-group filters
+  # of size kernel_size:
+  layer_conv_1d(
+    filters, kernel_size, 
+    padding = "valid", activation = "relu", strides = 1
+  ) %>%
+  # we use max pooling:
+  layer_global_max_pooling_1d() %>%
+  # We add a vanilla hidden layer:
+  layer_dense(hidden_dims) %>%
+  layer_dropout(0.2) %>%
+  layer_activation("relu") %>%
+  # We project onto a single unit output layer, and squash it with a sigmoid:
+  layer_dense(1) %>%
+  layer_activation("sigmoid")
+
+
+model %>% compile(
+  loss = "binary_crossentropy",
+  optimizer = "adam",
+  metrics = "accuracy"
+)
+
+# Training ----------------------------------------------------------------
+
+model %>%
+  fit(
+    x_train, imdb$train$y,
+    batch_size = batch_size,
+    epochs = epochs,
+    validation_data = list(x_test, imdb$test$y)
+  )
+
+
+ + + +
+ + + +
+ + + diff --git a/website/articles/examples/imdb_cnn_lstm.R b/website/articles/examples/imdb_cnn_lstm.R new file mode 100644 index 000000000..f5bef9ae1 --- /dev/null +++ b/website/articles/examples/imdb_cnn_lstm.R @@ -0,0 +1,91 @@ +#' Train a recurrent convolutional network on the IMDB sentiment +#' classification task. +#' +#' Gets to 0.8498 test accuracy after 2 epochs. 41s/epoch on K520 GPU. + +library(keras) + +# Parameters -------------------------------------------------------------- + +# Embedding +max_features = 20000 +maxlen = 100 +embedding_size = 128 + +# Convolution +kernel_size = 5 +filters = 64 +pool_size = 4 + +# LSTM +lstm_output_size = 70 + +# Training +batch_size = 30 +epochs = 2 + +# Data Preparation -------------------------------------------------------- + +# Keras load all data into a list with the following structure: +# List of 2 +# $ train:List of 2 +# ..$ x:List of 25000 +# .. .. [list output truncated] +# .. ..- attr(*, "dim")= int 25000 +# ..$ y: num [1:25000(1d)] 1 0 0 1 0 0 1 0 1 0 ... +# $ test :List of 2 +# ..$ x:List of 25000 +# .. .. [list output truncated] +# .. ..- attr(*, "dim")= int 25000 +# ..$ y: num [1:25000(1d)] 1 1 1 1 1 0 0 0 1 1 ... +# +# The x data includes integer sequences, each integer is a word. +# The y data includes a set of integer labels (0 or 1). +# The num_words argument indicates that only the max_fetures most frequent +# words will be integerized. All other will be ignored. +# See help(dataset_imdb) +imdb <- dataset_imdb(num_words = max_features) + +# pad the sequences, so they have all the same lenght +# this will conver our dataset into a matrix: each line is a review +# and each column a word on the sequence. +# we pad the sequences with 0 to the left. +x_train <- imdb$train$x %>% + pad_sequences(maxlen = maxlen) + +x_test <- imdb$test$x %>% + pad_sequences(maxlen = maxlen) + +# Defining the model ------------------------------------------------------ + +model <- keras_model_sequential() + +model %>% + layer_embedding(max_features, embedding_size, input_length = maxlen) %>% + layer_dropout(0.25) %>% + layer_conv_1d( + filters, + kernel_size, + padding = "valid", + activation = "relu", + strides = 1 + ) %>% + layer_max_pooling_1d(pool_size) %>% + layer_lstm(lstm_output_size) %>% + layer_dense(1) %>% + layer_activation("sigmoid") + +model %>% compile( + loss = "binary_crossentropy", + optimizer = "adam", + metrics = "accuracy" +) + +# Training ---------------------------------------------------------------- + +model %>% fit( + x_train, imdb$train$y, + batch_size = batch_size, + epochs = epochs, + validation_data = list(x_test, imdb$test$y) +) diff --git a/website/articles/examples/imdb_cnn_lstm.html b/website/articles/examples/imdb_cnn_lstm.html new file mode 100644 index 000000000..74dbc8204 --- /dev/null +++ b/website/articles/examples/imdb_cnn_lstm.html @@ -0,0 +1,224 @@ + + + + + + + +imdb_cnn_lstm • keras + + + + + + + +
+
+ + + +
+
+ + + + +
+ +

Train a recurrent convolutional network on the IMDB sentiment classification task.

+

Gets to 0.8498 test accuracy after 2 epochs. 41s/epoch on K520 GPU.

+
library(keras)
+
+# Parameters --------------------------------------------------------------
+
+# Embedding
+max_features <- 20000
+maxlen <- 100
+embedding_size <- 128
+
+# Convolution
+kernel_size <- 5
+filters <- 64
+pool_size <- 4
+
+# LSTM
+lstm_output_size <- 70
+
+# Training
+batch_size <- 30
+epochs <- 2
+
+# Data Preparation --------------------------------------------------------
+
+# Keras loads all of the data into a list with the following structure:
+# List of 2
+# $ train:List of 2
+# ..$ x:List of 25000
+# .. .. [list output truncated]
+# .. ..- attr(*, "dim")= int 25000
+# ..$ y: num [1:25000(1d)] 1 0 0 1 0 0 1 0 1 0 ...
+# $ test :List of 2
+# ..$ x:List of 25000
+# .. .. [list output truncated]
+# .. ..- attr(*, "dim")= int 25000
+# ..$ y: num [1:25000(1d)] 1 1 1 1 1 0 0 0 1 1 ...
+#
+# The x data includes integer sequences; each integer represents a word.
+# The y data includes a set of integer labels (0 or 1).
+# The num_words argument indicates that only the max_features most frequent
+# words will be integer-encoded. All others will be ignored.
+# See help(dataset_imdb)
+imdb <- dataset_imdb(num_words = max_features)
+
+# pad the sequences so they all have the same length
+# this will convert our dataset into a matrix: each row is a review
+# and each column a word in the sequence.
+# we pad the sequences with 0 on the left.
+x_train <- imdb$train$x %>%
+  pad_sequences(maxlen = maxlen)
+
+x_test <- imdb$test$x %>%
+  pad_sequences(maxlen = maxlen)
+
+# Defining the model ------------------------------------------------------
+
+model <- keras_model_sequential()
+
+model %>%
+  layer_embedding(max_features, embedding_size, input_length = maxlen) %>%
+  layer_dropout(0.25) %>%
+  layer_conv_1d(
+    filters, 
+    kernel_size, 
+    padding = "valid",
+    activation = "relu",
+    strides = 1
+  ) %>%
+  layer_max_pooling_1d(pool_size) %>%
+  layer_lstm(lstm_output_size) %>%
+  layer_dense(1) %>%
+  layer_activation("sigmoid")
+
+model %>% compile(
+  loss = "binary_crossentropy",
+  optimizer = "adam",
+  metrics = "accuracy"
+)
+
+# Training ----------------------------------------------------------------
+
+model %>% fit(
+  x_train, imdb$train$y,
+  batch_size = batch_size,
+  epochs = epochs,
+  validation_data = list(x_test, imdb$test$y)
+)
+
+
+ + + +
+ + + +
+ + + diff --git a/website/articles/examples/imdb_fasttext.R b/website/articles/examples/imdb_fasttext.R new file mode 100644 index 000000000..8e2066fe8 --- /dev/null +++ b/website/articles/examples/imdb_fasttext.R @@ -0,0 +1,117 @@ +#' This example demonstrates the use of fasttext for text classification +#' +#' Based on Joulin et al's paper: +#' +#' Bags of Tricks for Efficient Text Classification +#' https://arxiv.org/abs/1607.01759 +#' +#' Results on IMDB datasets with uni and bi-gram embeddings: +#' Uni-gram: 0.8813 test accuracy after 5 epochs. 8s/epoch on i7 cpu. +#' Bi-gram : 0.9056 test accuracy after 5 epochs. 2s/epoch on GTx 980M gpu. +#' + +library(keras) +library(purrr) + +# Function definition ----------------------------------------------------- + +create_ngram_set <- function(input_list, ngram_value = 2){ + indices <- map(0:(length(input_list) - ngram_value), ~1:ngram_value + .x) + indices %>% + map_chr(~input_list[.x] %>% paste(collapse = "|")) %>% + unique() +} + +add_ngram <- function(sequences, token_indice, ngram_range = 2){ + ngrams <- map( + sequences, + create_ngram_set, ngram_value = ngram_range + ) + + seqs <- map2(sequences, ngrams, function(x, y){ + tokens <- token_indice$token[token_indice$ngrams %in% y] + c(x, tokens) + }) + + seqs +} + + +# Parameters -------------------------------------------------------------- + +# ngram_range = 2 will add bi-grams features +ngram_range <- 2 +max_features <- 20000 +maxlen <- 400 +batch_size <- 32 +embedding_dims <- 50 +epochs <- 5 + + +# Data preparation -------------------------------------------------------- + +imdb_data <- dataset_imdb(num_words = max_features) + +print(length(imdb_data$train$x)) # train sequences +print(length(imdb_data$test$x)) # test sequences +print(sprintf("Average train sequence length: %f", mean(map_int(imdb_data$train$x, length)))) +print(sprintf("Average test sequence length: %f", mean(map_int(imdb_data$test$x, length)))) + +if(ngram_range > 1) { + + # Create set of unique n-gram from the training set. + ngrams <- imdb_data$train$x %>% + map(create_ngram_set) %>% + unlist() %>% + unique() + + # Dictionary mapping n-gram token to a unique integer. + # Integer values are greater than max_features in order + # to avoid collision with existing features. + token_indice <- data.frame( + ngrams = ngrams, + token = 1:length(ngrams) + (max_features), + stringsAsFactors = FALSE + ) + + # max_features is the highest integer that could be found in the dataset. 
+ max_features <- max(token_indice$token) + 1 + + # Augmenting x_train and x_test with n-grams features + imdb_data$train$x <- add_ngram(imdb_data$train$x, token_indice, ngram_range) + imdb_data$test$x <- add_ngram(imdb_data$test$x, token_indice, ngram_range) +} + +# pad sequences +imdb_data$train$x <- pad_sequences(imdb_data$train$x, maxlen = maxlen) +imdb_data$test$x <- pad_sequences(imdb_data$test$x, maxlen = maxlen) + + +# Model definition -------------------------------------------------------- + +model <- keras_model_sequential() + +model %>% + layer_embedding( + input_dim = max_features, output_dim = embedding_dims, + input_length = maxlen + ) %>% + layer_global_average_pooling_1d() %>% + layer_dense(1, activation = "sigmoid") + +model %>% compile( + loss = "binary_crossentropy", + optimizer = "adam", + metrics = "accuracy" +) + + +# Fitting ----------------------------------------------------------------- + +model %>% fit( + imdb_data$train$x, imdb_data$train$y, + batch_size = batch_size, + epochs = epochs, + validation_data = list(imdb_data$test$x, imdb_data$test$y) +) + diff --git a/website/articles/examples/imdb_fasttext.html b/website/articles/examples/imdb_fasttext.html new file mode 100644 index 000000000..5e2796b8f --- /dev/null +++ b/website/articles/examples/imdb_fasttext.html @@ -0,0 +1,244 @@ + + + + + + + +imdb_fasttext • keras + + + + + + + +
+
+ + + +
+
+ + + + +
+ +

This example demonstrates the use of fastText for text classification

+

Based on Joulin et al.’s paper:

+

Bags of Tricks for Efficient Text Classification https://arxiv.org/abs/1607.01759

+

Results on the IMDB dataset with uni- and bi-gram embeddings: Uni-gram: 0.8813 test accuracy after 5 epochs, 8s/epoch on an i7 CPU. Bi-gram: 0.9056 test accuracy after 5 epochs, 2s/epoch on a GTX 980M GPU.

+
library(keras)
+library(purrr)
+
+# Function definition -----------------------------------------------------
+
+create_ngram_set <- function(input_list, ngram_value = 2){
+  indices <- map(0:(length(input_list) - ngram_value), ~1:ngram_value + .x)
+  indices %>%
+    map_chr(~input_list[.x] %>% paste(collapse = "|")) %>%
+    unique()
+}
+
+add_ngram <- function(sequences, token_indice, ngram_range = 2){
+  ngrams <- map(
+    sequences, 
+    create_ngram_set, ngram_value = ngram_range
+  )
+  
+  seqs <- map2(sequences, ngrams, function(x, y){
+    tokens <- token_indice$token[token_indice$ngrams %in% y]  
+    c(x, tokens)
+  })
+  
+  seqs
+}
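+
+# A hedged usage sketch, added for illustration: for a sequence of word ids,
+# create_ngram_set() returns the unique bi-grams encoded as "a|b" strings.
+create_ngram_set(c(1, 4, 9, 4, 1, 4))
+# [1] "1|4" "4|9" "9|4" "4|1"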
+
+
+# Parameters --------------------------------------------------------------
+
+# ngram_range = 2 will add bi-gram features
+ngram_range <- 2
+max_features <- 20000
+maxlen <- 400
+batch_size <- 32
+embedding_dims <- 50
+epochs <- 5
+
+
+# Data preparation --------------------------------------------------------
+
+imdb_data <- dataset_imdb(num_words = max_features)
+
+print(length(imdb_data$train$x)) # train sequences
+print(length(imdb_data$test$x)) # test sequences
+print(sprintf("Average train sequence length: %f", mean(map_int(imdb_data$train$x, length))))
+print(sprintf("Average test sequence length: %f", mean(map_int(imdb_data$test$x, length))))
+
+if(ngram_range > 1) {
+  
+  # Create the set of unique n-grams from the training set.
+  ngrams <- imdb_data$train$x %>% 
+    map(create_ngram_set) %>%
+    unlist() %>%
+    unique()
+
+  # Dictionary mapping n-gram token to a unique integer.
+  # Integer values are greater than max_features in order
+  # to avoid collision with existing features.
+  token_indice <- data.frame(
+    ngrams = ngrams,
+    token  = 1:length(ngrams) + (max_features), 
+    stringsAsFactors = FALSE
+  )
+  
+  # max_features is the highest integer that could be found in the dataset.
+  max_features <- max(token_indice$token) + 1
+  
+  # Augmenting x_train and x_test with n-grams features
+  imdb_data$train$x <- add_ngram(imdb_data$train$x, token_indice, ngram_range)
+  imdb_data$test$x <- add_ngram(imdb_data$test$x, token_indice, ngram_range)
+}
+
+# pad sequences
+imdb_data$train$x <- pad_sequences(imdb_data$train$x, maxlen = maxlen)
+imdb_data$test$x <- pad_sequences(imdb_data$test$x, maxlen = maxlen)
+
+
+# Model definition --------------------------------------------------------
+
+model <- keras_model_sequential()
+
+model %>%
+  layer_embedding(
+    input_dim = max_features, output_dim = embedding_dims, 
+    input_length = maxlen
+    ) %>%
+  layer_global_average_pooling_1d() %>%
+  layer_dense(1, activation = "sigmoid")
+
+model %>% compile(
+  loss = "binary_crossentropy",
+  optimizer = "adam",
+  metrics = "accuracy"
+)
+
+
+# Fitting -----------------------------------------------------------------
+
+model %>% fit(
+  imdb_data$train$x, imdb_data$train$y, 
+  batch_size = batch_size,
+  epochs = epochs,
+  validation_data = list(imdb_data$test$x, imdb_data$test$y)
+)
+
+
+ + + +
+ + + +
+ + + diff --git a/website/articles/examples/imdb_lstm.R b/website/articles/examples/imdb_lstm.R new file mode 100644 index 000000000..4e3d8c983 --- /dev/null +++ b/website/articles/examples/imdb_lstm.R @@ -0,0 +1,64 @@ +#' Trains a LSTM on the IMDB sentiment classification task. +#' +#' The dataset is actually too small for LSTM to be of any advantage compared to +#' simpler, much faster methods such as TF-IDF + LogReg. +#' +#' Notes: +#' +#' - RNNs are tricky. Choice of batch size is important, choice of loss and +#' optimizer is critical, etc. Some configurations won't converge. +#' +#' - LSTM loss decrease patterns during training can be quite different from +#' what you see with CNNs/MLPs/etc. + +library(keras) + +max_features <- 20000 +maxlen <- 80 # cut texts after this number of words (among top max_features most common words) +batch_size <- 32 + +cat('Loading data...\n') +imdb <- dataset_imdb(num_words = max_features) +x_train <- imdb$train$x +y_train <- imdb$train$y +x_test <- imdb$test$x +y_test <- imdb$test$y + +cat(length(x_train), 'train sequences\n') +cat(length(x_test), 'test sequences\n') + +cat('Pad sequences (samples x time)\n') +x_train <- pad_sequences(x_train, maxlen = maxlen) +x_test <- pad_sequences(x_test, maxlen = maxlen) +cat('x_train shape:', dim(x_train), '\n') +cat('x_test shape:', dim(x_test), '\n') + +cat('Build model...\n') +model <- keras_model_sequential() +model %>% + layer_embedding(input_dim = max_features, output_dim = 128) %>% + layer_lstm(units = 64, dropout = 0.2, recurrent_dropout = 0.2) %>% + layer_dense(units = 1, activation = 'sigmoid') + +# try using different optimizers and different optimizer configs +model %>% compile( + loss = 'binary_crossentropy', + optimizer = 'adam', + metrics = c('accuracy') +) + +cat('Train...\n') +model %>% fit( + x_train, y_train, + batch_size = batch_size, + epochs = 15, + validation_data = list(x_test, y_test) +) +scores <- model %>% evaluate( + x_test, y_test, + batch_size = batch_size +) +cat('Test score:', scores[[1]]) +cat('Test accuracy', scores[[2]]) + + diff --git a/website/articles/examples/imdb_lstm.html b/website/articles/examples/imdb_lstm.html new file mode 100644 index 000000000..8a7e30adc --- /dev/null +++ b/website/articles/examples/imdb_lstm.html @@ -0,0 +1,192 @@ + + + + + + + +imdb_lstm • keras + + + + + + + +
+
+ + + +
+
+ + + + +
+ +

Trains an LSTM on the IMDB sentiment classification task.

+

The dataset is actually too small for LSTM to be of any advantage compared to simpler, much faster methods such as TF-IDF + LogReg.

+

Notes:

+
    +
  • RNNs are tricky. Choice of batch size is important, choice of loss and optimizer is critical, etc. Some configurations won’t converge.

  • LSTM loss decrease patterns during training can be quite different from what you see with CNNs/MLPs/etc.
+
library(keras)
+
+max_features <- 20000
+maxlen <- 80  # cut texts after this number of words (among top max_features most common words)
+batch_size <- 32
+
+cat('Loading data...\n')
+imdb <- dataset_imdb(num_words = max_features)
+x_train <- imdb$train$x
+y_train <- imdb$train$y
+x_test <- imdb$test$x
+y_test <- imdb$test$y
+
+cat(length(x_train), 'train sequences\n')
+cat(length(x_test), 'test sequences\n')
+
+cat('Pad sequences (samples x time)\n')
+x_train <- pad_sequences(x_train, maxlen = maxlen)
+x_test <- pad_sequences(x_test, maxlen = maxlen)
+cat('x_train shape:', dim(x_train), '\n')
+cat('x_test shape:', dim(x_test), '\n')
+
+cat('Build model...\n')
+model <- keras_model_sequential()
+model %>%
+  layer_embedding(input_dim = max_features, output_dim = 128) %>% 
+  layer_lstm(units = 64, dropout = 0.2, recurrent_dropout = 0.2) %>% 
+  layer_dense(units = 1, activation = 'sigmoid')
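+
+# A hedged note, not in the original script: in layer_lstm() above, dropout
+# is the fraction of input connections dropped, while recurrent_dropout is
+# the fraction of recurrent-state connections dropped between timesteps.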
+
+# try using different optimizers and different optimizer configs
+model %>% compile(
+  loss = 'binary_crossentropy',
+  optimizer = 'adam',
+  metrics = c('accuracy')
+)
+
+cat('Train...\n')
+model %>% fit(
+  x_train, y_train,
+  batch_size = batch_size,
+  epochs = 15,
+  validation_data = list(x_test, y_test)
+)
+scores <- model %>% evaluate(
+  x_test, y_test,
+  batch_size = batch_size
+)
+cat('Test score:', scores[[1]], '\n')
+cat('Test accuracy:', scores[[2]], '\n')
+
+
+ + + +
+ + + +
+ + + diff --git a/website/articles/examples/lstm_benchmark.R b/website/articles/examples/lstm_benchmark.R new file mode 100644 index 000000000..0ab13681d --- /dev/null +++ b/website/articles/examples/lstm_benchmark.R @@ -0,0 +1 @@ +library(keras) diff --git a/website/articles/examples/lstm_benchmark.html b/website/articles/examples/lstm_benchmark.html new file mode 100644 index 000000000..039a698e7 --- /dev/null +++ b/website/articles/examples/lstm_benchmark.html @@ -0,0 +1,137 @@ + + + + + + + +lstm_benchmark • keras + + + + + + + +
+
+ + + +
+
+ + + + + +
+ + + +
+ + + +
+ + + diff --git a/website/articles/examples/lstm_benchmark.py b/website/articles/examples/lstm_benchmark.py new file mode 100755 index 000000000..009f25bdc --- /dev/null +++ b/website/articles/examples/lstm_benchmark.py @@ -0,0 +1,88 @@ +'''Compare LSTM implementations on the IMDB sentiment classification task. + +implementation=0 preprocesses input to the LSTM which typically results in +faster computations at the expense of increased peak memory usage as the +preprocessed input must be kept in memory. + +implementation=1 does away with the preprocessing, meaning that it might take +a little longer, but should require less peak memory. + +implementation=2 concatenates the input, output and forget gate's weights +into one, large matrix, resulting in faster computation time as the GPU can +utilize more cores, at the expense of reduced regularization because the same +dropout is shared across the gates. + +Note that the relative performance of the different implementations can +vary depending on your device, your model and the size of your data. +''' + +import time +import numpy as np +import matplotlib.pyplot as plt + +from keras.preprocessing import sequence +from keras.models import Sequential +from keras.layers import Embedding, Dense, LSTM, Dropout +from keras.datasets import imdb + +max_features = 20000 +max_length = 80 +embedding_dim = 256 +batch_size = 128 +epochs = 10 +modes = [0, 1, 2] + +print('Loading data...') +(X_train, y_train), (X_test, y_test) = imdb.load_data(num_words=max_features) +X_train = sequence.pad_sequences(X_train, max_length) +X_test = sequence.pad_sequences(X_test, max_length) + +# Compile and train different models while meauring performance. +results = [] +for mode in modes: + print('Testing mode: implementation={}'.format(mode)) + + model = Sequential() + model.add(Embedding(max_features, embedding_dim, + input_length=max_length)) + model.add(Dropout(0.2)) + model.add(LSTM(embedding_dim, + dropout=0.2, + recurrent_dropout=0.2, + implementation=mode)) + model.add(Dense(1, activation='sigmoid')) + model.compile(loss='binary_crossentropy', + optimizer='adam', + metrics=['accuracy']) + + start_time = time.time() + history = model.fit(X_train, y_train, + batch_size=batch_size, + epochs=epochs, + validation_data=(X_test, y_test)) + average_time_per_epoch = (time.time() - start_time) / epochs + + results.append((history, average_time_per_epoch)) + +# Compare models' accuracy, loss and elapsed time per epoch. +plt.style.use('ggplot') +ax1 = plt.subplot2grid((2, 2), (0, 0)) +ax1.set_title('Accuracy') +ax1.set_ylabel('Validation Accuracy') +ax1.set_xlabel('Epochs') +ax2 = plt.subplot2grid((2, 2), (1, 0)) +ax2.set_title('Loss') +ax2.set_ylabel('Validation Loss') +ax2.set_xlabel('Epochs') +ax3 = plt.subplot2grid((2, 2), (0, 1), rowspan=2) +ax3.set_title('Time') +ax3.set_ylabel('Seconds') +for mode, result in zip(modes, results): + ax1.plot(result[0].epoch, result[0].history['val_acc'], label=mode) + ax2.plot(result[0].epoch, result[0].history['val_loss'], label=mode) +ax1.legend() +ax2.legend() +ax3.bar(np.arange(len(results)), [x[1] for x in results], + tick_label=modes, align='center') +plt.tight_layout() +plt.show() diff --git a/website/articles/examples/lstm_text_generation.R b/website/articles/examples/lstm_text_generation.R new file mode 100644 index 000000000..aceb554e8 --- /dev/null +++ b/website/articles/examples/lstm_text_generation.R @@ -0,0 +1,137 @@ +#' Example script to generate text from Nietzsche's writings. 
+#' +#' At least 20 epochs are required before the generated text starts sounding +#' coherent. +#' +#' It is recommended to run this script on GPU, as recurrent networks are quite +#' computationally intensive. +#' +#' If you try this script on new data, make sure your corpus has at least ~100k +#' characters. ~1M is better. +#' + +library(keras) +library(readr) +library(stringr) +library(purrr) +library(tokenizers) + + +# Parameters -------------------------------------------------------------- + +maxlen <- 40 + +# Data preparation -------------------------------------------------------- + +path <- get_file( + 'nietzsche.txt', + origin='https://s3.amazonaws.com/text-datasets/nietzsche.txt' + ) + +text <- read_lines(path) %>% + str_to_lower() %>% + str_c(collapse = "\n") %>% + tokenize_characters(strip_non_alphanum = FALSE, simplify = TRUE) + +print(sprintf("corpus length: %d", length(text))) + +chars <- text %>% + unique() %>% + sort() + +print(sprintf("total chars: %d", length(chars))) + +# cut the text in semi-redundant sequences of maxlen characters +dataset <- map( + seq(1, length(text) - maxlen - 1, by = 3), + ~list(sentece = text[.x:(.x + maxlen - 1)], next_char = text[.x + maxlen]) + ) + +dataset <- transpose(dataset) + +# vectorization +X <- array(0, dim = c(length(dataset$sentece), maxlen, length(chars))) +y <- array(0, dim = c(length(dataset$sentece), length(chars))) + +for(i in 1:length(dataset$sentece)){ + + X[i,,] <- sapply(chars, function(x){ + as.integer(x == dataset$sentece[[i]]) + }) + + y[i,] <- as.integer(chars == dataset$next_char[[i]]) + +} + +# Model definition -------------------------------------------------------- + +model <- keras_model_sequential() + +model %>% + layer_lstm(128, input_shape = c(maxlen, length(chars))) %>% + layer_dense(length(chars)) %>% + layer_activation("softmax") + +optimizer <- optimizer_rmsprop(lr = 0.01) + +model %>% compile( + loss = "categorical_crossentropy", + optimizer = optimizer +) + + +# Training and results ---------------------------------------------------- + +sample_mod <- function(preds, temperature = 1){ + preds <- log(preds)/temperature + exp_preds <- exp(preds) + preds <- exp_preds/sum(exp(preds)) + + rmultinom(1, 1, preds) %>% + as.integer() %>% + which.max() +} + +for(iteration in 1:60){ + + cat(sprintf("iteration: %02d ---------------\n\n", iteration)) + + model %>% fit( + X, y, + batch_size = 128, + epochs = 1 + ) + + for(diversity in c(0.2, 0.5, 1, 1.2)){ + + cat(sprintf("diversity: %f ---------------\n\n", diversity)) + + start_index <- sample(1:(length(text) - maxlen), size = 1) + sentence <- text[start_index:(start_index + maxlen - 1)] + generated <- "" + + for(i in 1:400){ + + x <- sapply(chars, function(x){ + as.integer(x == sentence) + }) + dim(x) <- c(1, dim(x)) + + preds <- predict(model, x) + next_index <- sample_mod(preds, diversity) + next_char <- chars[next_index] + + generated <- str_c(generated, next_char, collapse = "") + sentence <- c(sentence[-1], next_char) + + } + + cat(generated) + cat("\n\n") + + } +} + + + + diff --git a/website/articles/examples/lstm_text_generation.html b/website/articles/examples/lstm_text_generation.html new file mode 100644 index 000000000..f0feb29d8 --- /dev/null +++ b/website/articles/examples/lstm_text_generation.html @@ -0,0 +1,261 @@ + + + + + + + +lstm_text_generation • keras + + + + + + + +
+
+ + + +
+
+ + + + +
+ +

Example script to generate text from Nietzsche’s writings.

+

At least 20 epochs are required before the generated text starts sounding coherent.

+

It is recommended to run this script on GPU, as recurrent networks are quite computationally intensive.

+

If you try this script on new data, make sure your corpus has at least ~100k characters. ~1M is better.

+
library(keras)
+library(readr)
+library(stringr)
+library(purrr)
+library(tokenizers)
+
+
+# Parameters --------------------------------------------------------------
+
+maxlen <- 40
+
+# Data preparation --------------------------------------------------------
+
+path <- get_file(
+  'nietzsche.txt', 
+  origin='https://s3.amazonaws.com/text-datasets/nietzsche.txt'
+  )
+
+text <- read_lines(path) %>%
+  str_to_lower() %>%
+  str_c(collapse = "\n") %>%
+  tokenize_characters(strip_non_alphanum = FALSE, simplify = TRUE)
+
+print(sprintf("corpus length: %d", length(text)))
+
+chars <- text %>%
+  unique() %>%
+  sort()
+
+print(sprintf("total chars: %d", length(chars)))  
+
+# cut the text in semi-redundant sequences of maxlen characters
+dataset <- map(
+  seq(1, length(text) - maxlen - 1, by = 3), 
+  ~list(sentence = text[.x:(.x + maxlen - 1)], next_char = text[.x + maxlen])
+  )
+
+dataset <- transpose(dataset)
+
+# vectorization
+X <- array(0, dim = c(length(dataset$sentence), maxlen, length(chars)))
+y <- array(0, dim = c(length(dataset$sentence), length(chars)))
+
+for(i in 1:length(dataset$sentence)){
+  
+  X[i,,] <- sapply(chars, function(x){
+    as.integer(x == dataset$sentence[[i]])
+  })
+  
+  y[i,] <- as.integer(chars == dataset$next_char[[i]])
+  
+}
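+
+# An illustrative check, not in the original script: X is now a one-hot array
+# of shape (samples, maxlen, chars) and each row of y one-hot encodes the next
+# character; the exact dimensions depend on the downloaded corpus.
+dim(X)
+dim(y)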
+
+# Model definition --------------------------------------------------------
+
+model <- keras_model_sequential()
+
+model %>%
+  layer_lstm(128, input_shape = c(maxlen, length(chars))) %>%
+  layer_dense(length(chars)) %>%
+  layer_activation("softmax")
+
+optimizer <- optimizer_rmsprop(lr = 0.01)
+
+model %>% compile(
+  loss = "categorical_crossentropy", 
+  optimizer = optimizer
+)
+
+
+# Training and results ----------------------------------------------------
+
+sample_mod <- function(preds, temperature = 1){
+  preds <- log(preds)/temperature
+  exp_preds <- exp(preds)
+  preds <- exp_preds/sum(exp_preds)
+  
+  rmultinom(1, 1, preds) %>% 
+    as.integer() %>%
+    which.max()
+}
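+
+# An illustrative sketch, not in the original script: lower temperatures
+# sharpen the distribution, higher ones flatten it. For p = c(0.7, 0.2, 0.1):
+p <- c(0.7, 0.2, 0.1)
+round(exp(log(p) / 0.2) / sum(exp(log(p) / 0.2)), 3)  # ~ 0.998 0.002 0.000
+round(exp(log(p) / 1.2) / sum(exp(log(p) / 1.2)), 3)  # ~ 0.645 0.227 0.127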
+
+for(iteration in 1:60){
+  
+  cat(sprintf("iteration: %02d ---------------\n\n", iteration))
+  
+  model %>% fit(
+    X, y,
+    batch_size = 128,
+    epochs = 1
+  )
+  
+  for(diversity in c(0.2, 0.5, 1, 1.2)){
+    
+    cat(sprintf("diversity: %f ---------------\n\n", diversity))
+    
+    start_index <- sample(1:(length(text) - maxlen), size = 1)
+    sentence <- text[start_index:(start_index + maxlen - 1)]
+    generated <- ""
+    
+    for(i in 1:400){
+      
+      x <- sapply(chars, function(x){
+        as.integer(x == sentence)
+      })
+      dim(x) <- c(1, dim(x))
+      
+      preds <- predict(model, x)
+      next_index <- sample_mod(preds, diversity)
+      next_char <- chars[next_index]
+      
+      generated <- str_c(generated, next_char, collapse = "")
+      sentence <- c(sentence[-1], next_char)
+      
+    }
+    
+    cat(generated)
+    cat("\n\n")
+    
+  }
+}
+
+
+ + + +
+ + + +
+ + + diff --git a/website/articles/examples/mnist_acgan.R b/website/articles/examples/mnist_acgan.R new file mode 100644 index 000000000..df099c53f --- /dev/null +++ b/website/articles/examples/mnist_acgan.R @@ -0,0 +1,351 @@ +#' Train an Auxiliary Classifier Generative Adversarial Network (ACGAN) on the +#' MNIST dataset. See https://arxiv.org/abs/1610.09585 for more details. +#' +#' You should start to see reasonable images after ~5 epochs, and good images by +#' ~15 epochs. You should use a GPU, as the convolution-heavy operations are +#' very slow on the CPU. Prefer the TensorFlow backend if you plan on iterating, +#' as the compilation time can be a blocker using Theano. +#' +#' Hardware | Backend | Time / Epoch +#' | -----------------| ------- | ------------------- | +#' | CPU | TF | 3 hrs | +#' | Titan X (maxwell) | TF | 4 min | +#' | Titan X (maxwell) | TH | 7 min | +#' + +library(keras) +library(progress) +library(abind) +K <- keras::backend() +K$set_image_data_format('channels_first') + +# Functions --------------------------------------------------------------- + +build_generator <- function(latent_size){ + + # we will map a pair of (z, L), where z is a latent vector and L is a + # label drawn from P_c, to image space (..., 1, 28, 28) + cnn <- keras_model_sequential() + + cnn %>% + layer_dense(1024, input_shape = latent_size, activation = "relu") %>% + layer_dense(128*7*7, activation = "relu") %>% + layer_reshape(c(128, 7, 7)) %>% + # upsample to (..., 14, 14) + layer_upsampling_2d(size = c(2, 2)) %>% + layer_conv_2d( + 256, c(5,5), padding = "same", activation = "relu", + kernel_initializer = "glorot_normal" + ) %>% + # upsample to (..., 28, 28) + layer_upsampling_2d(size = c(2, 2)) %>% + layer_conv_2d( + 128, c(5,5), padding = "same", activation = "tanh", + kernel_initializer = "glorot_normal" + ) %>% + # take a channel axis reduction + layer_conv_2d( + 1, c(2,2), padding = "same", activation = "tanh", + kernel_initializer = "glorot_normal" + ) + + + # this is the z space commonly refered to in GAN papers + latent <- layer_input(shape = list(latent_size)) + + # this will be our label + image_class <- layer_input(shape = list(1)) + + # 10 classes in MNIST + cls <- image_class %>% + layer_embedding( + input_dim = 10, output_dim = latent_size, + embeddings_initializer='glorot_normal' + ) %>% + layer_flatten() + + + # hadamard product between z-space and a class conditional embedding + h <- layer_multiply(list(latent, cls)) + + fake_image <- cnn(h) + + keras_model(list(latent, image_class), fake_image) +} + +build_discriminator <- function(){ + + # build a relatively standard conv net, with LeakyReLUs as suggested in + # the reference paper + cnn <- keras_model_sequential() + + cnn %>% + layer_conv_2d( + 32, c(3,3), padding = "same", strides = c(2,2), + input_shape = c(1, 28, 28) + ) %>% + layer_activation_leaky_relu() %>% + layer_dropout(0.3) %>% + + layer_conv_2d(64, c(3, 3), padding = "same", strides = c(1,1)) %>% + layer_activation_leaky_relu() %>% + layer_dropout(0.3) %>% + + layer_conv_2d(128, c(3, 3), padding = "same", strides = c(2,2)) %>% + layer_activation_leaky_relu() %>% + layer_dropout(0.3) %>% + + layer_conv_2d(256, c(3, 3), padding = "same", strides = c(1,1)) %>% + layer_activation_leaky_relu() %>% + layer_dropout(0.3) %>% + + layer_flatten() + + + + image <- layer_input(shape = c(1, 28, 28)) + features <- cnn(image) + + # first output (name=generation) is whether or not the discriminator + # thinks the image that is being shown is fake, and the second output + # 
(name=auxiliary) is the class that the discriminator thinks the image + # belongs to. + fake <- features %>% + layer_dense(1, activation = "sigmoid", name = "generation") + + aux <- features %>% + layer_dense(10, activation = "softmax", name = "auxiliary") + + keras_model(image, list(fake, aux)) +} + +# Parameters -------------------------------------------------------------- + +# batch and latent size taken from the paper +epochs <- 50 +batch_size <- 100 +latent_size <- 100 + +# Adam parameters suggested in https://arxiv.org/abs/1511.06434 +adam_lr <- 0.00005 +adam_beta_1 <- 0.5 + +# Model definition -------------------------------------------------------- + +# build the discriminator +discriminator <- build_discriminator() +discriminator %>% compile( + optimizer = optimizer_adam(lr = adam_lr, beta_1 = adam_beta_1), + loss = list("binary_crossentropy", "sparse_categorical_crossentropy") +) + +# build the generator +generator <- build_generator(latent_size) +generator %>% compile( + optimizer = optimizer_adam(lr = adam_lr, beta_1 = adam_beta_1), + loss = "binary_crossentropy" +) + +latent <- layer_input(shape = list(latent_size)) +image_class <- layer_input(shape = list(1), dtype = "int32") + +fake <- generator(list(latent, image_class)) + +# we only want to be able to train generation for the combined model + +discriminator$trainable <- FALSE +results <- discriminator(fake) + +combined <- keras_model(list(latent, image_class), results) +combined %>% compile( + optimizer = optimizer_adam(lr = adam_lr, beta_1 = adam_beta_1), + loss = list("binary_crossentropy", "sparse_categorical_crossentropy") +) + + +# Data preparation -------------------------------------------------------- + +# get our mnist data, and force it to be of shape (..., 1, 28, 28) with +# range [-1, 1] +mnist <- dataset_mnist() +mnist$train$x <- (mnist$train$x - 127.5)/127.5 +mnist$test$x <- (mnist$test$x - 127.5)/127.5 +dim(mnist$train$x) <- c(60000, 1, 28, 28) +dim(mnist$test$x) <- c(10000, 1, 28, 28) + +num_train <- dim(mnist$train$x)[1] +num_test <- dim(mnist$test$x)[1] + +# Training ---------------------------------------------------------------- + +for(epoch in 1:epochs){ + + num_batches <- trunc(num_train/batch_size) + pb <- progress_bar$new( + total = num_batches, + format = sprintf("epoch %s/%s :elapsed [:bar] :percent :eta", epoch, epochs), + clear = FALSE + ) + + epoch_gen_loss <- NULL + epoch_disc_loss <- NULL + + possible_indexes <- 1:num_train + + for(index in 1:num_batches){ + + pb$tick() + + # generate a new batch of noise + noise <- runif(n = batch_size*latent_size, min = -1, max = 1) %>% + matrix(nrow = batch_size, ncol = latent_size) + + # get a batch of real images + batch <- sample(possible_indexes, size = batch_size) + possible_indexes <- possible_indexes[!possible_indexes %in% batch] + image_batch <- mnist$train$x[batch,,,,drop = FALSE] + label_batch <- mnist$train$y[batch] + + # sample some labels from p_c + sampled_labels <- sample(0:9, batch_size, replace = TRUE) %>% + matrix(ncol = 1) + + # generate a batch of fake images, using the generated labels as a + # conditioner. 
We reshape the sampled labels to be + # (batch_size, 1) so that we can feed them into the embedding + # layer as a length one sequence + generated_images <- predict(generator, list(noise, sampled_labels)) + + X <- abind(image_batch, generated_images, along = 1) + y <- c(rep(1L, batch_size), rep(0L, batch_size)) %>% matrix(ncol = 1) + aux_y <- c(label_batch, sampled_labels) %>% matrix(ncol = 1) + + # see if the discriminator can figure itself out... + disc_loss <- train_on_batch( + discriminator, x = X, + y = list(y, aux_y) + ) + + epoch_disc_loss <- rbind(epoch_disc_loss, unlist(disc_loss)) + + # make new noise. we generate 2 * batch size here such that we have + # the generator optimize over an identical number of images as the + # discriminator + noise <- runif(2*batch_size*latent_size, min = -1, max = 1) %>% + matrix(nrow = 2*batch_size, ncol = latent_size) + sampled_labels <- sample(0:9, size = 2*batch_size, replace = TRUE) %>% + matrix(ncol = 1) + + # we want to train the generator to trick the discriminator + # For the generator, we want all the {fake, not-fake} labels to say + # not-fake + trick <- rep(1, 2*batch_size) %>% matrix(ncol = 1) + + combined_loss <- train_on_batch( + combined, + list(noise, sampled_labels), + list(trick, sampled_labels) + ) + + epoch_gen_loss <- rbind(epoch_gen_loss, unlist(combined_loss)) + + } + + cat(sprintf("\nTesting for epoch %02d:", epoch)) + + # evaluate the testing loss here + + # generate a new batch of noise + noise <- runif(num_test*latent_size, min = -1, max = 1) %>% + matrix(nrow = num_test, ncol = latent_size) + + # sample some labels from p_c and generate images from them + sampled_labels <- sample(0:9, size = num_test, replace = TRUE) %>% + matrix(ncol = 1) + generated_images <- predict(generator, list(noise, sampled_labels)) + + X <- abind(mnist$test$x, generated_images, along = 1) + y <- c(rep(1, num_test), rep(0, num_test)) %>% matrix(ncol = 1) + aux_y <- c(mnist$test$y, sampled_labels) %>% matrix(ncol = 1) + + # see if the discriminator can figure itself out... 
+ discriminator_test_loss <- evaluate( + discriminator, X, list(y, aux_y), + verbose = FALSE + ) %>% unlist() + + discriminator_train_loss <- apply(epoch_disc_loss, 2, mean) + + # make new noise + noise <- runif(2*num_test*latent_size, min = -1, max = 1) %>% + matrix(nrow = 2*num_test, ncol = latent_size) + sampled_labels <- sample(0:9, size = 2*num_test, replace = TRUE) %>% + matrix(ncol = 1) + + trick <- rep(1, 2*num_test) %>% matrix(ncol = 1) + + generator_test_loss = combined %>% evaluate( + list(noise, sampled_labels), + list(trick, sampled_labels), + verbose = FALSE + ) + + generator_train_loss <- apply(epoch_gen_loss, 2, mean) + + + # generate an epoch report on performance + row_fmt <- "\n%22s : loss %4.2f | %5.2f | %5.2f" + cat(sprintf( + row_fmt, + "generator (train)", + generator_train_loss[1], + generator_train_loss[2], + generator_train_loss[3] + )) + cat(sprintf( + row_fmt, + "generator (test)", + generator_test_loss[1], + generator_test_loss[2], + generator_test_loss[3] + )) + + cat(sprintf( + row_fmt, + "discriminator (train)", + discriminator_train_loss[1], + discriminator_train_loss[2], + discriminator_train_loss[3] + )) + cat(sprintf( + row_fmt, + "discriminator (test)", + discriminator_test_loss[1], + discriminator_test_loss[2], + discriminator_test_loss[3] + )) + + cat("\n") + + # generate some digits to display + noise <- runif(10*latent_size, min = -1, max = 1) %>% + matrix(nrow = 10, ncol = latent_size) + + sampled_labels <- 0:9 %>% + matrix(ncol = 1) + + # get a batch to display + generated_images <- predict( + generator, + list(noise, sampled_labels) + ) + + img <- NULL + for(i in 1:10){ + img <- cbind(img, generated_images[i,,,]) + } + + ((img + 1)/2) %>% as.raster() %>% + plot() + +} diff --git a/website/articles/examples/mnist_acgan.html b/website/articles/examples/mnist_acgan.html new file mode 100644 index 000000000..22af3e1de --- /dev/null +++ b/website/articles/examples/mnist_acgan.html @@ -0,0 +1,498 @@ + + + + + + + +mnist_acgan • keras + + + + + + + +
+
+ + + +
+
+ + + + +
+ +

Train an Auxiliary Classifier Generative Adversarial Network (ACGAN) on the MNIST dataset. See https://arxiv.org/abs/1610.09585 for more details.

+

You should start to see reasonable images after ~5 epochs, and good images by ~15 epochs. You should use a GPU, as the convolution-heavy operations are very slow on the CPU. Prefer the TensorFlow backend if you plan on iterating, as the compilation time can be a blocker when using Theano.

+ + + + + + + + + + + + + + + + + + + + + + + +
HardwareBackendTime / Epoch
CPUTF3 hrs
Titan X (maxwell)TF4 min
Titan X (maxwell)TH7 min
+
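Before kicking off a long run, it can help to confirm which backend and image data format are active. A small optional check (this snippet is an illustration, using the same backend object the script creates below):
K <- keras::backend()
+K$backend()              # backend name, e.g. "tensorflow"
+K$image_data_format()    # the script switches this to "channels_first" below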
library(keras)
+library(progress)
+library(abind)
+K <- keras::backend()
+K$set_image_data_format('channels_first')
+
+# Functions ---------------------------------------------------------------
+
+build_generator <- function(latent_size){
+  
+  # we will map a pair of (z, L), where z is a latent vector and L is a
+  # label drawn from P_c, to image space (..., 1, 28, 28)
+  cnn <- keras_model_sequential()
+  
+  cnn %>%
+    layer_dense(1024, input_shape = latent_size, activation = "relu") %>%
+    layer_dense(128*7*7, activation = "relu") %>%
+    layer_reshape(c(128, 7, 7)) %>%
+    # upsample to (..., 14, 14)
+    layer_upsampling_2d(size = c(2, 2)) %>%
+    layer_conv_2d(
+      256, c(5,5), padding = "same", activation = "relu",
+      kernel_initializer = "glorot_normal"
+    ) %>%
+    # upsample to (..., 28, 28)
+    layer_upsampling_2d(size = c(2, 2)) %>%
+    layer_conv_2d(
+      128, c(5,5), padding = "same", activation = "tanh",
+      kernel_initializer = "glorot_normal"
+    ) %>%
+    # take a channel axis reduction
+    layer_conv_2d(
+      1, c(2,2), padding = "same", activation = "tanh",
+      kernel_initializer = "glorot_normal"
+    )
+  
+  
+  # this is the z space commonly referred to in GAN papers
+  latent <- layer_input(shape = list(latent_size))
+  
+  # this will be our label
+  image_class <- layer_input(shape = list(1))
+  
+  # 10 classes in MNIST
+  cls <-  image_class %>%
+    layer_embedding(
+      input_dim = 10, output_dim = latent_size, 
+      embeddings_initializer='glorot_normal'
+    ) %>%
+    layer_flatten()
+  
+  
+  # hadamard product between z-space and a class conditional embedding
+  h <- layer_multiply(list(latent, cls))
+  
+  fake_image <- cnn(h)
+  
+  keras_model(list(latent, image_class), fake_image)
+}
+
+build_discriminator <- function(){
+  
+  # build a relatively standard conv net, with LeakyReLUs as suggested in
+  # the reference paper
+  cnn <- keras_model_sequential()
+  
+  cnn %>%
+    layer_conv_2d(
+      32, c(3,3), padding = "same", strides = c(2,2),
+      input_shape = c(1, 28, 28)
+    ) %>%
+    layer_activation_leaky_relu() %>%
+    layer_dropout(0.3) %>%
+    
+    layer_conv_2d(64, c(3, 3), padding = "same", strides = c(1,1)) %>%
+    layer_activation_leaky_relu() %>%
+    layer_dropout(0.3) %>%  
+    
+    layer_conv_2d(128, c(3, 3), padding = "same", strides = c(2,2)) %>%
+    layer_activation_leaky_relu() %>%
+    layer_dropout(0.3) %>%  
+    
+    layer_conv_2d(256, c(3, 3), padding = "same", strides = c(1,1)) %>%
+    layer_activation_leaky_relu() %>%
+    layer_dropout(0.3) %>%  
+    
+    layer_flatten()
+  
+  
+  
+  image <- layer_input(shape = c(1, 28, 28))
+  features <- cnn(image)
+  
+  # first output (name=generation) is whether or not the discriminator
+  # thinks the image that is being shown is fake, and the second output
+  # (name=auxiliary) is the class that the discriminator thinks the image
+  # belongs to.
+  fake <- features %>% 
+    layer_dense(1, activation = "sigmoid", name = "generation")
+  
+  aux <- features %>%
+    layer_dense(10, activation = "softmax", name = "auxiliary")
+  
+  keras_model(image, list(fake, aux))
+}
+
+# Parameters --------------------------------------------------------------
+
+# batch and latent size taken from the paper
+epochs <- 50
+batch_size <- 100
+latent_size <- 100
+
+# Adam parameters suggested in https://arxiv.org/abs/1511.06434
+adam_lr <- 0.00005 
+adam_beta_1 <- 0.5
+
+# Model definition --------------------------------------------------------
+
+# build the discriminator
+discriminator <- build_discriminator()
+discriminator %>% compile(
+  optimizer = optimizer_adam(lr = adam_lr, beta_1 = adam_beta_1),
+  loss = list("binary_crossentropy", "sparse_categorical_crossentropy")
+)
+
+# build the generator
+generator <- build_generator(latent_size)
+generator %>% compile(
+  optimizer = optimizer_adam(lr = adam_lr, beta_1 = adam_beta_1),
+  loss = "binary_crossentropy"
+)
+
+latent <- layer_input(shape = list(latent_size))
+image_class <- layer_input(shape = list(1), dtype = "int32")
+
+fake <- generator(list(latent, image_class))
+
+# we only want to be able to train generation for the combined model
+
+discriminator$trainable <- FALSE
+results <- discriminator(fake)
+
+combined <- keras_model(list(latent, image_class), results)
+combined %>% compile(
+  optimizer = optimizer_adam(lr = adam_lr, beta_1 = adam_beta_1),
+  loss = list("binary_crossentropy", "sparse_categorical_crossentropy")
+)
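+# (note: trainable <- FALSE is captured when a model is compiled, so the
+# freeze applies only to `combined`; the standalone `discriminator`,
+# compiled earlier, still updates its weights when train_on_batch()
+# is called on it directly in the training loop below)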
+
+
+# Data preparation --------------------------------------------------------
+
+# get our mnist data, and force it to be of shape (..., 1, 28, 28) with
+# range [-1, 1]
+mnist <- dataset_mnist()
+mnist$train$x <- (mnist$train$x - 127.5)/127.5
+mnist$test$x <- (mnist$test$x - 127.5)/127.5
+dim(mnist$train$x) <- c(60000, 1, 28, 28) 
+dim(mnist$test$x) <- c(10000, 1, 28, 28) 
+
+num_train <- dim(mnist$train$x)[1]
+num_test <- dim(mnist$test$x)[1]
+
+# Training ----------------------------------------------------------------
+
+for(epoch in 1:epochs){
+  
+  num_batches <- trunc(num_train/batch_size)
+  pb <- progress_bar$new(
+    total = num_batches, 
+    format = sprintf("epoch %s/%s :elapsed [:bar] :percent :eta", epoch, epochs),
+    clear = FALSE
+  )
+  
+  epoch_gen_loss <- NULL
+  epoch_disc_loss <- NULL
+  
+  possible_indexes <- 1:num_train
+  
+  for(index in 1:num_batches){
+    
+    pb$tick()
+    
+    # generate a new batch of noise
+    noise <- runif(n = batch_size*latent_size, min = -1, max = 1) %>%
+      matrix(nrow = batch_size, ncol = latent_size)
+    
+    # get a batch of real images
+    batch <- sample(possible_indexes, size = batch_size)
+    possible_indexes <- possible_indexes[!possible_indexes %in% batch]
+    image_batch <- mnist$train$x[batch,,,,drop = FALSE]
+    label_batch <- mnist$train$y[batch]
+    
+    # sample some labels from p_c
+    sampled_labels <- sample(0:9, batch_size, replace = TRUE) %>%
+      matrix(ncol = 1)
+    
+    # generate a batch of fake images, using the generated labels as a
+    # conditioner. We reshape the sampled labels to be
+    # (batch_size, 1) so that we can feed them into the embedding
+    # layer as a length one sequence
+    generated_images <- predict(generator, list(noise, sampled_labels))
+    
+    X <- abind(image_batch, generated_images, along = 1)
+    y <- c(rep(1L, batch_size), rep(0L, batch_size)) %>% matrix(ncol = 1)
+    aux_y <- c(label_batch, sampled_labels) %>% matrix(ncol = 1)
+    
+    # see if the discriminator can figure itself out...
+    disc_loss <- train_on_batch(
+      discriminator, x = X, 
+      y = list(y, aux_y)
+    )
+    
+    epoch_disc_loss <- rbind(epoch_disc_loss, unlist(disc_loss))
+    
+    # make new noise. we generate 2 * batch_size here so that the
+    # generator optimizes over the same number of images as the
+    # discriminator
+    noise <- runif(2*batch_size*latent_size, min = -1, max = 1) %>%
+      matrix(nrow = 2*batch_size, ncol = latent_size)
+    sampled_labels <- sample(0:9, size = 2*batch_size, replace = TRUE) %>%
+      matrix(ncol = 1)
+    
+    # we want to train the generator to trick the discriminator
+    # For the generator, we want all the {fake, not-fake} labels to say
+    # not-fake
+    trick <- rep(1, 2*batch_size) %>% matrix(ncol = 1)
+    
+    combined_loss <- train_on_batch(
+      combined, 
+      list(noise, sampled_labels),
+      list(trick, sampled_labels)
+    )
+    
+    epoch_gen_loss <- rbind(epoch_gen_loss, unlist(combined_loss))
+    
+  }
+  
+  cat(sprintf("\nTesting for epoch %02d:", epoch))
+  
+  # evaluate the testing loss here
+  
+  # generate a new batch of noise
+  noise <- runif(num_test*latent_size, min = -1, max = 1) %>%
+    matrix(nrow = num_test, ncol = latent_size)
+  
+  # sample some labels from p_c and generate images from them
+  sampled_labels <- sample(0:9, size = num_test, replace = TRUE) %>%
+    matrix(ncol = 1)
+  generated_images <- predict(generator, list(noise, sampled_labels))
+  
+  X <- abind(mnist$test$x, generated_images, along = 1)
+  y <- c(rep(1, num_test), rep(0, num_test)) %>% matrix(ncol = 1)
+  aux_y <- c(mnist$test$y, sampled_labels) %>% matrix(ncol = 1)
+  
+  # see if the discriminator can figure itself out...
+  discriminator_test_loss <- evaluate(
+    discriminator, X, list(y, aux_y), 
+    verbose = FALSE
+  ) %>% unlist()
+  
+  discriminator_train_loss <- apply(epoch_disc_loss, 2, mean)
+  
+  # make new noise
+  noise <- runif(2*num_test*latent_size, min = -1, max = 1) %>%
+    matrix(nrow = 2*num_test, ncol = latent_size)
+  sampled_labels <- sample(0:9, size = 2*num_test, replace = TRUE) %>%
+    matrix(ncol = 1)
+  
+  trick <- rep(1, 2*num_test) %>% matrix(ncol = 1)
+  
+  generator_test_loss <- combined %>% evaluate(
+    list(noise, sampled_labels),
+    list(trick, sampled_labels),
+    verbose = FALSE
+  )
+  
+  generator_train_loss <- apply(epoch_gen_loss, 2, mean)
+  
+  
+  # generate an epoch report on performance
+  row_fmt <- "\n%22s : loss %4.2f | %5.2f | %5.2f"
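+  # each row reports three numbers: the total loss, then the generation
+  # (binary crossentropy) and auxiliary (classification) losses, in the
+  # order the losses were listed in compile()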
+  cat(sprintf(
+    row_fmt, 
+    "generator (train)",
+    generator_train_loss[1],
+    generator_train_loss[2],
+    generator_train_loss[3]
+  ))
+  cat(sprintf(
+    row_fmt, 
+    "generator (test)",
+    generator_test_loss[1],
+    generator_test_loss[2],
+    generator_test_loss[3]
+  ))
+  
+  cat(sprintf(
+    row_fmt, 
+    "discriminator (train)",
+    discriminator_train_loss[1],
+    discriminator_train_loss[2],
+    discriminator_train_loss[3]
+  ))
+  cat(sprintf(
+    row_fmt, 
+    "discriminator (test)",
+    discriminator_test_loss[1],
+    discriminator_test_loss[2],
+    discriminator_test_loss[3]
+  ))
+  
+  cat("\n")
+  
+  # generate some digits to display
+  noise <- runif(10*latent_size, min = -1, max = 1) %>%
+    matrix(nrow = 10, ncol = latent_size)
+  
+  sampled_labels <- 0:9 %>%
+    matrix(ncol = 1)
+  
+  # get a batch to display
+  generated_images <- predict(
+    generator,    
+    list(noise, sampled_labels)
+  )
+  
+  img <- NULL
+  for(i in 1:10){
+    img <- cbind(img, generated_images[i,,,])
+  }
+  
+  ((img + 1)/2) %>% as.raster() %>%
+    plot()
+  
+}
+
+
+ + + +
+ + + +
+ + + diff --git a/website/articles/examples/mnist_antirectifier.R b/website/articles/examples/mnist_antirectifier.R new file mode 100644 index 000000000..c29821da7 --- /dev/null +++ b/website/articles/examples/mnist_antirectifier.R @@ -0,0 +1,121 @@ +#' Demonstrates how to write custom layers for Keras. +#' +#' We build a custom activation layer called 'Antirectifier', which modifies the +#' shape of the tensor that passes through it. We need to specify two methods: +#' `compute_output_shape` and `call`. +#' +#' Note that the same result can also be achieved via a Lambda layer. +#' + +#' ## Data Preparation + +library(keras) + +batch_size <- 128 +num_classes <- 10 +epochs <- 40 + +# the data, shuffled and split between train and test sets +mnist <- dataset_mnist() +x_train <- mnist$train$x +y_train <- mnist$train$y +x_test <- mnist$test$x +y_test <- mnist$test$y + +dim(x_train) <- c(nrow(x_train), 784) +dim(x_test) <- c(nrow(x_test), 784) + +x_train <- x_train / 255 +x_test <- x_test / 255 + +cat(nrow(x_train), 'train samples\n') +cat(nrow(x_test), 'test samples\n') + +# convert class vectors to binary class matrices +y_train <- to_categorical(y_train, num_classes) +y_test <- to_categorical(y_test, num_classes) + +#' ## Antirectifier Layer +#' +#' This is the combination of a sample-wise L2 normalization +#' with the concatenation of the positive part of the input with the negative +#' part of the input. The result is a tensor of samples that are twice as large +#' as the input samples. +#' +#' It can be used in place of a ReLU. +#' +#' Input shape: 2D tensor of shape (samples, n) +#' +#' Output shape: 2D tensor of shape (samples, 2*n) +#' +#' When applying ReLU, assuming that the distribution of the previous output is +#' approximately centered around 0., you are discarding half of your input. This +#' is inefficient. +#' +#' Antirectifier allows to return all-positive outputs like ReLU, without +#' discarding any data. +#' +#' Tests on MNIST show that Antirectifier allows to train networks with twice +#' less parameters yet with comparable classification accuracy as an equivalent +#' ReLU-based network. + +# Because our custom layer is written with primitives from the Keras backend +# (`K`), our code can run both on TensorFlow and Theano. 
+K <- backend() + +# Custom layer class +AntirectifierLayer <- R6::R6Class("KerasLayer", + + inherit = KerasLayer, + + public = list( + + call = function(x, mask = NULL) { + x <- x - K$mean(x, axis = 1L, keepdims = TRUE) + x <- K$l2_normalize(x, axis = 1L) + pos <- K$relu(x) + neg <- K$relu(-x) + K$concatenate(c(pos, neg), axis = 1L) + + }, + + compute_output_shape = function(input_shape) { + input_shape[[2]] <- input_shape[[2]] * 2 + tuple(input_shape) + } + ) +) + +# create layer wrapper function +layer_antirectifier <- function(object) { + create_layer(AntirectifierLayer, object) +} + + +#' ## Define and Train Model + +model <- keras_model_sequential() +model %>% + layer_dense(units = 256, input_shape = c(784)) %>% + layer_antirectifier() %>% + layer_dropout(rate = 0.1) %>% + layer_dense(units = 256) %>% + layer_antirectifier() %>% + layer_dropout(rate = 0.1) %>% + layer_dense(units = 10, activation = 'softmax') + +# compile the model +model %>% compile( + loss = 'categorical_crossentropy', + optimizer = 'rmsprop', + metrics = c('accuracy') +) + +# train the model +model %>% fit(x_train, y_train, + batch_size = batch_size, + epochs = epochs, + verbose = 1, + validation_data= list(x_test, y_test) +) + diff --git a/website/articles/examples/mnist_antirectifier.html b/website/articles/examples/mnist_antirectifier.html new file mode 100644 index 000000000..59b3a8475 --- /dev/null +++ b/website/articles/examples/mnist_antirectifier.html @@ -0,0 +1,247 @@ + + + + + + + +mnist_antirectifier • keras + + + + + + + +
+
+ + + +
+
+ + + + +
+ +

Demonstrates how to write custom layers for Keras.

+

We build a custom activation layer called ‘Antirectifier’, which modifies the shape of the tensor that passes through it. We need to specify two methods: compute_output_shape and call.

+

Note that the same result can also be achieved via a Lambda layer.

+
+

+Data Preparation

+
library(keras)
+
+batch_size <- 128
+num_classes <- 10
+epochs <- 40
+
+# the data, shuffled and split between train and test sets
+mnist <- dataset_mnist()
+x_train <- mnist$train$x
+y_train <- mnist$train$y
+x_test <- mnist$test$x
+y_test <- mnist$test$y
+
+dim(x_train) <- c(nrow(x_train), 784)
+dim(x_test) <- c(nrow(x_test), 784)
+
+x_train <- x_train / 255
+x_test <- x_test / 255
+
+cat(nrow(x_train), 'train samples\n')
+cat(nrow(x_test), 'test samples\n')
+
+# convert class vectors to binary class matrices
+y_train <- to_categorical(y_train, num_classes)
+y_test <- to_categorical(y_test, num_classes)
+
+
+

+Antirectifier Layer

+

This layer combines a sample-wise L2 normalization with the concatenation of the positive part of the input and the negative part of the input. The result is a tensor of samples that are twice as large as the input samples.

+

It can be used in place of a ReLU.

+

Input shape: 2D tensor of shape (samples, n)

+

Output shape: 2D tensor of shape (samples, 2*n)

+

When applying ReLU, assuming that the distribution of the previous layer's output is approximately centered around 0, you are discarding half of your input. This is inefficient.

+

Antirectifier returns all-positive outputs, like ReLU, without discarding any data.

+

Tests on MNIST show that Antirectifier can train networks with half as many parameters, yet with classification accuracy comparable to that of an equivalent ReLU-based network.

+
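To make the transformation concrete, here is a minimal plain-R sketch of what the layer computes, on a made-up 2 x 3 matrix (an illustration only, not part of the layer definition below):
# plain-R illustration of the antirectifier transformation (made-up input)
+x <- matrix(c(-1, 2, 0.5, 3, -2, 1), nrow = 2, byrow = TRUE)
+x <- x - rowMeans(x)             # center each sample (row)
+x <- x / sqrt(rowSums(x^2))      # sample-wise L2 normalization
+cbind(pmax(x, 0), pmax(-x, 0))   # concatenate positive and negative parts: 2 x 6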
# Because our custom layer is written with primitives from the Keras backend
+# (`K`), our code can run both on TensorFlow and Theano.
+K <- backend()
+
+# Custom layer class
+AntirectifierLayer <- R6::R6Class("KerasLayer",
+  
+  inherit = KerasLayer,
+                           
+  public = list(
+   
+    call = function(x, mask = NULL) {
+      x <- x - K$mean(x, axis = 1L, keepdims = TRUE)
+      x <- K$l2_normalize(x, axis = 1L)
+      pos <- K$relu(x)
+      neg <- K$relu(-x)
+      K$concatenate(c(pos, neg), axis = 1L)
+      
+    },
+     
+    compute_output_shape = function(input_shape) {
+      input_shape[[2]] <- input_shape[[2]] * 2 
+      tuple(input_shape)
+    }
+  )
+)
+
+# create layer wrapper function
+layer_antirectifier <- function(object) {
+  create_layer(AntirectifierLayer, object)
+}
+
+
+

+Define and Train Model

+
model <- keras_model_sequential()
+model %>% 
+  layer_dense(units = 256, input_shape = c(784)) %>% 
+  layer_antirectifier() %>% 
+  layer_dropout(rate = 0.1) %>% 
+  layer_dense(units = 256) %>%
+  layer_antirectifier() %>% 
+  layer_dropout(rate = 0.1) %>%
+  layer_dense(units = 10, activation = 'softmax')
+
+# compile the model
+model %>% compile(
+  loss = 'categorical_crossentropy',
+  optimizer = 'rmsprop',
+  metrics = c('accuracy')
+)
+
+# train the model
+model %>% fit(x_train, y_train,
+  batch_size = batch_size,
+  epochs = epochs,
+  verbose = 1,
+  validation_data= list(x_test, y_test)
+)
+
+
+
+ + + +
+ + + +
+ + + diff --git a/website/articles/examples/mnist_cnn.R b/website/articles/examples/mnist_cnn.R new file mode 100644 index 000000000..fc66693a6 --- /dev/null +++ b/website/articles/examples/mnist_cnn.R @@ -0,0 +1,75 @@ + +#' Trains a simple convnet on the MNIST dataset. +#' +#' Gets to 99.25% test accuracy after 12 epochs +#' (there is still a lot of margin for parameter tuning). +#' 16 seconds per epoch on a GRID K520 GPU. + +library(keras) + +batch_size <- 128 +num_classes <- 10 +epochs <- 12 + +# input image dimensions +img_rows <- 28 +img_cols <- 28 + +# the data, shuffled and split between train and test sets +mnist <- dataset_mnist() +x_train <- mnist$train$x +y_train <- mnist$train$y +x_test <- mnist$test$x +y_test <- mnist$test$y + +dim(x_train) <- c(nrow(x_train), img_rows, img_cols, 1) +dim(x_test) <- c(nrow(x_test), img_rows, img_cols, 1) +input_shape <- c(img_rows, img_cols, 1) + +x_train <- x_train / 255 +x_test <- x_test / 255 + +cat('x_train_shape:', dim(x_train), '\n') +cat(nrow(x_train), 'train samples\n') +cat(nrow(x_test), 'test samples\n') + +# convert class vectors to binary class matrices +y_train <- to_categorical(y_train, num_classes) +y_test <- to_categorical(y_test, num_classes) + +# define model +model <- keras_model_sequential() +model %>% + layer_conv_2d(filters = 32, kernel_size = c(3,3), activation = 'relu', + input_shape = input_shape) %>% + layer_conv_2d(filters = 64, kernel_size = c(3,3), activation = 'relu') %>% + layer_max_pooling_2d(pool_size = c(2, 2)) %>% + layer_dropout(rate = 0.25) %>% + layer_flatten() %>% + layer_dense(units = 128, activation = 'relu') %>% + layer_dropout(rate = 0.5) %>% + layer_dense(units = num_classes, activation = 'softmax') + +# compile model +model %>% compile( + loss = loss_categorical_crossentropy, + optimizer = optimizer_adadelta(), + metrics = c('accuracy') +) + +# train and evaluate +model %>% fit( + x_train, y_train, + batch_size = batch_size, + epochs = epochs, + verbose = 1, + validation_data = list(x_test, y_test) +) +scores <- model %>% evaluate( + x_test, y_test, verbose = 0 +) + +cat('Test loss:', scores[[1]], '\n') +cat('Test accuracy:', scores[[2]], '\n') + + diff --git a/website/articles/examples/mnist_cnn.html b/website/articles/examples/mnist_cnn.html new file mode 100644 index 000000000..5b734efc1 --- /dev/null +++ b/website/articles/examples/mnist_cnn.html @@ -0,0 +1,204 @@ + + + + + + + +mnist_cnn • keras + + + + + + + +
+
+ + + +
+
+ + + + +
+ +

Trains a simple convnet on the MNIST dataset.

+

Gets to 99.25% test accuracy after 12 epochs (there is still a lot of margin for parameter tuning). 16 seconds per epoch on a GRID K520 GPU.

+
library(keras)
+
+batch_size <- 128
+num_classes <- 10
+epochs <- 12
+
+# input image dimensions
+img_rows <- 28
+img_cols <- 28
+
+# the data, shuffled and split between train and test sets
+mnist <- dataset_mnist()
+x_train <- mnist$train$x
+y_train <- mnist$train$y
+x_test <- mnist$test$x
+y_test <- mnist$test$y
+
+dim(x_train) <- c(nrow(x_train), img_rows, img_cols, 1) 
+dim(x_test) <- c(nrow(x_test), img_rows, img_cols, 1)
+input_shape <- c(img_rows, img_cols, 1)
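+# (shape note: c(28, 28, 1) puts channels last, the default image data format)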
+
+x_train <- x_train / 255
+x_test <- x_test / 255
+
+cat('x_train_shape:', dim(x_train), '\n')
+cat(nrow(x_train), 'train samples\n')
+cat(nrow(x_test), 'test samples\n')
+
+# convert class vectors to binary class matrices
+y_train <- to_categorical(y_train, num_classes)
+y_test <- to_categorical(y_test, num_classes)
+
+# define model
+model <- keras_model_sequential()
+model %>%
+  layer_conv_2d(filters = 32, kernel_size = c(3,3), activation = 'relu',
+                input_shape = input_shape) %>% 
+  layer_conv_2d(filters = 64, kernel_size = c(3,3), activation = 'relu') %>% 
+  layer_max_pooling_2d(pool_size = c(2, 2)) %>% 
+  layer_dropout(rate = 0.25) %>% 
+  layer_flatten() %>% 
+  layer_dense(units = 128, activation = 'relu') %>% 
+  layer_dropout(rate = 0.5) %>% 
+  layer_dense(units = num_classes, activation = 'softmax')
+
+# compile model
+model %>% compile(
+  loss = loss_categorical_crossentropy,
+  optimizer = optimizer_adadelta(),
+  metrics = c('accuracy')
+)
+
+# train and evaluate
+model %>% fit(
+  x_train, y_train,
+  batch_size = batch_size,
+  epochs = epochs,
+  verbose = 1,
+  validation_data = list(x_test, y_test)
+)
+scores <- model %>% evaluate(
+  x_test, y_test, verbose = 0
+)
+
+cat('Test loss:', scores[[1]], '\n')
+cat('Test accuracy:', scores[[2]], '\n')
+
+
+ + + +
+ + + +
+ + + diff --git a/website/articles/examples/mnist_hierarchical_rnn.R b/website/articles/examples/mnist_hierarchical_rnn.R new file mode 100644 index 000000000..b5b6078b4 --- /dev/null +++ b/website/articles/examples/mnist_hierarchical_rnn.R @@ -0,0 +1,103 @@ +#' This is an example of using Hierarchical RNN (HRNN) to classify MNIST digits. +#' +#' HRNNs can learn across multiple levels of temporal hiearchy over a complex sequence. +#' Usually, the first recurrent layer of an HRNN encodes a sentence (e.g. of word vectors) +#' into a sentence vector. The second recurrent layer then encodes a sequence of +#' such vectors (encoded by the first layer) into a document vector. This +#' document vector is considered to preserve both the word-level and +#' sentence-level structure of the context. +#' +#' References: +#' +#' - [A Hierarchical Neural Autoencoder for Paragraphs and Documents](https://arxiv.org/abs/1506.01057) +#' Encodes paragraphs and documents with HRNN. +#' Results have shown that HRNN outperforms standard +#' RNNs and may play some role in more sophisticated generation tasks like +#' summarization or question answering. +#' - [Hierarchical recurrent neural network for skeleton based action recognition](http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7298714) +#' Achieved state-of-the-art results on skeleton based action recognition with 3 levels +#' of bidirectional HRNN combined with fully connected layers. +#' +#' In the below MNIST example the first LSTM layer first encodes every +#' column of pixels of shape (28, 1) to a column vector of shape (128,). The second LSTM +#' layer encodes then these 28 column vectors of shape (28, 128) to a image vector +#' representing the whole image. A final Dense layer is added for prediction. +#' +#' After 5 epochs: train acc: 0.9858, val acc: 0.9864 +#' + +library(keras) + +# Training parameters. +batch_size <- 32 +num_classes <- 10 +epochs <- 5 + +# Embedding dimensions. +row_hidden <- 128 +col_hidden <- 128 + +# the data, shuffled and split between train and test sets +mnist <- dataset_mnist() +x_train <- mnist$train$x +y_train <- mnist$train$y +x_test <- mnist$test$x +y_test <- mnist$test$y + +# Reshapes data to 4D for Hierarchical RNN. +dim(x_train) <- c(nrow(x_train), 28, 28, 1) +dim(x_test) <- c(nrow(x_test), 28, 28, 1) +x_train <- x_train / 255 +x_test <- x_test / 255 + +dim_x_train <- dim(x_train) +cat('x_train_shape:', dim_x_train) +cat(nrow(x_train), 'train samples') +cat(nrow(x_test), 'test samples') + +# Converts class vectors to binary class matrices +y_train <- to_categorical(y_train, num_classes) +y_test <- to_categorical(y_test, num_classes) + +row <- dim_x_train[[2]] +col <- dim_x_train[[3]] +pixel <- dim_x_train[[4]] + +# Model input (4D) +input <- layer_input(shape = c(row, col, pixel)) + +# Encodes a row of pixels using TimeDistributed Wrapper +encoded_rows <- input %>% time_distributed(layer_lstm(units = row_hidden)) + +# Encodes columns of encoded rows. 
+encoded_columns <- encoded_rows %>% layer_lstm(units = col_hidden) + +# Model output +prediction <- encoded_columns %>% + layer_dense(units = num_classes, activation = 'softmax') + +# Define and compile model +model <- keras_model(input, prediction) +model %>% compile( + loss = 'categorical_crossentropy', + optimizer = 'rmsprop', + metrics = c('accuracy') +) + +# Training +model %>% fit( + x_train, y_train, + batch_size = batch_size, + epochs = epochs, + verbose = 1, + validation_data = list(x_test, y_test) +) + +# Evaluation +scores <- model %>% evaluate(x_test, y_test, verbose = 0) +cat('Test loss:', scores[[1]], '\n') +cat('Test accuracy:', scores[[2]], '\n') + + + + diff --git a/website/articles/examples/mnist_hierarchical_rnn.html b/website/articles/examples/mnist_hierarchical_rnn.html new file mode 100644 index 000000000..8bc0c3c05 --- /dev/null +++ b/website/articles/examples/mnist_hierarchical_rnn.html @@ -0,0 +1,218 @@ + + + + + + + +mnist_hierarchical_rnn • keras + + + + + + + +
+
+ + + +
+
+ + + + +
+ +

This is an example of using Hierarchical RNN (HRNN) to classify MNIST digits.

+

HRNNs can learn across multiple levels of temporal hierarchy over a complex sequence. Usually, the first recurrent layer of an HRNN encodes a sentence (e.g. of word vectors) into a sentence vector. The second recurrent layer then encodes a sequence of such vectors (encoded by the first layer) into a document vector. This document vector is considered to preserve both the word-level and sentence-level structure of the context.

+

References:

+ +

In the MNIST example below, the first LSTM layer encodes every column of pixels of shape (28, 1) into a column vector of shape (128,). The second LSTM layer then encodes these 28 column vectors of shape (28, 128) into an image vector representing the whole image. A final dense layer is added for prediction.

+

After 5 epochs: train acc: 0.9858, val acc: 0.9864

+
library(keras)
+
+# Training parameters.
+batch_size <- 32
+num_classes <- 10
+epochs <- 5
+
+# Embedding dimensions.
+row_hidden <- 128
+col_hidden <- 128
+
+# the data, shuffled and split between train and test sets
+mnist <- dataset_mnist()
+x_train <- mnist$train$x
+y_train <- mnist$train$y
+x_test <- mnist$test$x
+y_test <- mnist$test$y
+
+# Reshapes data to 4D for Hierarchical RNN.
+dim(x_train) <- c(nrow(x_train), 28, 28, 1) 
+dim(x_test) <- c(nrow(x_test), 28, 28, 1)
+x_train <- x_train / 255
+x_test <- x_test / 255
+
+dim_x_train <- dim(x_train)
+cat('x_train_shape:', dim_x_train)
+cat(nrow(x_train), 'train samples')
+cat(nrow(x_test), 'test samples')
+
+# Converts class vectors to binary class matrices
+y_train <- to_categorical(y_train, num_classes)
+y_test <- to_categorical(y_test, num_classes)
+
+row <- dim_x_train[[2]]
+col <- dim_x_train[[3]]
+pixel <- dim_x_train[[4]]
+
+# Model input (4D)
+input <- layer_input(shape = c(row, col, pixel))
+
+# Encodes a row of pixels using TimeDistributed Wrapper
+encoded_rows <- input %>% time_distributed(layer_lstm(units = row_hidden))
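+# (per the description above, encoded_rows has shape (batch, 28, 128):
+# each of the 28 pixel strips is encoded to a 128-dimensional vector)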
+
+# Encodes columns of encoded rows.
+encoded_columns <- encoded_rows %>% layer_lstm(units = col_hidden)
+
+# Model output
+prediction <- encoded_columns %>%
+  layer_dense(units = num_classes, activation = 'softmax')
+
+# Define and compile model
+model <- keras_model(input, prediction)
+model %>% compile(
+  loss = 'categorical_crossentropy',
+  optimizer = 'rmsprop',
+  metrics = c('accuracy')
+)
+
+# Training
+model %>% fit(
+  x_train, y_train,
+  batch_size = batch_size,
+  epochs = epochs,
+  verbose = 1,
+  validation_data = list(x_test, y_test)
+)
+
+# Evaluation
+scores <- model %>% evaluate(x_test, y_test, verbose = 0)
+cat('Test loss:', scores[[1]], '\n')
+cat('Test accuracy:', scores[[2]], '\n')
+
+
+ + + +
+ + + +
+ + + diff --git a/website/articles/examples/mnist_hierarchical_rnn.py b/website/articles/examples/mnist_hierarchical_rnn.py new file mode 100755 index 000000000..06de4171a --- /dev/null +++ b/website/articles/examples/mnist_hierarchical_rnn.py @@ -0,0 +1,91 @@ +"""This is an example of using Hierarchical RNN (HRNN) to classify MNIST digits. + +HRNNs can learn across multiple levels of temporal hiearchy over a complex sequence. +Usually, the first recurrent layer of an HRNN encodes a sentence (e.g. of word vectors) +into a sentence vector. The second recurrent layer then encodes a sequence of +such vectors (encoded by the first layer) into a document vector. This +document vector is considered to preserve both the word-level and +sentence-level structure of the context. + +# References + - [A Hierarchical Neural Autoencoder for Paragraphs and Documents](https://arxiv.org/abs/1506.01057) + Encodes paragraphs and documents with HRNN. + Results have shown that HRNN outperforms standard + RNNs and may play some role in more sophisticated generation tasks like + summarization or question answering. + - [Hierarchical recurrent neural network for skeleton based action recognition](http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7298714) + Achieved state-of-the-art results on skeleton based action recognition with 3 levels + of bidirectional HRNN combined with fully connected layers. + +In the below MNIST example the first LSTM layer first encodes every +column of pixels of shape (28, 1) to a column vector of shape (128,). The second LSTM +layer encodes then these 28 column vectors of shape (28, 128) to a image vector +representing the whole image. A final Dense layer is added for prediction. + +After 5 epochs: train acc: 0.9858, val acc: 0.9864 +""" +from __future__ import print_function + +import tensorflow.contrib.keras.api.keras as keras +from tensorflow.contrib.keras.api.keras.datasets import mnist +from tensorflow.contrib.keras.api.keras.models import Model +from tensorflow.contrib.keras.api.keras.layers import Input, Dense +from tensorflow.contrib.keras.python.keras.layers import TimeDistributed +from tensorflow.contrib.keras.api.keras.layers import LSTM + +# Training parameters. +batch_size = 32 +num_classes = 10 +epochs = 5 + +# Embedding dimensions. +row_hidden = 128 +col_hidden = 128 + +# The data, shuffled and split between train and test sets. +(x_train, y_train), (x_test, y_test) = mnist.load_data() + +# Reshapes data to 4D for Hierarchical RNN. +x_train = x_train.reshape(x_train.shape[0], 28, 28, 1) +x_test = x_test.reshape(x_test.shape[0], 28, 28, 1) +x_train = x_train.astype('float32') +x_test = x_test.astype('float32') +x_train /= 255 +x_test /= 255 +print('x_train shape:', x_train.shape) +print(x_train.shape[0], 'train samples') +print(x_test.shape[0], 'test samples') + +# Converts class vectors to binary class matrices. +y_train = keras.utils.to_categorical(y_train, num_classes) +y_test = keras.utils.to_categorical(y_test, num_classes) + +row, col, pixel = x_train.shape[1:] + +# 4D input. +x = Input(shape=(row, col, pixel)) + +# Encodes a row of pixels using TimeDistributed Wrapper. +encoded_rows = TimeDistributed(LSTM(row_hidden))(x) + +# Encodes columns of encoded rows. +encoded_columns = LSTM(col_hidden)(encoded_rows) + +# Final predictions and model. +prediction = Dense(num_classes, activation='softmax')(encoded_columns) +model = Model(x, prediction) +model.compile(loss='categorical_crossentropy', + optimizer='rmsprop', + metrics=['accuracy']) + +# Training. 
+model.fit(x_train, y_train, + batch_size=batch_size, + epochs=epochs, + verbose=1, + validation_data=(x_test, y_test)) + +# Evaluation. +scores = model.evaluate(x_test, y_test, verbose=0) +print('Test loss:', scores[0]) +print('Test accuracy:', scores[1]) diff --git a/website/articles/examples/mnist_irnn.R b/website/articles/examples/mnist_irnn.R new file mode 100644 index 000000000..c7a5a8176 --- /dev/null +++ b/website/articles/examples/mnist_irnn.R @@ -0,0 +1,85 @@ +#' This is a reproduction of the IRNN experiment +#' with pixel-by-pixel sequential MNIST in +#' "A Simple Way to Initialize Recurrent Networks of Rectified Linear Units" +#' by Quoc V. Le, Navdeep Jaitly, Geoffrey E. Hinton +#' +#' arxiv:1504.00941v2 [cs.NE] 7 Apr 2015 +#' http://arxiv.org/pdf/1504.00941v2.pdf +#' +#' Optimizer is replaced with RMSprop which yields more stable and steady +#' improvement. +#' +#' Reaches 0.93 train/test accuracy after 900 epochs +#' (which roughly corresponds to 1687500 steps in the original paper.) + +library(keras) + +batch_size <- 32 +num_classes <- 10 +epochs <- 200 +hidden_units <- 100 + +img_rows <- 28 +img_cols <- 28 + +learning_rate <- 1e-6 +clip_norm <- 1.0 + +# the data, shuffled and split between train and test sets +mnist <- dataset_mnist() +x_train <- mnist$train$x +y_train <- mnist$train$y +x_test <- mnist$test$x +y_test <- mnist$test$y + +dim(x_train) <- c(nrow(x_train), img_rows * img_cols, 1) +dim(x_test) <- c(nrow(x_test), img_rows * img_cols, 1) +input_shape <- c(img_rows, img_cols, 1) + +x_train <- x_train / 255 +x_test <- x_test / 255 + +cat('x_train_shape:', dim(x_train), '\n') +cat(nrow(x_train), 'train samples\n') +cat(nrow(x_test), 'test samples\n') + +# convert class vectors to binary class matrices +y_train <- to_categorical(y_train, num_classes) +y_test <- to_categorical(y_test, num_classes) + +cat("Evaliate IRNN...\n") +model <- keras_model_sequential() +model %>% + layer_simple_rnn(units = hidden_units, + kernel_initializer = initializer_random_normal(stddev = 0.01), + recurrent_initializer = initializer_identity(gain = 1.0), + activation = 'relu', + input_shape = dim(x_train)[-1]) %>% + layer_dense(units = num_classes) %>% + layer_activation(activation = 'softmax') + +model %>% compile( + loss = 'categorical_crossentropy', + optimizer = optimizer_rmsprop(lr = learning_rate), + metrics = c('accuracy') +) + +model %>% fit( + x_train, y_train, + batch_size = batch_size, + epochs = epochs, + verbose = 1, + validation_data = list(x_test, y_test) +) + +scores <- model %>% evaluate(x_test, y_test, verbose = 0) +cat('IRNN test score:', scores[[1]], '\n') +cat('IRNN test accuracy:', scores[[2]], '\n') + + + + + + + + diff --git a/website/articles/examples/mnist_irnn.html b/website/articles/examples/mnist_irnn.html new file mode 100644 index 000000000..81b1e25e4 --- /dev/null +++ b/website/articles/examples/mnist_irnn.html @@ -0,0 +1,203 @@ + + + + + + + +mnist_irnn • keras + + + + + + + +
+
+ + + +
+
+ + + + +
+ +

This is a reproduction of the IRNN experiment with pixel-by-pixel sequential MNIST in “A Simple Way to Initialize Recurrent Networks of Rectified Linear Units” by Quoc V. Le, Navdeep Jaitly, and Geoffrey E. Hinton

+

arxiv:1504.00941v2 [cs.NE] 7 Apr 2015 http://arxiv.org/pdf/1504.00941v2.pdf

+

The optimizer is replaced with RMSprop, which yields more stable and steady improvement.

+

Reaches 0.93 train/test accuracy after 900 epochs (which roughly corresponds to 1,687,500 steps in the original paper).

+
library(keras)
+
+batch_size <- 32
+num_classes <- 10
+epochs <- 200
+hidden_units <- 100
+
+img_rows <- 28
+img_cols <- 28
+
+learning_rate <- 1e-6
+clip_norm <- 1.0
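+# note: clip_norm mirrors the paper's setting but is not applied below; to
+# actually clip gradients, pass clipnorm = clip_norm to optimizer_rmsprop()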
+
+# the data, shuffled and split between train and test sets
+mnist <- dataset_mnist()
+x_train <- mnist$train$x
+y_train <- mnist$train$y
+x_test <- mnist$test$x
+y_test <- mnist$test$y
+
+dim(x_train) <- c(nrow(x_train), img_rows * img_cols, 1)
+dim(x_test) <- c(nrow(x_test), img_rows * img_cols, 1)
+input_shape <- c(img_rows, img_cols, 1)
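+# (input_shape is defined for reference but unused; the model derives its
+# input shape from dim(x_train)[-1] below)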
+
+x_train <- x_train / 255
+x_test <- x_test / 255
+
+cat('x_train_shape:', dim(x_train), '\n')
+cat(nrow(x_train), 'train samples\n')
+cat(nrow(x_test), 'test samples\n')
+
+# convert class vectors to binary class matrices
+y_train <- to_categorical(y_train, num_classes)
+y_test <- to_categorical(y_test, num_classes)
+
+cat("Evaliate IRNN...\n")
+model <- keras_model_sequential()
+model %>% 
+  layer_simple_rnn(units = hidden_units,
+                   kernel_initializer = initializer_random_normal(stddev = 0.01),
+                   recurrent_initializer = initializer_identity(gain = 1.0),
+                   activation = 'relu',
+                   input_shape = dim(x_train)[-1]) %>% 
+  layer_dense(units = num_classes) %>% 
+  layer_activation(activation = 'softmax')
+
+model %>% compile(
+  loss = 'categorical_crossentropy',
+  optimizer = optimizer_rmsprop(lr = learning_rate),
+  metrics = c('accuracy')
+)
+  
+model %>% fit(
+  x_train, y_train,
+  batch_size = batch_size,
+  epochs = epochs,
+  verbose = 1,
+  validation_data = list(x_test, y_test)
+)
+  
+scores <- model %>% evaluate(x_test, y_test, verbose = 0)
+cat('IRNN test score:', scores[[1]], '\n')
+cat('IRNN test accuracy:', scores[[2]], '\n')
+
+
+ + + +
+ + + +
+ + + diff --git a/website/articles/examples/mnist_mlp.R b/website/articles/examples/mnist_mlp.R new file mode 100644 index 000000000..633147fb3 --- /dev/null +++ b/website/articles/examples/mnist_mlp.R @@ -0,0 +1,67 @@ +#' Trains a simple deep NN on the MNIST dataset. +#' +#' Gets to 98.40% test accuracy after 20 epochs +#' (there is *a lot* of margin for parameter tuning). +#' 2 seconds per epoch on a K520 GPU. +#' + +library(keras) + +batch_size <- 128 +num_classes <- 10 +epochs <- 30 + +# the data, shuffled and split between train and test sets +mnist <- dataset_mnist() +x_train <- mnist$train$x +y_train <- mnist$train$y +x_test <- mnist$test$x +y_test <- mnist$test$y + +dim(x_train) <- c(nrow(x_train), 784) +dim(x_test) <- c(nrow(x_test), 784) + +x_train <- x_train / 255 +x_test <- x_test / 255 + +cat(nrow(x_train), 'train samples\n') +cat(nrow(x_test), 'test samples\n') + +# convert class vectors to binary class matrices +y_train <- to_categorical(y_train, num_classes) +y_test <- to_categorical(y_test, num_classes) + +model <- keras_model_sequential() +model %>% + layer_dense(units = 256, activation = 'relu', input_shape = c(784)) %>% + layer_dropout(rate = 0.4) %>% + layer_dense(units = 128, activation = 'relu') %>% + layer_dropout(rate = 0.3) %>% + layer_dense(units = 10, activation = 'softmax') + +summary(model) + +model %>% compile( + loss = 'categorical_crossentropy', + optimizer = optimizer_rmsprop(), + metrics = c('accuracy') +) + +history <- model %>% fit( + x_train, y_train, + batch_size = batch_size, + epochs = epochs, + verbose = 1, + validation_split = 0.2 +) + +plot(history) + +score <- model %>% evaluate( + x_test, y_test, + verbose = 0 +) + +cat('Test loss:', score[[1]], '\n') +cat('Test accuracy:', score[[2]], '\n') + diff --git a/website/articles/examples/mnist_mlp.html b/website/articles/examples/mnist_mlp.html new file mode 100644 index 000000000..a184b853d --- /dev/null +++ b/website/articles/examples/mnist_mlp.html @@ -0,0 +1,197 @@ + + + + + + + +mnist_mlp • keras + + + + + + + +
+
+ + + +
+
+ + + + +
+ +

Trains a simple deep NN on the MNIST dataset.

+

Gets to 98.40% test accuracy after 20 epochs (there is a lot of margin for parameter tuning). 2 seconds per epoch on a K520 GPU.

+
library(keras)
+
+batch_size <- 128
+num_classes <- 10
+epochs <- 30
+
+# the data, shuffled and split between train and test sets
+mnist <- dataset_mnist()
+x_train <- mnist$train$x
+y_train <- mnist$train$y
+x_test <- mnist$test$x
+y_test <- mnist$test$y
+
+dim(x_train) <- c(nrow(x_train), 784)
+dim(x_test) <- c(nrow(x_test), 784)
+
+x_train <- x_train / 255
+x_test <- x_test / 255
+
+cat(nrow(x_train), 'train samples\n')
+cat(nrow(x_test), 'test samples\n')
+
+# convert class vectors to binary class matrices
+y_train <- to_categorical(y_train, num_classes)
+y_test <- to_categorical(y_test, num_classes)
+
+model <- keras_model_sequential()
+model %>% 
+  layer_dense(units = 256, activation = 'relu', input_shape = c(784)) %>% 
+  layer_dropout(rate = 0.4) %>% 
+  layer_dense(units = 128, activation = 'relu') %>%
+  layer_dropout(rate = 0.3) %>%
+  layer_dense(units = 10, activation = 'softmax')
+
+summary(model)
+
+model %>% compile(
+  loss = 'categorical_crossentropy',
+  optimizer = optimizer_rmsprop(),
+  metrics = c('accuracy')
+)
+
+history <- model %>% fit(
+  x_train, y_train,
+  batch_size = batch_size,
+  epochs = epochs,
+  verbose = 1,
+  validation_split = 0.2
+)
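+# note: validation_split = 0.2 holds out the *last* 20% of x_train (not a
+# random sample) for validation at the end of each epoch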
+
+plot(history)
+  
+score <- model %>% evaluate(
+  x_test, y_test,
+  verbose = 0
+)
+  
+cat('Test loss:', score[[1]], '\n')
+cat('Test accuracy:', score[[2]], '\n')
+
+
+ + + +
+ + + +
+ + + diff --git a/website/articles/examples/mnist_net2net.R b/website/articles/examples/mnist_net2net.R new file mode 100644 index 000000000..0ab13681d --- /dev/null +++ b/website/articles/examples/mnist_net2net.R @@ -0,0 +1 @@ +library(keras) diff --git a/website/articles/examples/mnist_net2net.html b/website/articles/examples/mnist_net2net.html new file mode 100644 index 000000000..870d7f119 --- /dev/null +++ b/website/articles/examples/mnist_net2net.html @@ -0,0 +1,137 @@ + + + + + + + +mnist_net2net • keras + + + + + + + +
+
+ + + +
+
+ + + + + +
+ + + +
+ + + +
+ + + diff --git a/website/articles/examples/mnist_net2net.py b/website/articles/examples/mnist_net2net.py new file mode 100755 index 000000000..260f922f4 --- /dev/null +++ b/website/articles/examples/mnist_net2net.py @@ -0,0 +1,389 @@ +'''This is an implementation of Net2Net experiment with MNIST in +'Net2Net: Accelerating Learning via Knowledge Transfer' +by Tianqi Chen, Ian Goodfellow, and Jonathon Shlens + +arXiv:1511.05641v4 [cs.LG] 23 Apr 2016 +http://arxiv.org/abs/1511.05641 + +Notes +- What: + + Net2Net is a group of methods to transfer knowledge from a teacher neural + net to a student net,so that the student net can be trained faster than + from scratch. + + The paper discussed two specific methods of Net2Net, i.e. Net2WiderNet + and Net2DeeperNet. + + Net2WiderNet replaces a model with an equivalent wider model that has + more units in each hidden layer. + + Net2DeeperNet replaces a model with an equivalent deeper model. + + Both are based on the idea of 'function-preserving transformations of + neural nets'. +- Why: + + Enable fast exploration of multiple neural nets in experimentation and + design process,by creating a series of wider and deeper models with + transferable knowledge. + + Enable 'lifelong learning system' by gradually adjusting model complexity + to data availability,and reusing transferable knowledge. + +Experiments +- Teacher model: a basic CNN model trained on MNIST for 3 epochs. +- Net2WiderNet experiment: + + Student model has a wider Conv2D layer and a wider FC layer. + + Comparison of 'random-padding' vs 'net2wider' weight initialization. + + With both methods, student model should immediately perform as well as + teacher model, but 'net2wider' is slightly better. +- Net2DeeperNet experiment: + + Student model has an extra Conv2D layer and an extra FC layer. + + Comparison of 'random-init' vs 'net2deeper' weight initialization. + + Starting performance of 'net2deeper' is better than 'random-init'. +- Hyper-parameters: + + SGD with momentum=0.9 is used for training teacher and student models. + + Learning rate adjustment: it's suggested to reduce learning rate + to 1/10 for student model. + + Addition of noise in 'net2wider' is used to break weight symmetry + and thus enable full capacity of student models. It is optional + when a Dropout layer is used. + +Results +- Tested with 'Theano' backend and 'channels_first' image_data_format. +- Running on GPU GeForce GTX 980M +- Performance Comparisons - validation loss values during first 3 epochs: +(1) teacher_model: 0.075 0.041 0.041 +(2) wider_random_pad: 0.036 0.034 0.032 +(3) wider_net2wider: 0.032 0.030 0.030 +(4) deeper_random_init: 0.061 0.043 0.041 +(5) deeper_net2deeper: 0.032 0.031 0.029 +''' + +from __future__ import print_function +from six.moves import xrange +import numpy as np +import keras +from keras.models import Sequential +from keras.layers import Conv2D, MaxPooling2D, Dense, Flatten +from keras.optimizers import SGD +from keras.datasets import mnist + +if keras.backend.image_data_format() == 'channels_first': + input_shape = (1, 28, 28) # image shape +else: + input_shape = (28, 28, 1) # image shape +num_class = 10 # number of class + + +# load and pre-process data +def preprocess_input(x): + return x.reshape((-1, ) + input_shape) / 255. 
+ + +def preprocess_output(y): + return keras.utils.to_categorical(y) + +(train_x, train_y), (validation_x, validation_y) = mnist.load_data() +train_x, validation_x = map(preprocess_input, [train_x, validation_x]) +train_y, validation_y = map(preprocess_output, [train_y, validation_y]) +print('Loading MNIST data...') +print('train_x shape:', train_x.shape, 'train_y shape:', train_y.shape) +print('validation_x shape:', validation_x.shape, + 'validation_y shape', validation_y.shape) + + +# knowledge transfer algorithms +def wider2net_conv2d(teacher_w1, teacher_b1, teacher_w2, new_width, init): + '''Get initial weights for a wider conv2d layer with a bigger filters, + by 'random-padding' or 'net2wider'. + + # Arguments + teacher_w1: `weight` of conv2d layer to become wider, + of shape (filters1, num_channel1, kh1, kw1) + teacher_b1: `bias` of conv2d layer to become wider, + of shape (filters1, ) + teacher_w2: `weight` of next connected conv2d layer, + of shape (filters2, num_channel2, kh2, kw2) + new_width: new `filters` for the wider conv2d layer + init: initialization algorithm for new weights, + either 'random-pad' or 'net2wider' + ''' + assert teacher_w1.shape[0] == teacher_w2.shape[1], ( + 'successive layers from teacher model should have compatible shapes') + assert teacher_w1.shape[0] == teacher_b1.shape[0], ( + 'weight and bias from same layer should have compatible shapes') + assert new_width > teacher_w1.shape[0], ( + 'new width (filters) should be bigger than the existing one') + + n = new_width - teacher_w1.shape[0] + if init == 'random-pad': + new_w1 = np.random.normal(0, 0.1, size=(n, ) + teacher_w1.shape[1:]) + new_b1 = np.ones(n) * 0.1 + new_w2 = np.random.normal(0, 0.1, size=( + teacher_w2.shape[0], n) + teacher_w2.shape[2:]) + elif init == 'net2wider': + index = np.random.randint(teacher_w1.shape[0], size=n) + factors = np.bincount(index)[index] + 1. + new_w1 = teacher_w1[index, :, :, :] + new_b1 = teacher_b1[index] + new_w2 = teacher_w2[:, index, :, :] / factors.reshape((1, -1, 1, 1)) + else: + raise ValueError('Unsupported weight initializer: %s' % init) + + student_w1 = np.concatenate((teacher_w1, new_w1), axis=0) + if init == 'random-pad': + student_w2 = np.concatenate((teacher_w2, new_w2), axis=1) + elif init == 'net2wider': + # add small noise to break symmetry, so that student model will have + # full capacity later + noise = np.random.normal(0, 5e-2 * new_w2.std(), size=new_w2.shape) + student_w2 = np.concatenate((teacher_w2, new_w2 + noise), axis=1) + student_w2[:, index, :, :] = new_w2 + student_b1 = np.concatenate((teacher_b1, new_b1), axis=0) + + return student_w1, student_b1, student_w2 + + +def wider2net_fc(teacher_w1, teacher_b1, teacher_w2, new_width, init): + '''Get initial weights for a wider fully connected (dense) layer + with a bigger nout, by 'random-padding' or 'net2wider'. 
+ + # Arguments + teacher_w1: `weight` of fc layer to become wider, + of shape (nin1, nout1) + teacher_b1: `bias` of fc layer to become wider, + of shape (nout1, ) + teacher_w2: `weight` of next connected fc layer, + of shape (nin2, nout2) + new_width: new `nout` for the wider fc layer + init: initialization algorithm for new weights, + either 'random-pad' or 'net2wider' + ''' + assert teacher_w1.shape[1] == teacher_w2.shape[0], ( + 'successive layers from teacher model should have compatible shapes') + assert teacher_w1.shape[1] == teacher_b1.shape[0], ( + 'weight and bias from same layer should have compatible shapes') + assert new_width > teacher_w1.shape[1], ( + 'new width (nout) should be bigger than the existing one') + + n = new_width - teacher_w1.shape[1] + if init == 'random-pad': + new_w1 = np.random.normal(0, 0.1, size=(teacher_w1.shape[0], n)) + new_b1 = np.ones(n) * 0.1 + new_w2 = np.random.normal(0, 0.1, size=(n, teacher_w2.shape[1])) + elif init == 'net2wider': + index = np.random.randint(teacher_w1.shape[1], size=n) + factors = np.bincount(index)[index] + 1. + new_w1 = teacher_w1[:, index] + new_b1 = teacher_b1[index] + new_w2 = teacher_w2[index, :] / factors[:, np.newaxis] + else: + raise ValueError('Unsupported weight initializer: %s' % init) + + student_w1 = np.concatenate((teacher_w1, new_w1), axis=1) + if init == 'random-pad': + student_w2 = np.concatenate((teacher_w2, new_w2), axis=0) + elif init == 'net2wider': + # add small noise to break symmetry, so that student model will have + # full capacity later + noise = np.random.normal(0, 5e-2 * new_w2.std(), size=new_w2.shape) + student_w2 = np.concatenate((teacher_w2, new_w2 + noise), axis=0) + student_w2[index, :] = new_w2 + student_b1 = np.concatenate((teacher_b1, new_b1), axis=0) + + return student_w1, student_b1, student_w2 + + +def deeper2net_conv2d(teacher_w): + '''Get initial weights for a deeper conv2d layer by net2deeper'. + + # Arguments + teacher_w: `weight` of previous conv2d layer, + of shape (filters, num_channel, kh, kw) + ''' + filters, num_channel, kh, kw = teacher_w.shape + student_w = np.zeros((filters, filters, kh, kw)) + for i in xrange(filters): + student_w[i, i, (kh - 1) / 2, (kw - 1) / 2] = 1. + student_b = np.zeros(filters) + return student_w, student_b + + +def copy_weights(teacher_model, student_model, layer_names): + '''Copy weights from teacher_model to student_model, + for layers with names listed in layer_names + ''' + for name in layer_names: + weights = teacher_model.get_layer(name=name).get_weights() + student_model.get_layer(name=name).set_weights(weights) + + +# methods to construct teacher_model and student_models +def make_teacher_model(train_data, validation_data, epochs=3): + '''Train a simple CNN as teacher model. 
+ ''' + model = Sequential() + model.add(Conv2D(64, 3, input_shape=input_shape, + padding='same', name='conv1')) + model.add(MaxPooling2D(2, name='pool1')) + model.add(Conv2D(64, 3, padding='same', name='conv2')) + model.add(MaxPooling2D(2, name='pool2')) + model.add(Flatten(name='flatten')) + model.add(Dense(64, activation='relu', name='fc1')) + model.add(Dense(num_class, activation='softmax', name='fc2')) + model.compile(loss='categorical_crossentropy', + optimizer=SGD(lr=0.01, momentum=0.9), + metrics=['accuracy']) + + train_x, train_y = train_data + history = model.fit(train_x, train_y, + epochs=epochs, + validation_data=validation_data) + return model, history + + +def make_wider_student_model(teacher_model, train_data, + validation_data, init, epochs=3): + '''Train a wider student model based on teacher_model, + with either 'random-pad' (baseline) or 'net2wider' + ''' + new_conv1_width = 128 + new_fc1_width = 128 + + model = Sequential() + # a wider conv1 compared to teacher_model + model.add(Conv2D(new_conv1_width, 3, input_shape=input_shape, + padding='same', name='conv1')) + model.add(MaxPooling2D(2, name='pool1')) + model.add(Conv2D(64, 3, padding='same', name='conv2')) + model.add(MaxPooling2D(2, name='pool2')) + model.add(Flatten(name='flatten')) + # a wider fc1 compared to teacher model + model.add(Dense(new_fc1_width, activation='relu', name='fc1')) + model.add(Dense(num_class, activation='softmax', name='fc2')) + + # The weights for other layers need to be copied from teacher_model + # to student_model, except for widened layers + # and their immediate downstreams, which will be initialized separately. + # For this example there are no other layers that need to be copied. + + w_conv1, b_conv1 = teacher_model.get_layer('conv1').get_weights() + w_conv2, b_conv2 = teacher_model.get_layer('conv2').get_weights() + new_w_conv1, new_b_conv1, new_w_conv2 = wider2net_conv2d( + w_conv1, b_conv1, w_conv2, new_conv1_width, init) + model.get_layer('conv1').set_weights([new_w_conv1, new_b_conv1]) + model.get_layer('conv2').set_weights([new_w_conv2, b_conv2]) + + w_fc1, b_fc1 = teacher_model.get_layer('fc1').get_weights() + w_fc2, b_fc2 = teacher_model.get_layer('fc2').get_weights() + new_w_fc1, new_b_fc1, new_w_fc2 = wider2net_fc( + w_fc1, b_fc1, w_fc2, new_fc1_width, init) + model.get_layer('fc1').set_weights([new_w_fc1, new_b_fc1]) + model.get_layer('fc2').set_weights([new_w_fc2, b_fc2]) + + model.compile(loss='categorical_crossentropy', + optimizer=SGD(lr=0.001, momentum=0.9), + metrics=['accuracy']) + + train_x, train_y = train_data + history = model.fit(train_x, train_y, + epochs=epochs, + validation_data=validation_data) + return model, history + + +def make_deeper_student_model(teacher_model, train_data, + validation_data, init, epochs=3): + '''Train a deeper student model based on teacher_model, + with either 'random-init' (baseline) or 'net2deeper' + ''' + model = Sequential() + model.add(Conv2D(64, 3, input_shape=input_shape, + padding='same', name='conv1')) + model.add(MaxPooling2D(2, name='pool1')) + model.add(Conv2D(64, 3, padding='same', name='conv2')) + # add another conv2d layer to make original conv2 deeper + if init == 'net2deeper': + prev_w, _ = model.get_layer('conv2').get_weights() + new_weights = deeper2net_conv2d(prev_w) + model.add(Conv2D(64, 3, padding='same', + name='conv2-deeper', weights=new_weights)) + elif init == 'random-init': + model.add(Conv2D(64, 3, padding='same', name='conv2-deeper')) + else: + raise ValueError('Unsupported weight initializer: %s' % 
+                         init)
+    model.add(MaxPooling2D(2, name='pool2'))
+    model.add(Flatten(name='flatten'))
+    model.add(Dense(64, activation='relu', name='fc1'))
+    # add another fc layer to make original fc1 deeper
+    if init == 'net2deeper':
+        # net2deeper for an fc layer with relu is just an identity initializer
+        model.add(Dense(64, kernel_initializer='identity',
+                        activation='relu', name='fc1-deeper'))
+    elif init == 'random-init':
+        model.add(Dense(64, activation='relu', name='fc1-deeper'))
+    else:
+        raise ValueError('Unsupported weight initializer: %s' % init)
+    model.add(Dense(num_class, activation='softmax', name='fc2'))
+
+    # copy weights for other layers
+    copy_weights(teacher_model, model, layer_names=[
+        'conv1', 'conv2', 'fc1', 'fc2'])
+
+    model.compile(loss='categorical_crossentropy',
+                  optimizer=SGD(lr=0.001, momentum=0.9),
+                  metrics=['accuracy'])
+
+    train_x, train_y = train_data
+    history = model.fit(train_x, train_y,
+                        epochs=epochs,
+                        validation_data=validation_data)
+    return model, history
+
+
+# experiments setup
+def net2wider_experiment():
+    '''Benchmark performances of
+    (1) a teacher model,
+    (2) a wider student model with 'random-pad' initializer,
+    (3) a wider student model with 'net2wider' (Net2WiderNet) initializer
+    '''
+    train_data = (train_x, train_y)
+    validation_data = (validation_x, validation_y)
+    print('\nExperiment of Net2WiderNet ...')
+    print('\nbuilding teacher model ...')
+    teacher_model, _ = make_teacher_model(train_data,
+                                          validation_data,
+                                          epochs=3)
+
+    print('\nbuilding wider student model by random padding ...')
+    make_wider_student_model(teacher_model, train_data,
+                             validation_data, 'random-pad',
+                             epochs=3)
+    print('\nbuilding wider student model by net2wider ...')
+    make_wider_student_model(teacher_model, train_data,
+                             validation_data, 'net2wider',
+                             epochs=3)
+
+
+def net2deeper_experiment():
+    '''Benchmark performances of
+    (1) a teacher model,
+    (2) a deeper student model with 'random-init' initializer,
+    (3) a deeper student model with 'net2deeper' (Net2DeeperNet) initializer
+    '''
+    train_data = (train_x, train_y)
+    validation_data = (validation_x, validation_y)
+    print('\nExperiment of Net2DeeperNet ...')
+    print('\nbuilding teacher model ...')
+    teacher_model, _ = make_teacher_model(train_data,
+                                          validation_data,
+                                          epochs=3)
+
+    print('\nbuilding deeper student model by random init ...')
+    make_deeper_student_model(teacher_model, train_data,
+                              validation_data, 'random-init',
+                              epochs=3)
+    print('\nbuilding deeper student model by net2deeper ...')
+    make_deeper_student_model(teacher_model, train_data,
+                              validation_data, 'net2deeper',
+                              epochs=3)
+
+# run the experiments
+net2wider_experiment()
+net2deeper_experiment()
diff --git a/website/articles/examples/mnist_siamese_graph.R b/website/articles/examples/mnist_siamese_graph.R
new file mode 100644
index 000000000..0ab13681d
--- /dev/null
+++ b/website/articles/examples/mnist_siamese_graph.R
@@ -0,0 +1 @@
+library(keras)
diff --git a/website/articles/examples/mnist_siamese_graph.html b/website/articles/examples/mnist_siamese_graph.html
new file mode 100644
index 000000000..2c9207867
--- /dev/null
+++ b/website/articles/examples/mnist_siamese_graph.html
@@ -0,0 +1,137 @@
+mnist_siamese_graph • keras
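Before moving on to the Siamese example: the Net2WiderNet construction above translates naturally to R. Below is a minimal sketch for the fully connected case, mirroring wider2net_fc(); the helper name and the plain-matrix interface are assumptions for illustration, not code from this PR.

# Minimal R sketch of Net2WiderNet for a dense layer (hypothetical helper).
# w1/b1: weights and bias of the layer to widen; w2: weights of the next
# layer; new_width: the new number of units.
wider2net_fc_r <- function(w1, b1, w2, new_width) {
  stopifnot(new_width > ncol(w1))
  n <- new_width - ncol(w1)
  idx <- sample(ncol(w1), n, replace = TRUE)
  # replication count of each sampled unit (the original copy counts too)
  factors <- tabulate(idx, nbins = ncol(w1))[idx] + 1
  new_w2 <- w2[idx, , drop = FALSE] / factors
  student_w2 <- rbind(w2, new_w2)
  student_w2[idx, ] <- new_w2  # rescale the rows whose units were duplicated
  list(w1 = cbind(w1, w1[, idx, drop = FALSE]),
       b1 = c(b1, b1[idx]),
       w2 = student_w2)
}

The deepening counterpart is even simpler in R: for a relu dense layer, layer_dense(units = 64, kernel_initializer = initializer_identity(), activation = "relu") reproduces the net2deeper identity trick.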
+ + + diff --git a/website/articles/examples/mnist_siamese_graph.py b/website/articles/examples/mnist_siamese_graph.py new file mode 100755 index 000000000..7448e7fd6 --- /dev/null +++ b/website/articles/examples/mnist_siamese_graph.py @@ -0,0 +1,131 @@ +'''Train a Siamese MLP on pairs of digits from the MNIST dataset. + +It follows Hadsell-et-al.'06 [1] by computing the Euclidean distance on the +output of the shared network and by optimizing the contrastive loss (see paper +for mode details). + +[1] "Dimensionality Reduction by Learning an Invariant Mapping" + http://yann.lecun.com/exdb/publis/pdf/hadsell-chopra-lecun-06.pdf + +Gets to 99.5% test accuracy after 20 epochs. +3 seconds per epoch on a Titan X GPU +''' +from __future__ import absolute_import +from __future__ import print_function +import numpy as np + +import random +from keras.datasets import mnist +from keras.models import Sequential, Model +from keras.layers import Dense, Dropout, Input, Lambda +from keras.optimizers import RMSprop +from keras import backend as K + + +def euclidean_distance(vects): + x, y = vects + return K.sqrt(K.maximum(K.sum(K.square(x - y), axis=1, keepdims=True), K.epsilon())) + + +def eucl_dist_output_shape(shapes): + shape1, shape2 = shapes + return (shape1[0], 1) + + +def contrastive_loss(y_true, y_pred): + '''Contrastive loss from Hadsell-et-al.'06 + http://yann.lecun.com/exdb/publis/pdf/hadsell-chopra-lecun-06.pdf + ''' + margin = 1 + return K.mean(y_true * K.square(y_pred) + + (1 - y_true) * K.square(K.maximum(margin - y_pred, 0))) + + +def create_pairs(x, digit_indices): + '''Positive and negative pair creation. + Alternates between positive and negative pairs. + ''' + pairs = [] + labels = [] + n = min([len(digit_indices[d]) for d in range(10)]) - 1 + for d in range(10): + for i in range(n): + z1, z2 = digit_indices[d][i], digit_indices[d][i + 1] + pairs += [[x[z1], x[z2]]] + inc = random.randrange(1, 10) + dn = (d + inc) % 10 + z1, z2 = digit_indices[d][i], digit_indices[dn][i] + pairs += [[x[z1], x[z2]]] + labels += [1, 0] + return np.array(pairs), np.array(labels) + + +def create_base_network(input_dim): + '''Base network to be shared (eq. to feature extraction). + ''' + seq = Sequential() + seq.add(Dense(128, input_shape=(input_dim,), activation='relu')) + seq.add(Dropout(0.1)) + seq.add(Dense(128, activation='relu')) + seq.add(Dropout(0.1)) + seq.add(Dense(128, activation='relu')) + return seq + + +def compute_accuracy(predictions, labels): + '''Compute classification accuracy with a fixed threshold on distances. 
+ ''' + return labels[predictions.ravel() < 0.5].mean() + + +# the data, shuffled and split between train and test sets +(x_train, y_train), (x_test, y_test) = mnist.load_data() +x_train = x_train.reshape(60000, 784) +x_test = x_test.reshape(10000, 784) +x_train = x_train.astype('float32') +x_test = x_test.astype('float32') +x_train /= 255 +x_test /= 255 +input_dim = 784 +epochs = 20 + +# create training+test positive and negative pairs +digit_indices = [np.where(y_train == i)[0] for i in range(10)] +tr_pairs, tr_y = create_pairs(x_train, digit_indices) + +digit_indices = [np.where(y_test == i)[0] for i in range(10)] +te_pairs, te_y = create_pairs(x_test, digit_indices) + +# network definition +base_network = create_base_network(input_dim) + +input_a = Input(shape=(input_dim,)) +input_b = Input(shape=(input_dim,)) + +# because we re-use the same instance `base_network`, +# the weights of the network +# will be shared across the two branches +processed_a = base_network(input_a) +processed_b = base_network(input_b) + +distance = Lambda(euclidean_distance, + output_shape=eucl_dist_output_shape)([processed_a, processed_b]) + +model = Model([input_a, input_b], distance) + +# train +rms = RMSprop() +model.compile(loss=contrastive_loss, optimizer=rms) +model.fit([tr_pairs[:, 0], tr_pairs[:, 1]], tr_y, + batch_size=128, + epochs=epochs, + validation_data=([te_pairs[:, 0], te_pairs[:, 1]], te_y)) + +# compute final accuracy on training and test sets +pred = model.predict([tr_pairs[:, 0], tr_pairs[:, 1]]) +tr_acc = compute_accuracy(pred, tr_y) +pred = model.predict([te_pairs[:, 0], te_pairs[:, 1]]) +te_acc = compute_accuracy(pred, te_y) + +print('* Accuracy on training set: %0.2f%%' % (100 * tr_acc)) +print('* Accuracy on test set: %0.2f%%' % (100 * te_acc)) diff --git a/website/articles/examples/mnist_swwae.R b/website/articles/examples/mnist_swwae.R new file mode 100644 index 000000000..0ab13681d --- /dev/null +++ b/website/articles/examples/mnist_swwae.R @@ -0,0 +1 @@ +library(keras) diff --git a/website/articles/examples/mnist_swwae.html b/website/articles/examples/mnist_swwae.html new file mode 100644 index 000000000..735070c55 --- /dev/null +++ b/website/articles/examples/mnist_swwae.html @@ -0,0 +1,137 @@ + + + + + + + +mnist_swwae • keras + + + + + + + +
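The mnist_siamese_graph.R port above still contains only library(keras). As a starting point, the contrastive loss from mnist_siamese_graph.py carries over almost verbatim through the backend() API used elsewhere on this site (a sketch, not the finished port):

library(keras)
K <- backend()

# contrastive loss from Hadsell et al. '06, margin fixed at 1,
# mirroring contrastive_loss() in the Python script above
contrastive_loss <- function(y_true, y_pred) {
  margin <- 1
  K$mean(y_true * K$square(y_pred) +
           (1 - y_true) * K$square(K$maximum(margin - y_pred, 0)))
}

A custom loss defined this way can be passed straight to compile(), e.g. compile(model, loss = contrastive_loss, optimizer = optimizer_rmsprop()).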
+ + + diff --git a/website/articles/examples/mnist_swwae.py b/website/articles/examples/mnist_swwae.py new file mode 100755 index 000000000..80b34e216 --- /dev/null +++ b/website/articles/examples/mnist_swwae.py @@ -0,0 +1,203 @@ +'''Trains a stacked what-where autoencoder built on residual blocks on the +MNIST dataset. It exemplifies two influential methods that have been developed +in the past few years. + +The first is the idea of properly 'unpooling.' During any max pool, the +exact location (the 'where') of the maximal value in a pooled receptive field +is lost, however it can be very useful in the overall reconstruction of an +input image. Therefore, if the 'where' is handed from the encoder +to the corresponding decoder layer, features being decoded can be 'placed' in +the right location, allowing for reconstructions of much higher fidelity. + +References: +[1] +'Visualizing and Understanding Convolutional Networks' +Matthew D Zeiler, Rob Fergus +https://arxiv.org/abs/1311.2901v3 + +[2] +'Stacked What-Where Auto-encoders' +Junbo Zhao, Michael Mathieu, Ross Goroshin, Yann LeCun +https://arxiv.org/abs/1506.02351v8 + +The second idea exploited here is that of residual learning. Residual blocks +ease the training process by allowing skip connections that give the network +the ability to be as linear (or non-linear) as the data sees fit. This allows +for much deep networks to be easily trained. The residual element seems to +be advantageous in the context of this example as it allows a nice symmetry +between the encoder and decoder. Normally, in the decoder, the final +projection to the space where the image is reconstructed is linear, however +this does not have to be the case for a residual block as the degree to which +its output is linear or non-linear is determined by the data it is fed. +However, in order to cap the reconstruction in this example, a hard softmax is +applied as a bias because we know the MNIST digits are mapped to [0,1]. + +References: +[3] +'Deep Residual Learning for Image Recognition' +Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun +https://arxiv.org/abs/1512.03385v1 + +[4] +'Identity Mappings in Deep Residual Networks' +Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun +https://arxiv.org/abs/1603.05027v3 + +''' +from __future__ import print_function +import numpy as np + +from keras.datasets import mnist +from keras.models import Model +from keras.layers import Activation +from keras.layers import UpSampling2D, Conv2D, MaxPooling2D +from keras.layers import Input, BatchNormalization, ELU +import matplotlib.pyplot as plt +import keras.backend as K +from keras import layers + + +def convresblock(x, nfeats=8, ksize=3, nskipped=2, elu=True): + """The proposed residual block from [4]. + + Running with elu=True will use ELU nonlinearity and running with + elu=False will use BatchNorm + RELU nonlinearity. While ELU's are fast + due to the fact they do not suffer from BatchNorm overhead, they may + overfit because they do not offer the stochastic element of the batch + formation process of BatchNorm, which acts as a good regularizer. + + # Arguments + x: 4D tensor, the tensor to feed through the block + nfeats: Integer, number of feature maps for conv layers. + ksize: Integer, width and height of conv kernels in first convolution. + nskipped: Integer, number of conv layers for the residual function. + elu: Boolean, whether to use ELU or BN+RELU. 
+ + # Input shape + 4D tensor with shape: + `(batch, channels, rows, cols)` + + # Output shape + 4D tensor with shape: + `(batch, filters, rows, cols)` + """ + y0 = Conv2D(nfeats, ksize, padding='same')(x) + y = y0 + for i in range(nskipped): + if elu: + y = ELU()(y) + else: + y = BatchNormalization(axis=1)(y) + y = Activation('relu')(y) + y = Conv2D(nfeats, 1, padding='same')(y) + return layers.add([y0, y]) + + +def getwhere(x): + ''' Calculate the 'where' mask that contains switches indicating which + index contained the max value when MaxPool2D was applied. Using the + gradient of the sum is a nice trick to keep everything high level.''' + y_prepool, y_postpool = x + return K.gradients(K.sum(y_postpool), y_prepool) + +if K.backend() == 'tensorflow': + raise RuntimeError('This example can only run with the ' + 'Theano backend for the time being, ' + 'because it requires taking the gradient ' + 'of a gradient, which isn\'t ' + 'supported for all TF ops.') + +# This example assume 'channels_first' data format. +K.set_image_data_format('channels_first') + +# input image dimensions +img_rows, img_cols = 28, 28 + +# the data, shuffled and split between train and test sets +(x_train, _), (x_test, _) = mnist.load_data() + +x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols) +x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols) +x_train = x_train.astype('float32') +x_test = x_test.astype('float32') +x_train /= 255 +x_test /= 255 +print('x_train shape:', x_train.shape) +print(x_train.shape[0], 'train samples') +print(x_test.shape[0], 'test samples') + +# The size of the kernel used for the MaxPooling2D +pool_size = 2 +# The total number of feature maps at each layer +nfeats = [8, 16, 32, 64, 128] +# The sizes of the pooling kernel at each layer +pool_sizes = np.array([1, 1, 1, 1, 1]) * pool_size +# The convolution kernel size +ksize = 3 +# Number of epochs to train for +epochs = 5 +# Batch size during training +batch_size = 128 + +if pool_size == 2: + # if using a 5 layer net of pool_size = 2 + x_train = np.pad(x_train, [[0, 0], [0, 0], [2, 2], [2, 2]], + mode='constant') + x_test = np.pad(x_test, [[0, 0], [0, 0], [2, 2], [2, 2]], mode='constant') + nlayers = 5 +elif pool_size == 3: + # if using a 3 layer net of pool_size = 3 + x_train = x_train[:, :, :-1, :-1] + x_test = x_test[:, :, :-1, :-1] + nlayers = 3 +else: + import sys + sys.exit('Script supports pool_size of 2 and 3.') + +# Shape of input to train on (note that model is fully convolutional however) +input_shape = x_train.shape[1:] +# The final list of the size of axis=1 for all layers, including input +nfeats_all = [input_shape[0]] + nfeats + +# First build the encoder, all the while keeping track of the 'where' masks +img_input = Input(shape=input_shape) + +# We push the 'where' masks to the following list +wheres = [None] * nlayers +y = img_input +for i in range(nlayers): + y_prepool = convresblock(y, nfeats=nfeats_all[i + 1], ksize=ksize) + y = MaxPooling2D(pool_size=(pool_sizes[i], pool_sizes[i]))(y_prepool) + wheres[i] = layers.Lambda( + getwhere, output_shape=lambda x: x[0])([y_prepool, y]) + +# Now build the decoder, and use the stored 'where' masks to place the features +for i in range(nlayers): + ind = nlayers - 1 - i + y = UpSampling2D(size=(pool_sizes[ind], pool_sizes[ind]))(y) + y = layers.multiply([y, wheres[ind]]) + y = convresblock(y, nfeats=nfeats_all[ind], ksize=ksize) + +# Use hard_simgoid to clip range of reconstruction +y = Activation('hard_sigmoid')(y) + +# Define the model and it's mean 
square error loss, and compile it with Adam +model = Model(img_input, y) +model.compile('adam', 'mse') + +# Fit the model +model.fit(x_train, x_train, + batch_size=batch_size, + epochs=epochs, + validation_data=(x_test, x_test)) + +# Plot +x_recon = model.predict(x_test[:25]) +x_plot = np.concatenate((x_test[:25], x_recon), axis=1) +x_plot = x_plot.reshape((5, 10, input_shape[-2], input_shape[-1])) +x_plot = np.vstack([np.hstack(x) for x in x_plot]) +plt.figure() +plt.axis('off') +plt.title('Test Samples: Originals/Reconstructions') +plt.imshow(x_plot, interpolation='none', cmap='gray') +plt.savefig('reconstructions.png') diff --git a/website/articles/examples/mnist_transfer_cnn.R b/website/articles/examples/mnist_transfer_cnn.R new file mode 100644 index 000000000..de42bdf58 --- /dev/null +++ b/website/articles/examples/mnist_transfer_cnn.R @@ -0,0 +1,87 @@ +#' Transfer learning toy example: +#' +#' 1) Train a simple convnet on the MNIST dataset the first 5 digits [0..4]. +#' 2) Freeze convolutional layers and fine-tune dense layers +#' for the classification of digits [5..9]. +#' + +library(keras) + +now <- Sys.time() + +batch_size <- 128 +num_classes <- 5 +epochs <- 5 + +# input image dimensions +img_rows <- 28 +img_cols <- 28 + +# number of convolutional filters to use +filters <- 32 + +# size of pooling area for max pooling +pool_size <- 2 + +# convolution kernel size +kernel_size <- c(3, 3) + +# input shape +input_shape <- c(img_rows, img_cols, 1) + +# the data, shuffled and split between train and test sets +data <- dataset_mnist() +x_train <- data$train$x +y_train <- data$train$y +x_test <- data$test$x +y_test <- data$test$y + +# create two datasets one with digits below 5 and one with 5 and above +x_train_lt5 <- x_train[y_train < 5] +y_train_lt5 <- y_train[y_train < 5] +x_test_lt5 <- x_test[y_test < 5] +y_test_lt5 <- y_test[y_test < 5] + +x_train_gte5 <- x_train[y_train >= 5] +y_train_gte5 <- y_train[y_train >= 5] - 5 +x_test_gte5 <- x_test[y_test >= 5] +y_test_gte5 <- y_test[y_test >= 5] - 5 + +# define two groups of layers: feature (convolutions) and classification (dense) +feature_layers <- + layer_conv_2d(filters = filters, kernel_size = kernel_size, + input_shape = input_shape) %>% + layer_activation(activation = 'relu') %>% + layer_conv_2d(filters = filters, kernel_size = kernel_size) %>% + layer_activation(activation = 'relu') %>% + layer_max_pooling_2d(pool_size = pool_size) %>% + layer_dropout(rate = 0.25) %>% + layer_flatten() + + + +# feature_layers = [ +# Conv2D(filters, kernel_size, +# padding='valid', +# input_shape=input_shape), +# Activation('relu'), +# Conv2D(filters, kernel_size), +# Activation('relu'), +# MaxPooling2D(pool_size=pool_size), +# Dropout(0.25), +# Flatten(), +# ] +# +# classification_layers = [ +# Dense(128), +# Activation('relu'), +# Dropout(0.5), +# Dense(num_classes), +# Activation('softmax') +# ] + + + + + + diff --git a/website/articles/examples/mnist_transfer_cnn.html b/website/articles/examples/mnist_transfer_cnn.html new file mode 100644 index 000000000..6da655d0e --- /dev/null +++ b/website/articles/examples/mnist_transfer_cnn.html @@ -0,0 +1,215 @@ + + + + + + + +mnist_transfer_cnn • keras + + + + + + + +
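mnist_transfer_cnn.R above stops after defining the feature layers. A hedged sketch of the freeze-and-fine-tune step it is working towards follows; the layer count, the optimizer, and the helper itself are assumptions about how the port could be completed:

library(keras)

# freeze the convolutional feature layers of a fitted model, then
# recompile and fit on the second task (digits 5..9 shifted to 0..4)
freeze_and_finetune <- function(model, x, y, n_feature_layers = 7) {
  for (layer in model$layers[1:n_feature_layers])
    layer$trainable <- FALSE
  model %>% compile(
    loss = 'categorical_crossentropy',
    optimizer = 'adadelta',
    metrics = c('accuracy')
  )
  model %>% fit(x, y, batch_size = 128, epochs = 5)
  model
}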
Transfer learning toy example:

  1. Train a simple convnet on the MNIST dataset for the first 5 digits [0..4].
  2. Freeze the convolutional layers and fine-tune the dense layers for the classification of digits [5..9].
library(keras)
+
+now <- Sys.time()
+
+batch_size <- 128
+num_classes <- 5
+epochs <- 5
+
+# input image dimensions
+img_rows <- 28
+img_cols <- 28
+
+# number of convolutional filters to use
+filters <- 32
+
+# size of pooling area for max pooling
+pool_size <- 2
+
+# convolution kernel size
+kernel_size <- c(3, 3)
+
+# input shape
+input_shape <- c(img_rows, img_cols, 1)
+
+# the data, shuffled and split between train and test sets
+data <- dataset_mnist()
+x_train <- data$train$x
+y_train <- data$train$y
+x_test <- data$test$x
+y_test <- data$test$y
+
+# create two datasets one with digits below 5 and one with 5 and above
+x_train_lt5 <- x_train[y_train < 5, , ]
+y_train_lt5 <- y_train[y_train < 5]
+x_test_lt5 <- x_test[y_test < 5, , ]
+y_test_lt5 <- y_test[y_test < 5]
+
+x_train_gte5 <- x_train[y_train >= 5, , ]
+y_train_gte5 <- y_train[y_train >= 5] - 5
+x_test_gte5 <- x_test[y_test >= 5, , ]
+y_test_gte5 <- y_test[y_test >= 5] - 5
+
+# define two groups of layers: feature (convolutions) and classification (dense)
+feature_layers <- 
+  layer_conv_2d(filters = filters, kernel_size = kernel_size, 
+                input_shape = input_shape) %>% 
+  layer_activation(activation = 'relu') %>% 
+  layer_conv_2d(filters = filters, kernel_size = kernel_size) %>% 
+  layer_activation(activation = 'relu') %>% 
+  layer_max_pooling_2d(pool_size = pool_size) %>% 
+  layer_dropout(rate = 0.25) %>% 
+  layer_flatten()
+  
+
+
+# feature_layers = [
+#   Conv2D(filters, kernel_size,
+#          padding='valid',
+#          input_shape=input_shape),
+#   Activation('relu'),
+#   Conv2D(filters, kernel_size),
+#   Activation('relu'),
+#   MaxPooling2D(pool_size=pool_size),
+#   Dropout(0.25),
+#   Flatten(),
+#   ]
+# 
+# classification_layers = [
+#   Dense(128),
+#   Activation('relu'),
+#   Dropout(0.5),
+#   Dense(num_classes),
+#   Activation('softmax')
+#   ]
+
+
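A hedged R translation of the commented-out classification_layers block above (an assumption about how the port continues; num_classes comes from earlier in the script):

classification_layers <- function(object) {
  object %>%
    layer_dense(units = 128) %>%
    layer_activation(activation = 'relu') %>%
    layer_dropout(rate = 0.5) %>%
    layer_dense(units = num_classes) %>%
    layer_activation(activation = 'softmax')
}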
+ + + diff --git a/website/articles/examples/neural-style-base-img.png b/website/articles/examples/neural-style-base-img.png new file mode 100644 index 000000000..bc0454ae3 Binary files /dev/null and b/website/articles/examples/neural-style-base-img.png differ diff --git a/website/articles/examples/neural-style-style.jpg b/website/articles/examples/neural-style-style.jpg new file mode 100644 index 000000000..633282da5 Binary files /dev/null and b/website/articles/examples/neural-style-style.jpg differ diff --git a/website/articles/examples/neural_doodle.py b/website/articles/examples/neural_doodle.py new file mode 100755 index 000000000..c4133d8fe --- /dev/null +++ b/website/articles/examples/neural_doodle.py @@ -0,0 +1,366 @@ +'''Neural doodle with Keras + +Script Usage: + # Arguments: + ``` + --nlabels: # of regions (colors) in mask images + --style-image: image to learn style from + --style-mask: semantic labels for style image + --target-mask: semantic labels for target image (your doodle) + --content-image: optional image to learn content from + --target-image-prefix: path prefix for generated target images + ``` + + # Example 1: doodle using a style image, style mask + and target mask. + ``` + python neural_doodle.py --nlabels 4 --style-image Monet/style.png \ + --style-mask Monet/style_mask.png --target-mask Monet/target_mask.png \ + --target-image-prefix generated/monet + ``` + + # Example 2: doodle using a style image, style mask, + target mask and an optional content image. + ``` + python neural_doodle.py --nlabels 4 --style-image Renoir/style.png \ + --style-mask Renoir/style_mask.png --target-mask Renoir/target_mask.png \ + --content-image Renoir/creek.jpg \ + --target-image-prefix generated/renoir + ``` + +References: +[Dmitry Ulyanov's blog on fast-neural-doodle](http://dmitryulyanov.github.io/feed-forward-neural-doodle/) +[Torch code for fast-neural-doodle](https://github.com/DmitryUlyanov/fast-neural-doodle) +[Torch code for online-neural-doodle](https://github.com/DmitryUlyanov/online-neural-doodle) +[Paper Texture Networks: Feed-forward Synthesis of Textures and Stylized Images](http://arxiv.org/abs/1603.03417) +[Discussion on parameter tuning](https://github.com/fchollet/keras/issues/3705) + +Resources: +Example images can be downloaded from +https://github.com/DmitryUlyanov/fast-neural-doodle/tree/master/data +''' +from __future__ import print_function +import time +import argparse +import numpy as np +from scipy.optimize import fmin_l_bfgs_b +from scipy.misc import imread, imsave + +from keras import backend as K +from keras.layers import Input, AveragePooling2D +from keras.models import Model +from keras.preprocessing.image import load_img, img_to_array +from keras.applications import vgg19 + +# Command line arguments +parser = argparse.ArgumentParser(description='Keras neural doodle example') +parser.add_argument('--nlabels', type=int, + help='number of semantic labels' + ' (regions in differnet colors)' + ' in style_mask/target_mask') +parser.add_argument('--style-image', type=str, + help='path to image to learn style from') +parser.add_argument('--style-mask', type=str, + help='path to semantic mask of style image') +parser.add_argument('--target-mask', type=str, + help='path to semantic mask of target image') +parser.add_argument('--content-image', type=str, default=None, + help='path to optional content image') +parser.add_argument('--target-image-prefix', type=str, + help='path prefix for generated results') +args = parser.parse_args() + +style_img_path = 
args.style_image +style_mask_path = args.style_mask +target_mask_path = args.target_mask +content_img_path = args.content_image +target_img_prefix = args.target_image_prefix +use_content_img = content_img_path is not None + +num_labels = args.nlabels +num_colors = 3 # RGB +# determine image sizes based on target_mask +ref_img = imread(target_mask_path) +img_nrows, img_ncols = ref_img.shape[:2] + +total_variation_weight = 50. +style_weight = 1. +content_weight = 0.1 if use_content_img else 0 + +content_feature_layers = ['block5_conv2'] +# To get better generation qualities, use more conv layers for style features +style_feature_layers = ['block1_conv1', 'block2_conv1', 'block3_conv1', + 'block4_conv1', 'block5_conv1'] + + +# helper functions for reading/processing images +def preprocess_image(image_path): + img = load_img(image_path, target_size=(img_nrows, img_ncols)) + img = img_to_array(img) + img = np.expand_dims(img, axis=0) + img = vgg19.preprocess_input(img) + return img + + +def deprocess_image(x): + if K.image_data_format() == 'channels_first': + x = x.reshape((3, img_nrows, img_ncols)) + x = x.transpose((1, 2, 0)) + else: + x = x.reshape((img_nrows, img_ncols, 3)) + # Remove zero-center by mean pixel + x[:, :, 0] += 103.939 + x[:, :, 1] += 116.779 + x[:, :, 2] += 123.68 + # 'BGR'->'RGB' + x = x[:, :, ::-1] + x = np.clip(x, 0, 255).astype('uint8') + return x + + +def kmeans(xs, k): + assert xs.ndim == 2 + try: + from sklearn.cluster import k_means + _, labels, _ = k_means(xs.astype('float64'), k) + except ImportError: + from scipy.cluster.vq import kmeans2 + _, labels = kmeans2(xs, k, missing='raise') + return labels + + +def load_mask_labels(): + '''Load both target and style masks. + A mask image (nr x nc) with m labels/colors will be loaded + as a 4D boolean tensor: (1, m, nr, nc) for 'channels_first' or (1, nr, nc, m) for 'channels_last' + ''' + target_mask_img = load_img(target_mask_path, + target_size=(img_nrows, img_ncols)) + target_mask_img = img_to_array(target_mask_img) + style_mask_img = load_img(style_mask_path, + target_size=(img_nrows, img_ncols)) + style_mask_img = img_to_array(style_mask_img) + if K.image_data_format() == 'channels_first': + mask_vecs = np.vstack([style_mask_img.reshape((3, -1)).T, + target_mask_img.reshape((3, -1)).T]) + else: + mask_vecs = np.vstack([style_mask_img.reshape((-1, 3)), + target_mask_img.reshape((-1, 3))]) + + labels = kmeans(mask_vecs, num_labels) + style_mask_label = labels[:img_nrows * + img_ncols].reshape((img_nrows, img_ncols)) + target_mask_label = labels[img_nrows * + img_ncols:].reshape((img_nrows, img_ncols)) + + stack_axis = 0 if K.image_data_format() == 'channels_first' else -1 + style_mask = np.stack([style_mask_label == r for r in xrange(num_labels)], + axis=stack_axis) + target_mask = np.stack([target_mask_label == r for r in xrange(num_labels)], + axis=stack_axis) + + return (np.expand_dims(style_mask, axis=0), + np.expand_dims(target_mask, axis=0)) + +# Create tensor variables for images +if K.image_data_format() == 'channels_first': + shape = (1, num_colors, img_nrows, img_ncols) +else: + shape = (1, img_nrows, img_ncols, num_colors) + +style_image = K.variable(preprocess_image(style_img_path)) +target_image = K.placeholder(shape=shape) +if use_content_img: + content_image = K.variable(preprocess_image(content_img_path)) +else: + content_image = K.zeros(shape=shape) + +images = K.concatenate([style_image, target_image, content_image], axis=0) + +# Create tensor variables for masks +raw_style_mask, raw_target_mask = 
load_mask_labels() +style_mask = K.variable(raw_style_mask.astype('float32')) +target_mask = K.variable(raw_target_mask.astype('float32')) +masks = K.concatenate([style_mask, target_mask], axis=0) + +# index constants for images and tasks variables +STYLE, TARGET, CONTENT = 0, 1, 2 + +# Build image model, mask model and use layer outputs as features +# image model as VGG19 +image_model = vgg19.VGG19(include_top=False, input_tensor=images) + +# mask model as a series of pooling +mask_input = Input(tensor=masks, shape=(None, None, None), name='mask_input') +x = mask_input +for layer in image_model.layers[1:]: + name = 'mask_%s' % layer.name + if 'conv' in layer.name: + x = AveragePooling2D((3, 3), strides=( + 1, 1), name=name, border_mode='same')(x) + elif 'pool' in layer.name: + x = AveragePooling2D((2, 2), name=name)(x) +mask_model = Model(mask_input, x) + +# Collect features from image_model and task_model +image_features = {} +mask_features = {} +for img_layer, mask_layer in zip(image_model.layers, mask_model.layers): + if 'conv' in img_layer.name: + assert 'mask_' + img_layer.name == mask_layer.name + layer_name = img_layer.name + img_feat, mask_feat = img_layer.output, mask_layer.output + image_features[layer_name] = img_feat + mask_features[layer_name] = mask_feat + + +# Define loss functions +def gram_matrix(x): + assert K.ndim(x) == 3 + features = K.batch_flatten(x) + gram = K.dot(features, K.transpose(features)) + return gram + + +def region_style_loss(style_image, target_image, style_mask, target_mask): + '''Calculate style loss between style_image and target_image, + for one common region specified by their (boolean) masks + ''' + assert 3 == K.ndim(style_image) == K.ndim(target_image) + assert 2 == K.ndim(style_mask) == K.ndim(target_mask) + if K.image_data_format() == 'channels_first': + masked_style = style_image * style_mask + masked_target = target_image * target_mask + num_channels = K.shape(style_image)[0] + else: + masked_style = K.permute_dimensions( + style_image, (2, 0, 1)) * style_mask + masked_target = K.permute_dimensions( + target_image, (2, 0, 1)) * target_mask + num_channels = K.shape(style_image)[-1] + s = gram_matrix(masked_style) / K.mean(style_mask) / num_channels + c = gram_matrix(masked_target) / K.mean(target_mask) / num_channels + return K.mean(K.square(s - c)) + + +def style_loss(style_image, target_image, style_masks, target_masks): + '''Calculate style loss between style_image and target_image, + in all regions. 
+ ''' + assert 3 == K.ndim(style_image) == K.ndim(target_image) + assert 3 == K.ndim(style_masks) == K.ndim(target_masks) + loss = K.variable(0) + for i in xrange(num_labels): + if K.image_data_format() == 'channels_first': + style_mask = style_masks[i, :, :] + target_mask = target_masks[i, :, :] + else: + style_mask = style_masks[:, :, i] + target_mask = target_masks[:, :, i] + loss += region_style_loss(style_image, + target_image, style_mask, target_mask) + return loss + + +def content_loss(content_image, target_image): + return K.sum(K.square(target_image - content_image)) + + +def total_variation_loss(x): + assert 4 == K.ndim(x) + if K.image_data_format() == 'channels_first': + a = K.square(x[:, :, :img_nrows - 1, :img_ncols - 1] - + x[:, :, 1:, :img_ncols - 1]) + b = K.square(x[:, :, :img_nrows - 1, :img_ncols - 1] - + x[:, :, :img_nrows - 1, 1:]) + else: + a = K.square(x[:, :img_nrows - 1, :img_ncols - 1, :] - + x[:, 1:, :img_ncols - 1, :]) + b = K.square(x[:, :img_nrows - 1, :img_ncols - 1, :] - + x[:, :img_nrows - 1, 1:, :]) + return K.sum(K.pow(a + b, 1.25)) + +# Overall loss is the weighted sum of content_loss, style_loss and tv_loss +# Each individual loss uses features from image/mask models. +loss = K.variable(0) +for layer in content_feature_layers: + content_feat = image_features[layer][CONTENT, :, :, :] + target_feat = image_features[layer][TARGET, :, :, :] + loss += content_weight * content_loss(content_feat, target_feat) + +for layer in style_feature_layers: + style_feat = image_features[layer][STYLE, :, :, :] + target_feat = image_features[layer][TARGET, :, :, :] + style_masks = mask_features[layer][STYLE, :, :, :] + target_masks = mask_features[layer][TARGET, :, :, :] + sl = style_loss(style_feat, target_feat, style_masks, target_masks) + loss += (style_weight / len(style_feature_layers)) * sl + +loss += total_variation_weight * total_variation_loss(target_image) +loss_grads = K.gradients(loss, target_image) + +# Evaluator class for computing efficiency +outputs = [loss] +if isinstance(loss_grads, (list, tuple)): + outputs += loss_grads +else: + outputs.append(loss_grads) + +f_outputs = K.function([target_image], outputs) + + +def eval_loss_and_grads(x): + if K.image_data_format() == 'channels_first': + x = x.reshape((1, 3, img_nrows, img_ncols)) + else: + x = x.reshape((1, img_nrows, img_ncols, 3)) + outs = f_outputs([x]) + loss_value = outs[0] + if len(outs[1:]) == 1: + grad_values = outs[1].flatten().astype('float64') + else: + grad_values = np.array(outs[1:]).flatten().astype('float64') + return loss_value, grad_values + + +class Evaluator(object): + + def __init__(self): + self.loss_value = None + self.grads_values = None + + def loss(self, x): + assert self.loss_value is None + loss_value, grad_values = eval_loss_and_grads(x) + self.loss_value = loss_value + self.grad_values = grad_values + return self.loss_value + + def grads(self, x): + assert self.loss_value is not None + grad_values = np.copy(self.grad_values) + self.loss_value = None + self.grad_values = None + return grad_values + +evaluator = Evaluator() + +# Generate images by iterative optimization +if K.image_data_format() == 'channels_first': + x = np.random.uniform(0, 255, (1, 3, img_nrows, img_ncols)) - 128. +else: + x = np.random.uniform(0, 255, (1, img_nrows, img_ncols, 3)) - 128. 
+ +for i in range(50): + print('Start of iteration', i) + start_time = time.time() + x, min_val, info = fmin_l_bfgs_b(evaluator.loss, x.flatten(), + fprime=evaluator.grads, maxfun=20) + print('Current loss value:', min_val) + # save current generated image + img = deprocess_image(x.copy()) + fname = target_img_prefix + '_at_iteration_%d.png' % i + imsave(fname, img) + end_time = time.time() + print('Image saved as', fname) + print('Iteration %d completed in %ds' % (i, end_time - start_time)) diff --git a/website/articles/examples/neural_style_transfer.R b/website/articles/examples/neural_style_transfer.R new file mode 100644 index 000000000..fd1a179a2 --- /dev/null +++ b/website/articles/examples/neural_style_transfer.R @@ -0,0 +1,256 @@ +#' Neural style transfer with Keras. +#' +#' It is preferable to run this script on a GPU, for speed. +#' +#' Example result: https://twitter.com/fchollet/status/686631033085677568 +#' +#' Style transfer consists in generating an image +#' with the same "content" as a base image, but with the +#' "style" of a different picture (typically artistic). +#' +#' This is achieved through the optimization of a loss function +#' that has 3 components: "style loss", "content loss", +#' and "total variation loss": +#' +#' - The total variation loss imposes local spatial continuity between +#' the pixels of the combination image, giving it visual coherence. +#' +#' - The style loss is where the deep learning keeps in --that one is defined +#' using a deep convolutional neural network. Precisely, it consists in a sum of +#' L2 distances between the Gram matrices of the representations of +#' the base image and the style reference image, extracted from +#' different layers of a convnet (trained on ImageNet). The general idea +#' is to capture color/texture information at different spatial +#' scales (fairly large scales --defined by the depth of the layer considered). +#' +#' - The content loss is a L2 distance between the features of the base +#' image (extracted from a deep layer) and the features of the combination image, +#' keeping the generated image close enough to the original one. +#' + +library(keras) +library(purrr) +library(R6) +K <- backend() + +# Parameters -------------------------------------------------------------- + +base_image_path <- "neural-style-base-img.png" +style_reference_image_path <- "neural-style-style.jpg" +iterations <- 10 + +# these are the weights of the different loss components +total_variation_weight <- 1 +style_weight <- 1 +content_weight <- 0.025 + +# dimensions of the generated picture. 
+img <- image_load(base_image_path) +width <- img$size[[1]] +height <- img$size[[2]] +img_nrows <- 400 +img_ncols <- as.integer(width * img_nrows / height) + + +# Functions --------------------------------------------------------------- + +# util function to open, resize and format pictures into appropriate tensors +preprocess_image <- function(path){ + img <- image_load(path, target_size = c(img_nrows, img_ncols)) %>% + image_to_array() + dim(img) <- c(1, dim(img)) + imagenet_preprocess_input(img) +} + +# util function to convert a tensor into a valid image +deprocess_image <- function(x){ + x <- x[1,,,] + # Remove zero-center by mean pixel + x[,,1] <- x[,,1] + 103.939 + x[,,2] <- x[,,2] + 116.779 + x[,,3] <- x[,,3] + 123.68 + # clip to interval 0, 255 + x[x > 255] <- 255 + x[x < 0] <- 0 + x[] <- as.integer(x)/255 + x +} + +# Defining the model ------------------------------------------------------ + +# get tensor representations of our images +base_image <- K$variable(preprocess_image(base_image_path)) +style_reference_image <- K$variable(preprocess_image(style_reference_image_path)) + +# this will contain our generated image +combination_image <- K$placeholder(c(1L, img_nrows, img_ncols, 3L)) + +# combine the 3 images into a single Keras tensor +input_tensor <- K$concatenate(list(base_image, style_reference_image, + combination_image), axis = 0L) + +# build the VGG16 network with our 3 images as input +# the model will be loaded with pre-trained ImageNet weights +model <- application_vgg16(input_tensor = input_tensor, weights = "imagenet", + include_top = FALSE) + +print("Model loaded.") + +nms <- map_chr(model$layers, ~.x$name) +output_dict <- map(model$layers, ~.x$output) %>% set_names(nms) + +# compute the neural style loss +# first we need to define 4 util functions + +# the gram matrix of an image tensor (feature-wise outer product) + +gram_matrix <- function(x){ + + features <- x %>% + K$permute_dimensions(pattern = c(2L, 0L, 1L)) %>% + K$batch_flatten() + + K$dot(features, K$transpose(features)) +} + +# the "style loss" is designed to maintain +# the style of the reference image in the generated image. 
+# It is based on the gram matrices (which capture style) of +# feature maps from the style reference image +# and from the generated image + +style_loss <- function(style, combination){ + S <- gram_matrix(style) + C <- gram_matrix(combination) + + channels <- 3 + size <- img_nrows*img_ncols + + K$sum(K$square(S - C)) / (4 * channels^2 * size^2) +} + +# an auxiliary loss function +# designed to maintain the "content" of the +# base image in the generated image + +content_loss <- function(base, combination){ + K$sum(K$square(combination - base)) +} + +# the 3rd loss function, total variation loss, +# designed to keep the generated image locally coherent + +total_variation_loss <- function(x){ + y_ij <- x[,0:(img_nrows - 2L), 0:(img_ncols - 2L),] + y_i1j <- x[,1:(img_nrows - 1L), 0:(img_ncols - 2L),] + y_ij1 <- x[,0:(img_nrows - 2L), 1:(img_ncols - 1L),] + + a <- K$square(y_ij - y_i1j) + b <- K$square(y_ij - y_ij1) + K$sum(K$pow(a + b, 1.25)) +} + +# combine these loss functions into a single scalar +loss <- K$variable(0.0) +layer_features <- output_dict$block4_conv2 +base_image_features <- layer_features[0,,,] +combination_features <- layer_features[2,,,] + +loss <- loss + content_weight*content_loss(base_image_features, + combination_features) + +feature_layers = c('block1_conv1', 'block2_conv1', + 'block3_conv1', 'block4_conv1', + 'block5_conv1') + +for(layer_name in feature_layers){ + layer_features <- output_dict[[layer_name]] + style_reference_features <- layer_features[1,,,] + combination_features <- layer_features[2,,,] + sl <- style_loss(style_reference_features, combination_features) + loss <- loss + ((style_weight / length(feature_layers)) * sl) +} + +loss <- loss + (total_variation_weight * total_variation_loss(combination_image)) + +# get the gradients of the generated image wrt the loss +grads <- K$gradients(loss, combination_image)[[1]] + +f_outputs <- K$`function`(list(combination_image), list(loss, grads)) + +eval_loss_and_grads <- function(image){ + dim(image) <- c(1, img_nrows, img_ncols, 3) + outs <- f_outputs(list(image)) + list( + loss_value = outs[[1]], + grad_values = as.numeric(outs[[2]]) + ) +} + +# Loss and gradients evaluator. +# +# This Evaluator class makes it possible +# to compute loss and gradients in one pass +# while retrieving them via two separate functions, +# "loss" and "grads". This is done because scipy.optimize +# requires separate functions for loss and gradients, +# but computing them separately would be inefficient. 
+Evaluator <- R6Class( + "Evaluator", + public = list( + + loss_value = NULL, + grad_values = NULL, + + initialize = function() { + self$loss_value <- NULL + self$grad_values <- NULL + }, + + loss = function(x){ + loss_and_grad <- eval_loss_and_grads(x) + self$loss_value <- loss_and_grad$loss_value + self$grad_values <- loss_and_grad$grad_values + self$loss_value + }, + + grads = function(x){ + grad_values <- self$grad_values + self$loss_value <- NULL + self$grad_values <- NULL + grad_values + } + + ) +) + +evaluator <- Evaluator$new() + +# run scipy-based optimization (L-BFGS) over the pixels of the generated image +# so as to minimize the neural style loss +dms <- c(1, img_nrows, img_ncols, 3) +x <- array(data = runif(prod(dms), min = 0, max = 255) - 128, dim = dms) + +# Run optimization (L-BFGS) over the pixels of the generated image +# so as to minimize the loss +for(i in 1:iterations){ + + # Run L-BFGS + opt <- optim( + as.numeric(x), fn = evaluator$loss, gr = evaluator$grads, + method = "L-BFGS-B", + control = list(maxit = 15) + ) + + # Print loss value + print(opt$value) + + # decode the image + image <- x <- opt$par + dim(image) <- dms + + # plot + im <- deprocess_image(image) + plot(as.raster(im)) + +} diff --git a/website/articles/examples/neural_style_transfer.html b/website/articles/examples/neural_style_transfer.html new file mode 100644 index 000000000..a2ae2fbb1 --- /dev/null +++ b/website/articles/examples/neural_style_transfer.html @@ -0,0 +1,372 @@ + + + + + + + +neural_style_transfer • keras + + + + + + + +
Neural style transfer with Keras.

It is preferable to run this script on a GPU, for speed.

Example result: https://twitter.com/fchollet/status/686631033085677568

Style transfer consists of generating an image with the same "content" as a base image, but with the "style" of a different picture (typically artistic).

This is achieved through the optimization of a loss function that has 3 components: "style loss", "content loss", and "total variation loss":

  • The total variation loss imposes local spatial continuity between the pixels of the combination image, giving it visual coherence.

  • The style loss is where deep learning comes in: it is defined using a deep convolutional neural network. Precisely, it consists of a sum of L2 distances between the Gram matrices of the representations of the base image and the style reference image, extracted from different layers of a convnet (trained on ImageNet). The general idea is to capture color/texture information at different spatial scales (fairly large scales, defined by the depth of the layer considered).

  • The content loss is an L2 distance between the features of the base image (extracted from a deep layer) and the features of the combination image, keeping the generated image close enough to the original one.
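In symbols (a restatement of the style_loss() defined below, not an addition to the algorithm): with F_l the feature maps of layer l flattened to channels × positions, N = 3 channels and M = img_nrows · img_ncols,

G(F) = F F^{\top}, \qquad
\mathcal{L}_{\text{style}} = \sum_{l \in \text{feature\_layers}} \frac{w_l}{4 N^2 M^2} \bigl\lVert G(F_l^{\text{style}}) - G(F_l^{\text{combination}}) \bigr\rVert_2^2, \qquad
w_l = \frac{\text{style\_weight}}{\lvert \text{feature\_layers} \rvert}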
library(keras)
+library(purrr)
+library(R6)
+K <- backend()
+
+# Parameters --------------------------------------------------------------
+
+base_image_path <- "neural-style-base-img.png"
+style_reference_image_path <- "neural-style-style.jpg"
+iterations <- 10
+
+# these are the weights of the different loss components
+total_variation_weight <- 1
+style_weight <- 1
+content_weight <- 0.025
+
+# dimensions of the generated picture.
+img <- image_load(base_image_path)
+width <- img$size[[1]]
+height <- img$size[[2]]
+img_nrows <- 400
+img_ncols <- as.integer(width * img_nrows / height)
+
+
+# Functions ---------------------------------------------------------------
+
+# util function to open, resize and format pictures into appropriate tensors
+preprocess_image <- function(path){
+  img <- image_load(path, target_size = c(img_nrows, img_ncols)) %>%
+    image_to_array()
+  dim(img) <- c(1, dim(img))
+  imagenet_preprocess_input(img)
+}
+
+# util function to convert a tensor into a valid image
+deprocess_image <- function(x){
+  x <- x[1,,,]
+  # Remove zero-center by mean pixel
+  x[,,1] <- x[,,1] + 103.939
+  x[,,2] <- x[,,2] + 116.779
+  x[,,3] <- x[,,3] + 123.68
+  # clip to interval 0, 255
+  x[x > 255] <- 255
+  x[x < 0] <- 0
+  x[] <- as.integer(x)/255
+  x
+}
+
+# Defining the model ------------------------------------------------------
+
+# get tensor representations of our images
+base_image <- K$variable(preprocess_image(base_image_path))
+style_reference_image <- K$variable(preprocess_image(style_reference_image_path))
+
+# this will contain our generated image
+combination_image <- K$placeholder(c(1L, img_nrows, img_ncols, 3L))
+
+# combine the 3 images into a single Keras tensor
+input_tensor <- K$concatenate(list(base_image, style_reference_image, 
+                                   combination_image), axis = 0L)
+
+# build the VGG16 network with our 3 images as input
+# the model will be loaded with pre-trained ImageNet weights
+model <- application_vgg16(input_tensor = input_tensor, weights = "imagenet", 
+                           include_top = FALSE)
+
+print("Model loaded.")
+
+nms <- map_chr(model$layers, ~.x$name)
+output_dict <- map(model$layers, ~.x$output) %>% set_names(nms)
+
+# compute the neural style loss
+# first we need to define 4 util functions
+
+# the gram matrix of an image tensor (feature-wise outer product)
+
+gram_matrix <- function(x){
+  
+  features <- x %>%
+    K$permute_dimensions(pattern = c(2L, 0L, 1L)) %>%
+    K$batch_flatten()
+  
+  K$dot(features, K$transpose(features))
+}
+
+# the "style loss" is designed to maintain
+# the style of the reference image in the generated image.
+# It is based on the gram matrices (which capture style) of
+# feature maps from the style reference image
+# and from the generated image
+
+style_loss <- function(style, combination){
+  S <- gram_matrix(style)
+  C <- gram_matrix(combination)
+  
+  channels <- 3
+  size <- img_nrows*img_ncols
+  
+  K$sum(K$square(S - C)) / (4 * channels^2  * size^2)
+}
+
+# an auxiliary loss function
+# designed to maintain the "content" of the
+# base image in the generated image
+
+content_loss <- function(base, combination){
+  K$sum(K$square(combination - base))
+}
+
+# the 3rd loss function, total variation loss,
+# designed to keep the generated image locally coherent
+
+total_variation_loss <- function(x){
+  y_ij  <- x[,0:(img_nrows - 2L), 0:(img_ncols - 2L),]
+  y_i1j <- x[,1:(img_nrows - 1L), 0:(img_ncols - 2L),]
+  y_ij1 <- x[,0:(img_nrows - 2L), 1:(img_ncols - 1L),]
+  
+  a <- K$square(y_ij - y_i1j)
+  b <- K$square(y_ij - y_ij1)
+  K$sum(K$pow(a + b, 1.25))
+}
+
+# combine these loss functions into a single scalar
+loss <- K$variable(0.0)
+layer_features <- output_dict$block4_conv2
+base_image_features <- layer_features[0,,,]
+combination_features <- layer_features[2,,,]
+
+loss <- loss + content_weight*content_loss(base_image_features, 
+                                           combination_features)
+
+feature_layers = c('block1_conv1', 'block2_conv1',
+                  'block3_conv1', 'block4_conv1',
+                  'block5_conv1')
+
+for(layer_name in feature_layers){
+  layer_features <- output_dict[[layer_name]]
+  style_reference_features <- layer_features[1,,,]
+  combination_features <- layer_features[2,,,]
+  sl <- style_loss(style_reference_features, combination_features)
+  loss <- loss + ((style_weight / length(feature_layers)) * sl)
+}
+
+loss <- loss + (total_variation_weight * total_variation_loss(combination_image))
+
+# get the gradients of the generated image wrt the loss
+grads <- K$gradients(loss, combination_image)[[1]]
+
+f_outputs <-  K$`function`(list(combination_image), list(loss, grads))
+
+eval_loss_and_grads <- function(image){
+  dim(image) <- c(1, img_nrows, img_ncols, 3)
+  outs <- f_outputs(list(image))
+  list(
+    loss_value = outs[[1]],
+    grad_values = as.numeric(outs[[2]])
+  )
+}
+
+# Loss and gradients evaluator.
+# 
+# This Evaluator class makes it possible
+# to compute loss and gradients in one pass
+# while retrieving them via two separate functions,
+# "loss" and "grads". This is done because scipy.optimize
+# requires separate functions for loss and gradients,
+# but computing them separately would be inefficient.
+Evaluator <- R6Class(
+  "Evaluator",
+  public = list(
+    
+    loss_value = NULL,
+    grad_values = NULL,
+    
+    initialize = function() {
+      self$loss_value <- NULL
+      self$grad_values <- NULL
+    },
+    
+    loss = function(x){
+      loss_and_grad <- eval_loss_and_grads(x)
+      self$loss_value <- loss_and_grad$loss_value
+      self$grad_values <- loss_and_grad$grad_values
+      self$loss_value
+    },
+    
+    grads = function(x){
+      grad_values <- self$grad_values
+      self$loss_value <- NULL
+      self$grad_values <- NULL
+      grad_values
+    }
+    
+  )
+)
+
+evaluator <- Evaluator$new()
+
+# run scipy-based optimization (L-BFGS) over the pixels of the generated image
+# so as to minimize the neural style loss
+dms <- c(1, img_nrows, img_ncols, 3)
+x <- array(data = runif(prod(dms), min = 0, max = 255) - 128, dim = dms)
+
+# Run optimization (L-BFGS) over the pixels of the generated image
+# so as to minimize the loss
+for(i in 1:iterations){
+
+  # Run L-BFGS
+  opt <- optim(
+    as.numeric(x), fn = evaluator$loss, gr = evaluator$grads, 
+    method = "L-BFGS-B",
+    control = list(maxit = 15)
+  )
+  
+  # Print loss value
+  print(opt$value)
+  
+  # decode the image
+  image <- x <- opt$par
+  dim(image) <- dms
+  
+  # plot
+  im <- deprocess_image(image)
+  plot(as.raster(im))
+  
+}
+
+
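The loop above only plots each intermediate result. To also write the frames to disk, one small addition inside the loop might be the following (this assumes the png package, which the script itself does not use):

library(png)

# after deprocess_image(): `im` is an img_nrows x img_ncols x 3 array in [0, 1]
writePNG(im, sprintf("style_transfer_at_iteration_%d.png", i))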
+ + + diff --git a/website/articles/examples/nueral_doodle.R b/website/articles/examples/nueral_doodle.R new file mode 100644 index 000000000..0ab13681d --- /dev/null +++ b/website/articles/examples/nueral_doodle.R @@ -0,0 +1 @@ +library(keras) diff --git a/website/articles/examples/nueral_doodle.html b/website/articles/examples/nueral_doodle.html new file mode 100644 index 000000000..f054c2476 --- /dev/null +++ b/website/articles/examples/nueral_doodle.html @@ -0,0 +1,137 @@ + + + + + + + +nueral_doodle • keras + + + + + + + +
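nueral_doodle.R above likewise contains only library(keras). Since the doodle script shares its image preprocessing with neural_style_transfer.R, one plausible first step for the port is the same helper (a sketch; the signature is an assumption):

library(keras)

# open, resize and format a picture into the tensor layout VGG expects,
# as in preprocess_image() from neural_style_transfer.R above
preprocess_image <- function(path, nrows, ncols) {
  img <- image_load(path, target_size = c(nrows, ncols)) %>%
    image_to_array()
  dim(img) <- c(1, dim(img))
  imagenet_preprocess_input(img)
}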
+ + + diff --git a/website/articles/examples/pretrained_word_embeddings.R b/website/articles/examples/pretrained_word_embeddings.R new file mode 100644 index 000000000..6217dfc15 --- /dev/null +++ b/website/articles/examples/pretrained_word_embeddings.R @@ -0,0 +1,185 @@ +#' This script loads pre-trained word embeddings (GloVe embeddings) into a +#' frozen Keras Embedding layer, and uses it to train a text classification +#' model on the 20 Newsgroup dataset (classication of newsgroup messages into 20 +#' different categories). +#' +#' GloVe embedding data can be found at: +#' http://nlp.stanford.edu/data/glove.6B.zip (source page: +#' http://nlp.stanford.edu/projects/glove/) +#' +#' 20 Newsgroup data can be found at: +#' http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-20/www/data/news20.html +#' + +#' +#' IMPORTANT NOTE: This example does yet work correctly. The code executes fine and +#' appears to mimic the Python code upon which it is based however it achieves only +#' half the training accuracy that the Python code does so there is clearly a +#' subtle difference. +#' +#' We need to investigate this further before formally adding to the list of examples +#' +#' + +library(keras) + +GLOVE_DIR <- 'glove.6B' +TEXT_DATA_DIR <- '20_newsgroup' +MAX_SEQUENCE_LENGTH <- 1000 +MAX_NB_WORDS <- 20000 +EMBEDDING_DIM <- 100 +VALIDATION_SPLIT <- 0.2 + +# download data if necessary +download_data <- function(data_dir, url_path, data_file) { + if (!dir.exists(data_dir)) { + download.file(paste0(url_path, data_file), data_file, mode = "wb") + if (tools::file_ext(data_file) == "zip") + unzip(data_file, exdir = tools::file_path_sans_ext(data_file)) + else + untar(data_file) + unlink(data_file) + } +} +download_data(GLOVE_DIR, 'http://nlp.stanford.edu/data/', 'glove.6B.zip') +download_data(TEXT_DATA_DIR, "http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-20/www/data/", "news20.tar.gz") + +# first, build index mapping words in the embeddings set +# to their embedding vector + +cat('Indexing word vectors.\n') + +embeddings_index <- new.env(parent = emptyenv()) +lines <- readLines(file.path(GLOVE_DIR, 'glove.6B.100d.txt')) +for (line in lines) { + values <- strsplit(line, ' ', fixed = TRUE)[[1]] + word <- values[[1]] + coefs <- as.numeric(values[-1]) + embeddings_index[[word]] <- coefs +} + +cat(sprintf('Found %s word vectors.\n', length(embeddings_index))) + +# second, prepare text samples and their labels +cat('Processing text dataset\n') + +texts <- character() # text samples +labels <- integer() # label ids +labels_index <- list() # dictionary: label name to numeric id + +for (name in list.files(TEXT_DATA_DIR)) { + path <- file.path(TEXT_DATA_DIR, name) + if (file_test("-d", path)) { + label_id <- length(labels_index) + labels_index[[name]] <- label_id + for (fname in list.files(path)) { + if (grepl("^[0-9]+$", fname)) { + fpath <- file.path(path, fname) + t <- readLines(fpath, encoding = "latin1") + t <- paste(t, collapse = "\n") + i <- regexpr(pattern = '\n\n', t, fixed = TRUE)[[1]] + if (i != -1L) + t <- substring(t, i) + texts <- c(texts, t) + labels <- c(labels, label_id) + } + } + } +} + +cat(sprintf('Found %s texts.\n', length(texts))) + +# finally, vectorize the text samples into a 2D integer tensor +tokenizer <- text_tokenizer(num_words=MAX_NB_WORDS) +tokenizer %>% fit_text_tokenizer(texts) + +sequences <- texts_to_sequences(tokenizer, texts) + +word_index <- tokenizer$word_index +cat(sprintf('Found %s unique tokens.\n', length(word_index))) + +data <- pad_sequences(sequences, 
maxlen=MAX_SEQUENCE_LENGTH) +labels <- to_categorical(labels) + +cat('Shape of data tensor: ', dim(data), '\n') +cat('Shape of label tensor: ', dim(labels), '\n') + +# split the data into a training set and a validation set +indices <- 1:nrow(data) +indices <- sample(indices) +data <- data[indices,] +labels <- labels[indices,] +num_validation_samples <- as.integer(VALIDATION_SPLIT * nrow(data)) + +x_train <- data[-(1:num_validation_samples),] +y_train <- labels[-(1:num_validation_samples),] +x_val <- data[1:num_validation_samples,] +y_val <- labels[1:num_validation_samples,] + +cat('Preparing embedding matrix.\n') + +# prepare embedding matrix +num_words <- min(MAX_NB_WORDS, length(word_index)) +prepare_embedding_matrix <- function() { + embedding_matrix <- matrix(0L, nrow = num_words, ncol = EMBEDDING_DIM) + for (word in names(word_index)) { + index <- word_index[[word]] + if (index >= MAX_NB_WORDS) + next + embedding_vector <- embeddings_index[[word]] + if (!is.null(embedding_vector)) { + # words not found in embedding index will be all-zeros. + embedding_matrix[index,] <- embedding_vector + } + } + embedding_matrix +} + +embedding_matrix <- prepare_embedding_matrix() + +# load pre-trained word embeddings into an Embedding layer +# note that we set trainable = False so as to keep the embeddings fixed +embedding_layer <- layer_embedding( + input_dim = num_words, + output_dim = EMBEDDING_DIM, + weights = list(embedding_matrix), + input_length = MAX_SEQUENCE_LENGTH, + trainable = FALSE +) + +cat('Training model\n') + +# train a 1D convnet with global maxpooling +sequence_input <- layer_input(shape = list(MAX_SEQUENCE_LENGTH), dtype='int32') + +preds <- sequence_input %>% + embedding_layer %>% + layer_conv_1d(filters = 128, kernel_size = 5, activation = 'relu') %>% + layer_max_pooling_1d(pool_size = 5) %>% + layer_conv_1d(filters = 128, kernel_size = 5, activation = 'relu') %>% + layer_max_pooling_1d(pool_size = 5) %>% + layer_conv_1d(filters = 128, kernel_size = 5, activation = 'relu') %>% + layer_max_pooling_1d(pool_size = 35) %>% + layer_flatten() %>% + layer_dense(units = 128, activation = 'relu') %>% + layer_dense(units = length(labels_index), activation = 'softmax') + + +model <- keras_model(sequence_input, preds) + +model %>% compile( + loss = 'categorical_crossentropy', + optimizer = 'rmsprop', + metrics = c('acc') +) + +model %>% fit( + x_train, y_train, + batch_size = 128, + epochs = 10, + validation_data = list(x_val, y_val) +) + + + + diff --git a/website/articles/examples/pretrained_word_embeddings.html b/website/articles/examples/pretrained_word_embeddings.html new file mode 100644 index 000000000..614a6f967 --- /dev/null +++ b/website/articles/examples/pretrained_word_embeddings.html @@ -0,0 +1,299 @@ + + + + + + + +pretrained_word_embeddings • keras + + + + + + + +
+
+ + + +
+
+ + + + +
+ +

This script loads pre-trained word embeddings (GloVe embeddings) into a frozen Keras Embedding layer, and uses it to train a text classification model on the 20 Newsgroup dataset (classification of newsgroup messages into 20 different categories).

+

GloVe embedding data can be found at: http://nlp.stanford.edu/data/glove.6B.zip (source page: http://nlp.stanford.edu/projects/glove/)

+

20 Newsgroup data can be found at: http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-20/www/data/news20.html

+

IMPORTANT NOTE: This example does not yet work correctly. The code executes fine and appears to mimic the Python code upon which it is based; however, it achieves only half the training accuracy of the Python code, so there is clearly a subtle difference.

+

We need to investigate this further before formally adding it to the list of examples.

+
library(keras)
+
+GLOVE_DIR <- 'glove.6B'
+TEXT_DATA_DIR <- '20_newsgroup'
+MAX_SEQUENCE_LENGTH <- 1000
+MAX_NB_WORDS <- 20000
+EMBEDDING_DIM <- 100
+VALIDATION_SPLIT <- 0.2
+
+# download data if necessary
+download_data <- function(data_dir, url_path, data_file) {
+  if (!dir.exists(data_dir)) {
+    download.file(paste0(url_path, data_file), data_file, mode = "wb")
+    if (tools::file_ext(data_file) == "zip")
+      unzip(data_file, exdir = tools::file_path_sans_ext(data_file))
+    else
+      untar(data_file)
+    unlink(data_file)
+  }
+}
+download_data(GLOVE_DIR, 'http://nlp.stanford.edu/data/', 'glove.6B.zip')
+download_data(TEXT_DATA_DIR, "http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-20/www/data/", "news20.tar.gz")
+
+# first, build index mapping words in the embeddings set
+# to their embedding vector
+
+cat('Indexing word vectors.\n')
+
+embeddings_index <- new.env(parent = emptyenv())
+lines <- readLines(file.path(GLOVE_DIR, 'glove.6B.100d.txt'))
+for (line in lines) {
+  values <- strsplit(line, ' ', fixed = TRUE)[[1]]
+  word <- values[[1]]
+  coefs <- as.numeric(values[-1])
+  embeddings_index[[word]] <- coefs
+}
+
+cat(sprintf('Found %s word vectors.\n', length(embeddings_index)))
+
+# second, prepare text samples and their labels
+cat('Processing text dataset\n')
+
+texts <- character()  # text samples
+labels <- integer() # label ids
+labels_index <- list()  # dictionary: label name to numeric id
+
+for (name in list.files(TEXT_DATA_DIR)) {
+  path <- file.path(TEXT_DATA_DIR, name)
+  if (file_test("-d", path)) {
+    label_id <- length(labels_index)
+    labels_index[[name]] <- label_id
+    for (fname in list.files(path)) {
+      if (grepl("^[0-9]+$", fname)) {
+        fpath <- file.path(path, fname)
+        t <- readLines(fpath, encoding = "latin1")
+        t <- paste(t, collapse = "\n")
+        i <- regexpr(pattern = '\n\n', t, fixed = TRUE)[[1]]
+        if (i != -1L)
+          t <- substring(t, i)
+        texts <- c(texts, t)
+        labels <- c(labels, label_id)
+      }
+    }
+  }
+}
+
+cat(sprintf('Found %s texts.\n', length(texts)))
+
+# finally, vectorize the text samples into a 2D integer tensor
+tokenizer <- text_tokenizer(num_words=MAX_NB_WORDS)
+tokenizer %>% fit_text_tokenizer(texts)
+
+sequences <- texts_to_sequences(tokenizer, texts)
+
+word_index <- tokenizer$word_index
+cat(sprintf('Found %s unique tokens.\n', length(word_index)))
+
+data <- pad_sequences(sequences, maxlen=MAX_SEQUENCE_LENGTH)
+labels <- to_categorical(labels)
+
+cat('Shape of data tensor: ', dim(data), '\n')
+cat('Shape of label tensor: ', dim(labels), '\n')
+
+# split the data into a training set and a validation set
+indices <- 1:nrow(data)
+indices <- sample(indices)
+data <- data[indices,]
+labels <- labels[indices,]
+num_validation_samples <- as.integer(VALIDATION_SPLIT * nrow(data))
+
+x_train <- data[-(1:num_validation_samples),]
+y_train <- labels[-(1:num_validation_samples),]
+x_val <- data[1:num_validation_samples,]
+y_val <- labels[1:num_validation_samples,]
+
+cat('Preparing embedding matrix.\n')
+
+# prepare embedding matrix
+num_words <- min(MAX_NB_WORDS, length(word_index))
+prepare_embedding_matrix <- function() {
+  embedding_matrix <- matrix(0L, nrow = num_words, ncol = EMBEDDING_DIM)
+  for (word in names(word_index)) {
+    index <- word_index[[word]]
+    if (index >= MAX_NB_WORDS)
+      next
+    embedding_vector <- embeddings_index[[word]]
+    if (!is.null(embedding_vector)) {
+      # words not found in embedding index will be all-zeros.
+      embedding_matrix[index,] <- embedding_vector
+    }
+  }
+  embedding_matrix
+}
+
+embedding_matrix <- prepare_embedding_matrix()
+
+# load pre-trained word embeddings into an Embedding layer
+# note that we set trainable = False so as to keep the embeddings fixed
+embedding_layer <- layer_embedding(
+  input_dim = num_words,
+  output_dim = EMBEDDING_DIM,
+  weights = list(embedding_matrix),
+  input_length = MAX_SEQUENCE_LENGTH,
+  trainable = FALSE
+)
+                           
+cat('Training model\n')
+
+# train a 1D convnet with global maxpooling
+sequence_input <- layer_input(shape = list(MAX_SEQUENCE_LENGTH), dtype='int32')
+
+preds <- sequence_input %>%
+  embedding_layer %>% 
+  layer_conv_1d(filters = 128, kernel_size = 5, activation = 'relu') %>% 
+  layer_max_pooling_1d(pool_size = 5) %>% 
+  layer_conv_1d(filters = 128, kernel_size = 5, activation = 'relu') %>% 
+  layer_max_pooling_1d(pool_size = 5) %>% 
+  layer_conv_1d(filters = 128, kernel_size = 5, activation = 'relu') %>% 
+  layer_max_pooling_1d(pool_size = 35) %>% 
+  layer_flatten() %>% 
+  layer_dense(units = 128, activation = 'relu') %>% 
+  layer_dense(units = length(labels_index), activation = 'softmax')
+
+
+model <- keras_model(sequence_input, preds)
+
+model %>% compile(
+  loss = 'categorical_crossentropy',
+  optimizer = 'rmsprop',
+  metrics = c('acc')  
+)
+
+model %>% fit(
+  x_train, y_train,
+  batch_size = 128,
+  epochs = 10,
+  validation_data = list(x_val, y_val)
+)
+
+
+ + + +
+ + + +
+ + + diff --git a/website/articles/examples/pretrained_word_embeddings.py b/website/articles/examples/pretrained_word_embeddings.py new file mode 100755 index 000000000..30a25a7fa --- /dev/null +++ b/website/articles/examples/pretrained_word_embeddings.py @@ -0,0 +1,150 @@
+'''This script loads pre-trained word embeddings (GloVe embeddings)
+into a frozen Keras Embedding layer, and uses it to
+train a text classification model on the 20 Newsgroup dataset
+(classification of newsgroup messages into 20 different categories).
+
+GloVe embedding data can be found at:
+http://nlp.stanford.edu/data/glove.6B.zip
+(source page: http://nlp.stanford.edu/projects/glove/)
+
+20 Newsgroup data can be found at:
+http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-20/www/data/news20.html
+'''
+
+from __future__ import print_function
+
+import os
+import sys
+import numpy as np
+import tensorflow.contrib.keras.api.keras as keras
+from tensorflow.contrib.keras.api.keras.preprocessing.text import Tokenizer
+from tensorflow.contrib.keras.api.keras.preprocessing.sequence import pad_sequences
+from tensorflow.contrib.keras.api.keras.utils import to_categorical
+from tensorflow.contrib.keras.api.keras.layers import Dense, Input, Flatten
+from tensorflow.contrib.keras.api.keras.layers import Conv1D, MaxPooling1D, Embedding
+from tensorflow.contrib.keras.api.keras.models import Model
+
+
+BASE_DIR = '.'
+GLOVE_DIR = BASE_DIR + '/glove.6B/'
+TEXT_DATA_DIR = BASE_DIR + '/20_newsgroup/'
+MAX_SEQUENCE_LENGTH = 1000
+MAX_NB_WORDS = 20000
+EMBEDDING_DIM = 100
+VALIDATION_SPLIT = 0.2
+
+# first, build index mapping words in the embeddings set
+# to their embedding vector
+
+print('Indexing word vectors.')
+
+embeddings_index = {}
+f = open(os.path.join(GLOVE_DIR, 'glove.6B.100d.txt'))
+for line in f:
+    values = line.split()
+    word = values[0]
+    coefs = np.asarray(values[1:], dtype='float32')
+    embeddings_index[word] = coefs
+f.close()
+
+print('Found %s word vectors.' % len(embeddings_index))
+
+# second, prepare text samples and their labels
+print('Processing text dataset')
+
+texts = []  # list of text samples
+labels_index = {}  # dictionary mapping label name to numeric id
+labels = []  # list of label ids
+for name in sorted(os.listdir(TEXT_DATA_DIR)):
+    path = os.path.join(TEXT_DATA_DIR, name)
+    if os.path.isdir(path):
+        label_id = len(labels_index)
+        labels_index[name] = label_id
+        for fname in sorted(os.listdir(path)):
+            if fname.isdigit():
+                fpath = os.path.join(path, fname)
+                if sys.version_info < (3,):
+                    f = open(fpath)
+                else:
+                    f = open(fpath, encoding='latin-1')
+                t = f.read()
+                i = t.find('\n\n')  # skip header
+                if 0 < i:
+                    t = t[i:]
+                texts.append(t)
+                f.close()
+                labels.append(label_id)
+
+print('Found %s texts.' % len(texts))
+
+# finally, vectorize the text samples into a 2D integer tensor
+tokenizer = Tokenizer(num_words=MAX_NB_WORDS)
+tokenizer.fit_on_texts(texts)
+sequences = tokenizer.texts_to_sequences(texts)
+
+word_index = tokenizer.word_index
+print('Found %s unique tokens.' 
% len(word_index)) + +data = pad_sequences(sequences, maxlen=MAX_SEQUENCE_LENGTH) + +labels = to_categorical(np.asarray(labels)) +print('Shape of data tensor:', data.shape) +print('Shape of label tensor:', labels.shape) + +# split the data into a training set and a validation set +indices = np.arange(data.shape[0]) +np.random.shuffle(indices) +data = data[indices] +labels = labels[indices] +num_validation_samples = int(VALIDATION_SPLIT * data.shape[0]) + +x_train = data[:-num_validation_samples] +y_train = labels[:-num_validation_samples] +x_val = data[-num_validation_samples:] +y_val = labels[-num_validation_samples:] + +print('Preparing embedding matrix.') + +# prepare embedding matrix +num_words = min(MAX_NB_WORDS, len(word_index)) +embedding_matrix = np.zeros((num_words, EMBEDDING_DIM)) +for word, i in word_index.items(): + if i >= MAX_NB_WORDS: + continue + embedding_vector = embeddings_index.get(word) + if embedding_vector is not None: + # words not found in embedding index will be all-zeros. + embedding_matrix[i] = embedding_vector + +# load pre-trained word embeddings into an Embedding layer +# note that we set trainable = False so as to keep the embeddings fixed +embedding_layer = Embedding(num_words, + EMBEDDING_DIM, + weights=[embedding_matrix], + input_length=MAX_SEQUENCE_LENGTH, + trainable=False) + +print('Training model.') + +# train a 1D convnet with global maxpooling +sequence_input = Input(shape=(MAX_SEQUENCE_LENGTH,), dtype='int32') +embedded_sequences = embedding_layer(sequence_input) +x = Conv1D(128, 5, activation='relu')(embedded_sequences) +x = MaxPooling1D(5)(x) +x = Conv1D(128, 5, activation='relu')(x) +x = MaxPooling1D(5)(x) +x = Conv1D(128, 5, activation='relu')(x) +x = MaxPooling1D(35)(x) +x = Flatten()(x) +x = Dense(128, activation='relu')(x) +preds = Dense(len(labels_index), activation='softmax')(x) + +model = Model(sequence_input, preds) +model.compile(loss='categorical_crossentropy', + optimizer='rmsprop', + metrics=['acc']) + +model.fit(x_train, y_train, + batch_size=128, + epochs=10, + validation_data=(x_val, y_val)) diff --git a/website/articles/examples/reuters_mlp.R b/website/articles/examples/reuters_mlp.R new file mode 100644 index 000000000..cc1600e1d --- /dev/null +++ b/website/articles/examples/reuters_mlp.R @@ -0,0 +1,69 @@ +#' Train and evaluate a simple MLP on the Reuters newswire topic classification task. 
+ +library(keras) + +max_words <- 1000 +batch_size <- 32 +epochs <- 5 + +cat('Loading data...\n') +reuters <- dataset_reuters(num_words = max_words, test_split = 0.2) +x_train <- reuters$train$x +y_train <- reuters$train$y +x_test <- reuters$test$x +y_test <- reuters$test$y + +cat(length(x_train), 'train sequences\n') +cat(length(x_test), 'test sequences\n') + +num_classes <- max(y_train) + 1 +cat(num_classes, '\n') + +cat('Vectorizing sequence data...\n') + +tokenizer <- text_tokenizer(num_words = max_words) +x_train <- sequences_to_matrix(tokenizer, x_train, mode = 'binary') +x_test <- sequences_to_matrix(tokenizer, x_test, mode = 'binary') + +cat('x_train shape:', dim(x_train), '\n') +cat('x_test shape:', dim(x_test), '\n') + +cat('Convert class vector to binary class matrix', + '(for use with categorical_crossentropy)\n') +y_train <- to_categorical(y_train, num_classes) +y_test <- to_categorical(y_test, num_classes) +cat('y_train shape:', dim(y_train), '\n') +cat('y_test shape:', dim(y_test), '\n') + +cat('Building model...\n') +model <- keras_model_sequential() +model %>% + layer_dense(units = 512, input_shape = c(max_words)) %>% + layer_activation(activation = 'relu') %>% + layer_dropout(rate = 0.5) %>% + layer_dense(units = num_classes) %>% + layer_activation(activation = 'softmax') + +model %>% compile( + loss = 'categorical_crossentropy', + optimizer = 'adam', + metrics = c('accuracy') +) + +history <- model %>% fit( + x_train, y_train, + batch_size = batch_size, + epochs = epochs, + verbose = 1, + validation_split = 0.1 +) + +score <- model %>% evaluate( + x_test, y_test, + batch_size = batch_size, + verbose = 1 +) + +cat('Test score:', score[[1]], '\n') +cat('Test accuracy', score[[2]], '\n') + diff --git a/website/articles/examples/reuters_mlp.html b/website/articles/examples/reuters_mlp.html new file mode 100644 index 000000000..2b1f106fb --- /dev/null +++ b/website/articles/examples/reuters_mlp.html @@ -0,0 +1,203 @@ + + + + + + + +reuters_mlp • keras + + + + + + + +
+
+ + + +
+
+ + + + +
+ +

Train and evaluate a simple MLP on the Reuters newswire topic classification task.

+
library(keras)
+
+max_words <- 1000
+batch_size <- 32
+epochs <- 5
+
+cat('Loading data...\n')
+reuters <- dataset_reuters(num_words = max_words, test_split = 0.2)
+x_train <- reuters$train$x
+y_train <- reuters$train$y
+x_test <- reuters$test$x
+y_test <- reuters$test$y
+
+cat(length(x_train), 'train sequences\n')
+cat(length(x_test), 'test sequences\n')
+
+num_classes <- max(y_train) + 1
+cat(num_classes, '\n')
+
+cat('Vectorizing sequence data...\n')
+
+tokenizer <- text_tokenizer(num_words = max_words)
+x_train <- sequences_to_matrix(tokenizer, x_train, mode = 'binary')
+x_test <- sequences_to_matrix(tokenizer, x_test, mode = 'binary')
+
+cat('x_train shape:', dim(x_train), '\n')
+cat('x_test shape:', dim(x_test), '\n')
+
+cat('Convert class vector to binary class matrix',
+    '(for use with categorical_crossentropy)\n')
+y_train <- to_categorical(y_train, num_classes)
+y_test <- to_categorical(y_test, num_classes)
+cat('y_train shape:', dim(y_train), '\n')
+cat('y_test shape:', dim(y_test), '\n')
+
+cat('Building model...\n')
+model <- keras_model_sequential()
+model %>%
+  layer_dense(units = 512, input_shape = c(max_words)) %>% 
+  layer_activation(activation = 'relu') %>% 
+  layer_dropout(rate = 0.5) %>% 
+  layer_dense(units = num_classes) %>% 
+  layer_activation(activation = 'softmax')
+
+model %>% compile(
+  loss = 'categorical_crossentropy',
+  optimizer = 'adam',
+  metrics = c('accuracy')
+)
+
+history <- model %>% fit(
+  x_train, y_train,
+  batch_size = batch_size,
+  epochs = epochs,
+  verbose = 1,
+  validation_split = 0.1
+)
+
+score <- model %>% evaluate(
+  x_test, y_test,
+  batch_size = batch_size,
+  verbose = 1
+)
+
+cat('Test score:', score[[1]], '\n')
+cat('Test accuracy', score[[2]], '\n')
+
+
+ + + +
+ + + +
+ + + diff --git a/website/articles/examples/reuters_mlp_relu_vs_selu.R b/website/articles/examples/reuters_mlp_relu_vs_selu.R new file mode 100644 index 000000000..0ab13681d --- /dev/null +++ b/website/articles/examples/reuters_mlp_relu_vs_selu.R @@ -0,0 +1 @@ +library(keras) diff --git a/website/articles/examples/reuters_mlp_relu_vs_selu.html b/website/articles/examples/reuters_mlp_relu_vs_selu.html new file mode 100644 index 000000000..0538eca0d --- /dev/null +++ b/website/articles/examples/reuters_mlp_relu_vs_selu.html @@ -0,0 +1,137 @@ + + + + + + + +reuters_mlp_relu_vs_selu • keras + + + + + + + +
+
+ + + +
+
+ + + + + +
+ + + +
+ + + +
+ + + diff --git a/website/articles/examples/reuters_mlp_relu_vs_selu.py b/website/articles/examples/reuters_mlp_relu_vs_selu.py new file mode 100644 index 000000000..8c1fe9662 --- /dev/null +++ b/website/articles/examples/reuters_mlp_relu_vs_selu.py @@ -0,0 +1,174 @@ +'''Compares self-normalizing MLPs with regular MLPs. + +Compares the performance of a simple MLP using two +different activation functions: RELU and SELU +on the Reuters newswire topic classification task. + +# Reference: + Klambauer, G., Unterthiner, T., Mayr, A., & Hochreiter, S. (2017). + Self-Normalizing Neural Networks. arXiv preprint arXiv:1706.02515. + https://arxiv.org/abs/1706.02515 +''' +from __future__ import print_function + +import numpy as np +import matplotlib.pyplot as plt +import keras +from keras.datasets import reuters +from keras.models import Sequential +from keras.layers import Dense, Activation, Dropout +from keras.layers.noise import AlphaDropout +from keras.preprocessing.text import Tokenizer + +max_words = 1000 +batch_size = 16 +epochs = 40 +plot = True + + +def create_network(n_dense=6, + dense_units=16, + activation='selu', + dropout=AlphaDropout, + dropout_rate=0.1, + kernel_initializer='lecun_normal', + optimizer='adam', + num_classes=1, + max_words=max_words): + """Generic function to create a fully-connected neural network. + + # Arguments + n_dense: int > 0. Number of dense layers. + dense_units: int > 0. Number of dense units per layer. + dropout: keras.layers.Layer. A dropout layer to apply. + dropout_rate: 0 <= float <= 1. The rate of dropout. + kernel_initializer: str. The initializer for the weights. + optimizer: str/keras.optimizers.Optimizer. The optimizer to use. + num_classes: int > 0. The number of classes to predict. + max_words: int > 0. The maximum number of words per data point. + + # Returns + A Keras model instance (compiled). 
+ """ + model = Sequential() + model.add(Dense(dense_units, input_shape=(max_words,), + kernel_initializer=kernel_initializer)) + model.add(Activation(activation)) + model.add(dropout(dropout_rate)) + + for i in range(n_dense - 1): + model.add(Dense(dense_units, kernel_initializer=kernel_initializer)) + model.add(Activation(activation)) + model.add(dropout(dropout_rate)) + + model.add(Dense(num_classes)) + model.add(Activation('softmax')) + model.compile(loss='categorical_crossentropy', + optimizer=optimizer, + metrics=['accuracy']) + return model + + +network1 = { + 'n_dense': 6, + 'dense_units': 16, + 'activation': 'relu', + 'dropout': Dropout, + 'dropout_rate': 0.5, + 'kernel_initializer': 'glorot_uniform', + 'optimizer': 'sgd' +} + +network2 = { + 'n_dense': 6, + 'dense_units': 16, + 'activation': 'selu', + 'dropout': AlphaDropout, + 'dropout_rate': 0.1, + 'kernel_initializer': 'lecun_normal', + 'optimizer': 'sgd' +} + +print('Loading data...') +(x_train, y_train), (x_test, y_test) = reuters.load_data(num_words=max_words, + test_split=0.2) +print(len(x_train), 'train sequences') +print(len(x_test), 'test sequences') + +num_classes = np.max(y_train) + 1 +print(num_classes, 'classes') + +print('Vectorizing sequence data...') +tokenizer = Tokenizer(num_words=max_words) +x_train = tokenizer.sequences_to_matrix(x_train, mode='binary') +x_test = tokenizer.sequences_to_matrix(x_test, mode='binary') +print('x_train shape:', x_train.shape) +print('x_test shape:', x_test.shape) + +print('Convert class vector to binary class matrix ' + '(for use with categorical_crossentropy)') +y_train = keras.utils.to_categorical(y_train, num_classes) +y_test = keras.utils.to_categorical(y_test, num_classes) +print('y_train shape:', y_train.shape) +print('y_test shape:', y_test.shape) + +print('\nBuilding network 1...') + +model1 = create_network(num_classes=num_classes, **network1) +history_model1 = model1.fit(x_train, + y_train, + batch_size=batch_size, + epochs=epochs, + verbose=1, + validation_split=0.1) + +score_model1 = model1.evaluate(x_test, + y_test, + batch_size=batch_size, + verbose=1) + + +print('\nBuilding network 2...') +model2 = create_network(num_classes=num_classes, **network2) + +history_model2 = model2.fit(x_train, + y_train, + batch_size=batch_size, + epochs=epochs, + verbose=1, + validation_split=0.1) + +score_model2 = model2.evaluate(x_test, + y_test, + batch_size=batch_size, + verbose=1) + +print('\nNetwork 1 results') +print('Hyperparameters:', network1) +print('Test score:', score_model1[0]) +print('Test accuracy:', score_model1[1]) +print('Network 2 results') +print('Hyperparameters:', network2) +print('Test score:', score_model2[0]) +print('Test accuracy:', score_model2[1]) + +plt.plot(range(epochs), + history_model1.history['val_loss'], + 'g-', + label='Network 1 Val Loss') +plt.plot(range(epochs), + history_model2.history['val_loss'], + 'r-', + label='Network 2 Val Loss') +plt.plot(range(epochs), + history_model1.history['loss'], + 'g--', + label='Network 1 Loss') +plt.plot(range(epochs), + history_model2.history['loss'], + 'r--', + label='Network 2 Loss') +plt.xlabel('Epochs') +plt.ylabel('Loss') +plt.legend() +plt.savefig('comparison_of_networks.png') diff --git a/website/articles/examples/stateful_lstm.R b/website/articles/examples/stateful_lstm.R new file mode 100644 index 000000000..4f9b9d453 --- /dev/null +++ b/website/articles/examples/stateful_lstm.R @@ -0,0 +1,76 @@ +#' Example script showing how to use stateful RNNs to model long sequences +#' efficiently. 
+#' + +library(keras) + +# since we are using stateful rnn tsteps can be set to 1 +tsteps <- 1 +batch_size <- 25 +epochs <- 25 +# number of elements ahead that are used to make the prediction +lahead <- 1 + +# Generates an absolute cosine time series with the amplitude exponentially decreasing +# Arguments: +# amp: amplitude of the cosine function +# period: period of the cosine function +# x0: initial x of the time series +# xn: final x of the time series +# step: step of the time series discretization +# k: exponential rate +gen_cosine_amp <- function(amp = 100, period = 1000, x0 = 0, xn = 50000, step = 1, k = 0.0001) { + n <- (xn-x0) * step + cos <- array(data = numeric(n), dim = c(n, 1, 1)) + for (i in 1:length(cos)) { + idx <- x0 + i * step + cos[[i, 1, 1]] <- amp * cos(2 * pi * idx / period) + cos[[i, 1, 1]] <- cos[[i, 1, 1]] * exp(-k * idx) + } + cos +} + +cat('Generating Data...\n') +cos <- gen_cosine_amp() +cat('Input shape:', dim(cos), '\n') + +expected_output <- array(data = numeric(length(cos)), dim = c(length(cos), 1)) +for (i in 1:(length(cos) - lahead)) { + expected_output[[i, 1]] <- mean(cos[(i + 1):(i + lahead)]) +} + +cat('Output shape:', dim(expected_output), '\n') + +cat('Creating model:\n') +model <- keras_model_sequential() +model %>% + layer_lstm(units = 50, input_shape = c(tsteps, 1), batch_size = batch_size, + return_sequences = TRUE, stateful = TRUE) %>% + layer_lstm(units = 50, return_sequences = FALSE, stateful = TRUE) %>% + layer_dense(units = 1) +model %>% compile(loss = 'mse', optimizer = 'rmsprop') + +cat('Training\n') +for (i in 1:epochs) { + model %>% fit(cos, expected_output, batch_size = batch_size, + epochs = 1, verbose = 1, shuffle = FALSE) + + model %>% reset_states() +} + +cat('Predicting\n') +predicted_output <- model %>% predict(cos, batch_size = batch_size) + +cat('Plotting Results\n') +op <- par(mfrow=c(2,1)) +plot(expected_output, xlab = '') +title("Expected") +plot(predicted_output, xlab = '') +title("Predicted") +par(op) + + + + + + diff --git a/website/articles/examples/stateful_lstm.html b/website/articles/examples/stateful_lstm.html new file mode 100644 index 000000000..ed9153fb6 --- /dev/null +++ b/website/articles/examples/stateful_lstm.html @@ -0,0 +1,203 @@ + + + + + + + +stateful_lstm • keras + + + + + + + +
+
+ + + +
+
+ + + + +
+ +

Example script showing how to use stateful RNNs to model long sequences efficiently.

+
library(keras)
+
+# since we are using stateful rnn tsteps can be set to 1
+tsteps <- 1
+batch_size <- 25
+epochs <- 25
+# number of elements ahead that are used to make the prediction
+lahead <- 1
+
+# Generates an absolute cosine time series with the amplitude exponentially decreasing
+# Arguments:
+#   amp: amplitude of the cosine function
+#   period: period of the cosine function
+#   x0: initial x of the time series
+#   xn: final x of the time series
+#   step: step of the time series discretization
+#   k: exponential rate
+gen_cosine_amp <- function(amp = 100, period = 1000, x0 = 0, xn = 50000, step = 1, k = 0.0001) {
+  n <- (xn-x0) * step
+  cos <- array(data = numeric(n), dim = c(n, 1, 1))
+  for (i in 1:length(cos)) {
+    idx <- x0 + i * step
+    cos[[i, 1, 1]] <- amp * cos(2 * pi * idx / period)
+    cos[[i, 1, 1]] <- cos[[i, 1, 1]] * exp(-k * idx)
+  }
+  cos
+}
+
+cat('Generating Data...\n')
+cos <- gen_cosine_amp()
+cat('Input shape:', dim(cos), '\n')
+
+expected_output <- array(data = numeric(length(cos)), dim = c(length(cos), 1))
+for (i in 1:(length(cos) - lahead)) {
+  expected_output[[i, 1]] <- mean(cos[(i + 1):(i + lahead)])
+}
+
+cat('Output shape:', dim(expected_output), '\n')
+
+cat('Creating model:\n')
+model <- keras_model_sequential()
+model %>%
+  layer_lstm(units = 50, input_shape = c(tsteps, 1), batch_size = batch_size,
+             return_sequences = TRUE, stateful = TRUE) %>% 
+  layer_lstm(units = 50, return_sequences = FALSE, stateful = TRUE) %>% 
+  layer_dense(units = 1)
+model %>% compile(loss = 'mse', optimizer = 'rmsprop')
+
+cat('Training\n')
+for (i in 1:epochs) {
+  model %>% fit(cos, expected_output, batch_size = batch_size,
+                epochs = 1, verbose = 1, shuffle = FALSE)
+            
+  model %>% reset_states()
+}
+
+cat('Predicting\n')
+predicted_output <- model %>% predict(cos, batch_size = batch_size)
+
+cat('Plotting Results\n')
+op <- par(mfrow=c(2,1))
+plot(expected_output, xlab = '')
+title("Expected")
+plot(predicted_output, xlab = '')
+title("Predicted")
+par(op)
+
+
+ + + +
+ + + +
+ + + diff --git a/website/articles/examples/variational_autoencoder.R b/website/articles/examples/variational_autoencoder.R new file mode 100644 index 000000000..877bab713 --- /dev/null +++ b/website/articles/examples/variational_autoencoder.R @@ -0,0 +1,117 @@ +#' This script demonstrates how to build a variational autoencoder with Keras. +#' Reference: "Auto-Encoding Variational Bayes" https://arxiv.org/abs/1312.6114 + +library(keras) +K <- keras::backend() + +# Parameters -------------------------------------------------------------- + +batch_size <- 100L +original_dim <- 784L +latent_dim <- 2L +intermediate_dim <- 256L +epochs <- 50L +epsilon_std <- 1.0 + +# Model definition -------------------------------------------------------- + +x <- layer_input(batch_shape = c(batch_size, original_dim)) +h <- layer_dense(x, intermediate_dim, activation = "relu") +z_mean <- layer_dense(h, latent_dim) +z_log_var <- layer_dense(h, latent_dim) + +sampling <- function(arg){ + z_mean <- arg[,0:1] + z_log_var <- arg[,2:3] + + epsilon <- K$random_normal( + shape = c(batch_size, latent_dim), + mean=0., + stddev=epsilon_std + ) + + z_mean + K$exp(z_log_var/2)*epsilon +} + +# note that "output_shape" isn't necessary with the TensorFlow backend +z <- layer_concatenate(list(z_mean, z_log_var)) %>% + layer_lambda(sampling) + +# we instantiate these layers separately so as to reuse them later +decoder_h <- layer_dense(units = intermediate_dim, activation = "relu") +decoder_mean <- layer_dense(units = original_dim, activation = "sigmoid") +h_decoded <- decoder_h(z) +x_decoded_mean <- decoder_mean(h_decoded) + +# end-to-end autoencoder +vae <- keras_model(x, x_decoded_mean) + +# encoder, from inputs to latent space +encoder <- keras_model(x, z_mean) + +# generator, from latent space to reconstructed inputs +decoder_input <- layer_input(shape = latent_dim) +h_decoded_2 <- decoder_h(decoder_input) +x_decoded_mean_2 <- decoder_mean(h_decoded_2) +generator <- keras_model(decoder_input, x_decoded_mean_2) + + +vae_loss <- function(x, x_decoded_mean){ + xent_loss <- (original_dim/1.0)*loss_binary_crossentropy(x, x_decoded_mean) + kl_loss <- -0.5*K$mean(1 + z_log_var - K$square(z_mean) - K$exp(z_log_var), axis = -1L) + xent_loss + kl_loss +} + +vae %>% compile(optimizer = "rmsprop", loss = vae_loss) + + +# Data preparation -------------------------------------------------------- + +mnist <- dataset_mnist() +x_train <- mnist$train$x/255 +x_test <- mnist$test$x/255 +x_train <- x_train %>% apply(1, as.numeric) %>% t() +x_test <- x_test %>% apply(1, as.numeric) %>% t() + + +# Model training ---------------------------------------------------------- + +vae %>% fit( + x_train, x_train, + shuffle = TRUE, + epochs = epochs, + batch_size = batch_size, + validation_data = list(x_test, x_test) +) + + +# Visualizations ---------------------------------------------------------- + +library(ggplot2) +library(dplyr) +x_test_encoded <- predict(encoder, x_test, batch_size = batch_size) + +x_test_encoded %>% + as_data_frame() %>% + mutate(class = as.factor(mnist$test$y)) %>% + ggplot(aes(x = V1, y = V2, colour = class)) + geom_point() + +# display a 2D manifold of the digits +n <- 15 # figure with 15x15 digits +digit_size <- 28 + +# we will sample n points within [-4, 4] standard deviations +grid_x <- seq(-4, 4, length.out = n) +grid_y <- seq(-4, 4, length.out = n) + +rows <- NULL +for(i in 1:length(grid_x)){ + column <- NULL + for(j in 1:length(grid_y)){ + z_sample <- matrix(c(grid_x[i], grid_y[j]), ncol = 2) + column <- rbind(column, 
predict(generator, z_sample) %>% matrix(ncol = 28) ) + } + rows <- cbind(rows, column) +} +rows %>% as.raster() %>% plot() + diff --git a/website/articles/examples/variational_autoencoder.html b/website/articles/examples/variational_autoencoder.html new file mode 100644 index 000000000..19c9fff48 --- /dev/null +++ b/website/articles/examples/variational_autoencoder.html @@ -0,0 +1,250 @@ + + + + + + + +variational_autoencoder • keras + + + + + + + +
+
+ + + +
+
+ + + + +
+ +

This script demonstrates how to build a variational autoencoder with Keras. Reference: “Auto-Encoding Variational Bayes” https://arxiv.org/abs/1312.6114

+
library(keras)
+K <- keras::backend()
+
+# Parameters --------------------------------------------------------------
+
+batch_size <- 100L
+original_dim <- 784L
+latent_dim <- 2L
+intermediate_dim <- 256L
+epochs <- 50L
+epsilon_std <- 1.0
+
+# Model definition --------------------------------------------------------
+
+x <- layer_input(batch_shape = c(batch_size, original_dim))
+h <- layer_dense(x, intermediate_dim, activation = "relu")
+z_mean <- layer_dense(h, latent_dim)
+z_log_var <- layer_dense(h, latent_dim)
+
+sampling <- function(arg){
+  z_mean <- arg[,0:1]
+  z_log_var <- arg[,2:3]
+  
+  epsilon <- K$random_normal(
+    shape = c(batch_size, latent_dim), 
+    mean=0.,
+    stddev=epsilon_std
+  )
+  
+  z_mean + K$exp(z_log_var/2)*epsilon
+}
+
+# note that "output_shape" isn't necessary with the TensorFlow backend
+z <- layer_concatenate(list(z_mean, z_log_var)) %>% 
+  layer_lambda(sampling)
+
+# we instantiate these layers separately so as to reuse them later
+decoder_h <- layer_dense(units = intermediate_dim, activation = "relu")
+decoder_mean <- layer_dense(units = original_dim, activation = "sigmoid")
+h_decoded <- decoder_h(z)
+x_decoded_mean <- decoder_mean(h_decoded)
+
+# end-to-end autoencoder
+vae <- keras_model(x, x_decoded_mean)
+
+# encoder, from inputs to latent space
+encoder <- keras_model(x, z_mean)
+
+# generator, from latent space to reconstructed inputs
+decoder_input <- layer_input(shape = latent_dim)
+h_decoded_2 <- decoder_h(decoder_input)
+x_decoded_mean_2 <- decoder_mean(h_decoded_2)
+generator <- keras_model(decoder_input, x_decoded_mean_2)
+
+
+vae_loss <- function(x, x_decoded_mean){
+  xent_loss <- (original_dim/1.0)*loss_binary_crossentropy(x, x_decoded_mean)
+  kl_loss <- -0.5*K$mean(1 + z_log_var - K$square(z_mean) - K$exp(z_log_var), axis = -1L)
+  xent_loss + kl_loss
+}
+
+vae %>% compile(optimizer = "rmsprop", loss = vae_loss)
+
+
+# Data preparation --------------------------------------------------------
+
+mnist <- dataset_mnist()
+x_train <- mnist$train$x/255
+x_test <- mnist$test$x/255
+x_train <- x_train %>% apply(1, as.numeric) %>% t()
+x_test <- x_test %>% apply(1, as.numeric) %>% t()
+
+
+# Model training ----------------------------------------------------------
+
+vae %>% fit(
+  x_train, x_train, 
+  shuffle = TRUE, 
+  epochs = epochs, 
+  batch_size = batch_size, 
+  validation_data = list(x_test, x_test)
+)
+
+
+# Visualizations ----------------------------------------------------------
+
+library(ggplot2)
+library(dplyr)
+x_test_encoded <- predict(encoder, x_test, batch_size = batch_size)
+
+x_test_encoded %>%
+  as_data_frame() %>% 
+  mutate(class = as.factor(mnist$test$y)) %>%
+  ggplot(aes(x = V1, y = V2, colour = class)) + geom_point()
+
+# display a 2D manifold of the digits
+n <- 15  # figure with 15x15 digits
+digit_size <- 28
+
+# we will sample n points within [-4, 4] standard deviations
+grid_x <- seq(-4, 4, length.out = n)
+grid_y <- seq(-4, 4, length.out = n)
+
+rows <- NULL
+for(i in 1:length(grid_x)){
+  column <- NULL
+  for(j in 1:length(grid_y)){
+    z_sample <- matrix(c(grid_x[i], grid_y[j]), ncol = 2)
+    column <- rbind(column, predict(generator, z_sample) %>% matrix(ncol = 28) )
+  }
+  rows <- cbind(rows, column)
+}
+rows %>% as.raster() %>% plot()
+
+
+ + + +
+ + + +
+ + + diff --git a/website/articles/examples/variational_autoencoder_deconv.R b/website/articles/examples/variational_autoencoder_deconv.R new file mode 100644 index 000000000..c7103cce5 --- /dev/null +++ b/website/articles/examples/variational_autoencoder_deconv.R @@ -0,0 +1,220 @@ +#' This script demonstrates how to build a variational autoencoder with Keras +#' and deconvolution layers. +#' Reference: "Auto-Encoding Variational Bayes" https://arxiv.org/abs/1312.6114 + +library(keras) +K <- keras::backend() + +#### Parameterization #### + +# input image dimensions +img_rows <- 28L +img_cols <- 28L +# color channels (1 = grayscale, 3 = RGB) +img_chns <- 1L + +# number of convolutional filters to use +filters <- 64L + +# convolution kernel size +num_conv <- 3L + +latent_dim <- 2L +intermediate_dim <- 128L +epsilon_std <- 1.0 + +# training parameters +batch_size <- 100L +epochs <- 5L + + +#### Model Construction #### + +original_img_size <- c(img_rows, img_cols, img_chns) + +x <- layer_input(batch_shape = c(batch_size, original_img_size)) + +conv_1 <- layer_conv_2d( + x, + filters = img_chns, + kernel_size = c(2L, 2L), + strides = c(1L, 1L), + padding = "same", + activation = "relu" +) + +conv_2 <- layer_conv_2d( + conv_1, + filters = filters, + kernel_size = c(2L, 2L), + strides = c(2L, 2L), + padding = "same", + activation = "relu" +) + +conv_3 <- layer_conv_2d( + conv_2, + filters = filters, + kernel_size = c(num_conv, num_conv), + strides = c(1L, 1L), + padding = "same", + activation = "relu" +) + +conv_4 <- layer_conv_2d( + conv_3, + filters = filters, + kernel_size = c(num_conv, num_conv), + strides = c(1L, 1L), + padding = "same", + activation = "relu" +) + +flat <- layer_flatten(conv_4) +hidden <- layer_dense(flat, units = intermediate_dim, activation = "relu") + +z_mean <- layer_dense(hidden, units = latent_dim) +z_log_var <- layer_dense(hidden, units = latent_dim) + +sampling <- function(args) { + z_mean <- args[, 0:(latent_dim - 1)] + z_log_var <- args[, latent_dim:(2 * latent_dim - 1)] + + epsilon <- K$random_normal( + shape = c(batch_size, latent_dim), + mean = 0., + stddev = epsilon_std + ) + z_mean + K$exp(z_log_var) * epsilon +} + +z <- layer_concatenate(list(z_mean, z_log_var)) %>% layer_lambda(sampling) + +output_shape <- c(batch_size, 14L, 14L, filters) + +decoder_hidden <- layer_dense(units = intermediate_dim, activation = "relu") +decoder_upsample <- layer_dense(units = prod(output_shape[-1]), activation = "relu") + +decoder_reshape <- layer_reshape(target_shape = output_shape[-1]) +decoder_deconv_1 <- layer_conv_2d_transpose( + filters = filters, + kernel_size = c(num_conv, num_conv), + strides = c(1L, 1L), + padding = "same", + activation = "relu" +) + +decoder_deconv_2 <- layer_conv_2d_transpose( + filters = filters, + kernel_size = c(num_conv, num_conv), + strides = c(1L, 1L), + padding = "same", + activation = "relu" +) + +decoder_deconv_3_upsample <- layer_conv_2d_transpose( + filters = filters, + kernel_size = c(3L, 3L), + strides = c(2L, 2L), + padding = "valid", + activation = "relu" +) + +decoder_mean_squash <- layer_conv_2d( + filters = img_chns, + kernel_size = c(2L, 2L), + strides = c(1L, 1L), + padding = "valid", + activation = "sigmoid" +) + +hidden_decoded <- decoder_hidden(z) +up_decoded <- decoder_upsample(hidden_decoded) +reshape_decoded <- decoder_reshape(up_decoded) +deconv_1_decoded <- decoder_deconv_1(reshape_decoded) +deconv_2_decoded <- decoder_deconv_2(deconv_1_decoded) +x_decoded_relu <- decoder_deconv_3_upsample(deconv_2_decoded) 
+x_decoded_mean_squash <- decoder_mean_squash(x_decoded_relu) + +# custom loss function +vae_loss <- function(x, x_decoded_mean_squash) { + x <- K$flatten(x) + x_decoded_mean_squash <- K$flatten(x_decoded_mean_squash) + xent_loss <- 1.0 * img_rows * img_cols * + loss_binary_crossentropy(x, x_decoded_mean_squash) + kl_loss <- -0.5 * K$mean(1 + z_log_var - K$square(z_mean) - + K$exp(z_log_var), axis = -1L) + K$mean(xent_loss + kl_loss) +} + +## variational autoencoder +vae <- keras_model(x, x_decoded_mean_squash) +vae %>% compile(optimizer = "rmsprop", loss = vae_loss) +summary(vae) + +## encoder: model to project inputs on the latent space +encoder <- keras_model(x, z_mean) + +## build a digit generator that can sample from the learned distribution +gen_decoder_input <- layer_input(shape = latent_dim) +gen_hidden_decoded <- decoder_hidden(gen_decoder_input) +gen_up_decoded <- decoder_upsample(gen_hidden_decoded) +gen_reshape_decoded <- decoder_reshape(gen_up_decoded) +gen_deconv_1_decoded <- decoder_deconv_1(gen_reshape_decoded) +gen_deconv_2_decoded <- decoder_deconv_2(gen_deconv_1_decoded) +gen_x_decoded_relu <- decoder_deconv_3_upsample(gen_deconv_2_decoded) +gen_x_decoded_mean_squash <- decoder_mean_squash(gen_x_decoded_relu) +generator <- keras_model(gen_decoder_input, gen_x_decoded_mean_squash) + + +#### Data Preparation #### + +mnist <- dataset_mnist() +data <- lapply(mnist, function(m) { + array(m$x / 255, dim = c(dim(m$x)[1], original_img_size)) +}) +x_train <- data$train +x_test <- data$test + + +#### Model Fitting #### + +vae %>% fit( + x_train, x_train, + shuffle = TRUE, + epochs = epochs, + batch_size = batch_size, + validation_data = list(x_test, x_test) +) + + +#### Visualizations #### + +library(ggplot2) +library(dplyr) + +## display a 2D plot of the digit classes in the latent space +x_test_encoded <- predict(encoder, x_test, batch_size = batch_size) +x_test_encoded %>% + as_data_frame() %>% + mutate(class = as.factor(mnist$test$y)) %>% + ggplot(aes(x = V1, y = V2, colour = class)) + geom_point() + +## display a 2D manifold of the digits +n <- 15 # figure with 15x15 digits +digit_size <- 28 + +# we will sample n points within [-4, 4] standard deviations +grid_x <- seq(-4, 4, length.out = n) +grid_y <- seq(-4, 4, length.out = n) + +rows <- NULL +for(i in 1:length(grid_x)){ + column <- NULL + for(j in 1:length(grid_y)){ + z_sample <- matrix(c(grid_x[i], grid_y[j]), ncol = 2) + column <- rbind(column, predict(generator, z_sample) %>% matrix(ncol = digit_size)) + } + rows <- cbind(rows, column) +} +rows %>% as.raster() %>% plot() diff --git a/website/articles/examples/variational_autoencoder_deconv.html b/website/articles/examples/variational_autoencoder_deconv.html new file mode 100644 index 000000000..ec7bfed9b --- /dev/null +++ b/website/articles/examples/variational_autoencoder_deconv.html @@ -0,0 +1,353 @@ + + + + + + + +variational_autoencoder_deconv • keras + + + + + + + +
+
+ + + +
+
+ + + + +
+ +

This script demonstrates how to build a variational autoencoder with Keras and deconvolution layers. Reference: “Auto-Encoding Variational Bayes” https://arxiv.org/abs/1312.6114

+
library(keras)
+K <- keras::backend()
+
+#### Parameterization ####
+
+# input image dimensions
+img_rows <- 28L
+img_cols <- 28L
+# color channels (1 = grayscale, 3 = RGB)
+img_chns <- 1L
+
+# number of convolutional filters to use
+filters <- 64L
+
+# convolution kernel size
+num_conv <- 3L
+
+latent_dim <- 2L
+intermediate_dim <- 128L
+epsilon_std <- 1.0
+
+# training parameters
+batch_size <- 100L
+epochs <- 5L
+
+
+#### Model Construction ####
+
+original_img_size <- c(img_rows, img_cols, img_chns)
+
+x <- layer_input(batch_shape = c(batch_size, original_img_size))
+
+conv_1 <- layer_conv_2d(
+  x,
+  filters = img_chns,
+  kernel_size = c(2L, 2L),
+  strides = c(1L, 1L),
+  padding = "same",
+  activation = "relu"
+)
+
+conv_2 <- layer_conv_2d(
+  conv_1,
+  filters = filters,
+  kernel_size = c(2L, 2L),
+  strides = c(2L, 2L),
+  padding = "same",
+  activation = "relu"
+)
+
+conv_3 <- layer_conv_2d(
+  conv_2,
+  filters = filters,
+  kernel_size = c(num_conv, num_conv),
+  strides = c(1L, 1L),
+  padding = "same",
+  activation = "relu"
+)
+
+conv_4 <- layer_conv_2d(
+  conv_3,
+  filters = filters,
+  kernel_size = c(num_conv, num_conv),
+  strides = c(1L, 1L),
+  padding = "same",
+  activation = "relu"
+)
+
+flat <- layer_flatten(conv_4)
+hidden <- layer_dense(flat, units = intermediate_dim, activation = "relu")
+
+z_mean <- layer_dense(hidden, units = latent_dim)
+z_log_var <- layer_dense(hidden, units = latent_dim)
+
+sampling <- function(args) {
+  z_mean <- args[, 0:(latent_dim - 1)]
+  z_log_var <- args[, latent_dim:(2 * latent_dim - 1)]
+  
+  epsilon <- K$random_normal(
+    shape = c(batch_size, latent_dim),
+    mean = 0.,
+    stddev = epsilon_std
+  )
+  z_mean + K$exp(z_log_var) * epsilon
+}
+
+z <- layer_concatenate(list(z_mean, z_log_var)) %>% layer_lambda(sampling)
+
+output_shape <- c(batch_size, 14L, 14L, filters)
+
+decoder_hidden <- layer_dense(units = intermediate_dim, activation = "relu")
+decoder_upsample <- layer_dense(units = prod(output_shape[-1]), activation = "relu")
+
+decoder_reshape <- layer_reshape(target_shape = output_shape[-1])
+decoder_deconv_1 <- layer_conv_2d_transpose(
+  filters = filters,
+  kernel_size = c(num_conv, num_conv),
+  strides = c(1L, 1L),
+  padding = "same",
+  activation = "relu"
+)
+
+decoder_deconv_2 <- layer_conv_2d_transpose(
+  filters = filters,
+  kernel_size = c(num_conv, num_conv),
+  strides = c(1L, 1L),
+  padding = "same",
+  activation = "relu"
+)
+
+decoder_deconv_3_upsample <- layer_conv_2d_transpose(
+  filters = filters,
+  kernel_size = c(3L, 3L),
+  strides = c(2L, 2L),
+  padding = "valid",
+  activation = "relu"
+)
+
+decoder_mean_squash <- layer_conv_2d(
+  filters = img_chns,
+  kernel_size = c(2L, 2L),
+  strides = c(1L, 1L),
+  padding = "valid",
+  activation = "sigmoid"
+)
+
+hidden_decoded <- decoder_hidden(z)
+up_decoded <- decoder_upsample(hidden_decoded)
+reshape_decoded <- decoder_reshape(up_decoded)
+deconv_1_decoded <- decoder_deconv_1(reshape_decoded)
+deconv_2_decoded <- decoder_deconv_2(deconv_1_decoded)
+x_decoded_relu <- decoder_deconv_3_upsample(deconv_2_decoded)
+x_decoded_mean_squash <- decoder_mean_squash(x_decoded_relu)
+
+# custom loss function
+vae_loss <- function(x, x_decoded_mean_squash) {
+  x <- K$flatten(x)
+  x_decoded_mean_squash <- K$flatten(x_decoded_mean_squash)
+  xent_loss <- 1.0 * img_rows * img_cols *
+    loss_binary_crossentropy(x, x_decoded_mean_squash)
+  kl_loss <- -0.5 * K$mean(1 + z_log_var - K$square(z_mean) -
+                           K$exp(z_log_var), axis = -1L)
+  K$mean(xent_loss + kl_loss)
+}
+
+## variational autoencoder
+vae <- keras_model(x, x_decoded_mean_squash)
+vae %>% compile(optimizer = "rmsprop", loss = vae_loss)
+summary(vae)
+
+## encoder: model to project inputs on the latent space
+encoder <- keras_model(x, z_mean)
+
+## build a digit generator that can sample from the learned distribution
+gen_decoder_input <- layer_input(shape = latent_dim)
+gen_hidden_decoded <- decoder_hidden(gen_decoder_input)
+gen_up_decoded <- decoder_upsample(gen_hidden_decoded)
+gen_reshape_decoded <- decoder_reshape(gen_up_decoded)
+gen_deconv_1_decoded <- decoder_deconv_1(gen_reshape_decoded)
+gen_deconv_2_decoded <- decoder_deconv_2(gen_deconv_1_decoded)
+gen_x_decoded_relu <- decoder_deconv_3_upsample(gen_deconv_2_decoded)
+gen_x_decoded_mean_squash <- decoder_mean_squash(gen_x_decoded_relu)
+generator <- keras_model(gen_decoder_input, gen_x_decoded_mean_squash)
+
+
+#### Data Preparation ####
+
+mnist <- dataset_mnist()
+data <- lapply(mnist, function(m) {
+  array(m$x / 255, dim = c(dim(m$x)[1], original_img_size))
+})
+x_train <- data$train
+x_test <- data$test
+
+
+#### Model Fitting ####
+
+vae %>% fit(
+  x_train, x_train, 
+  shuffle = TRUE, 
+  epochs = epochs, 
+  batch_size = batch_size, 
+  validation_data = list(x_test, x_test)
+)
+
+
+#### Visualizations ####
+
+library(ggplot2)
+library(dplyr)
+
+## display a 2D plot of the digit classes in the latent space
+x_test_encoded <- predict(encoder, x_test, batch_size = batch_size)
+x_test_encoded %>%
+  as_data_frame() %>%
+  mutate(class = as.factor(mnist$test$y)) %>%
+  ggplot(aes(x = V1, y = V2, colour = class)) + geom_point()
+
+## display a 2D manifold of the digits
+n <- 15  # figure with 15x15 digits
+digit_size <- 28
+
+# we will sample n points within [-4, 4] standard deviations
+grid_x <- seq(-4, 4, length.out = n)
+grid_y <- seq(-4, 4, length.out = n)
+
+rows <- NULL
+for(i in 1:length(grid_x)){
+  column <- NULL
+  for(j in 1:length(grid_y)){
+    z_sample <- matrix(c(grid_x[i], grid_y[j]), ncol = 2)
+    column <- rbind(column, predict(generator, z_sample) %>% matrix(ncol = digit_size))
+  }
+  rows <- cbind(rows, column)
+}
+rows %>% as.raster() %>% plot()
+
+
+ + + +
+ + + +
+ + + diff --git a/website/articles/faq.html b/website/articles/faq.html new file mode 100644 index 000000000..df197a86d --- /dev/null +++ b/website/articles/faq.html @@ -0,0 +1,549 @@ + + + + + + + +Frequently Asked Questions • keras + + + + + + + +
+
+ + + +
+
+ + + + +
+
+

+How should I cite Keras?

+

Please cite Keras in your publications if it helps your research. Here is an example BibTeX entry:

+
@misc{chollet2015keras,
+  title={Keras},
+  author={Chollet, Fran\c{c}ois and others},
+  year={2015},
+  publisher={GitHub},
+  howpublished={\url{https://github.com/fchollet/keras}},
+}
+
+
+

+What does “sample”, “batch”, “epoch” mean?

+

Below are some common definitions that are necessary to know and understand to correctly utilize Keras:

+
    +
  • +Sample: one element of a dataset.
  • +
  • +Example: one image is a sample in a convolutional network
  • +
  • +Example: one audio file is a sample for a speech recognition model
  • +
  • +Batch: a set of N samples. The samples in a batch are processed independently, in parallel. If training, a batch results in only one update to the model.
  • +
  • A batch generally approximates the distribution of the input data better than a single input. The larger the batch, the better the approximation; however, a larger batch will also take longer to process and will still result in only one update. For inference (evaluate/predict), it is recommended to pick a batch size that is as large as you can afford without going out of memory (since larger batches will usually result in faster evaluation/prediction). See the example after this list.
  • +
  • +Epoch: an arbitrary cutoff, generally defined as “one pass over the entire dataset”, used to separate training into distinct phases, which is useful for logging and periodic evaluation.
  • +
  • When using validation_data or validation_split with the fit method of Keras models, evaluation will be run at the end of every epoch.
  • +
  • Within Keras, there is the ability to add callbacks specifically designed to be run at the end of an epoch. Examples of these are learning rate changes and model checkpointing (saving).
  • +
+
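For example, here is how these terms map onto a fit() call. This is a minimal sketch; it assumes model is a compiled Keras model and that x_train/y_train hold 10,000 training samples:

model %>% fit(
  x_train, y_train,
  batch_size = 32,  # one weight update per batch of 32 samples
  epochs = 10       # one epoch = one full pass over all 10,000 samples
)

With these settings, each epoch performs ceiling(10000 / 32) = 313 weight updates.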
+
+

+Why are Keras objects modified in place?

+

Unlike most R objects, Keras objects are “mutable”. That means that when you modify an object you’re modifying it “in place”, and you don’t need to assign the updated object back to the original name. For example, to add layers to a Keras model you might use this code:

+
model %>% 
+  layer_dense(units = 32, activation = 'relu', input_shape = c(784)) %>% 
+  layer_dense(units = 10, activation = 'softmax')
+

Rather than this code:

+
model <- model %>% 
+  layer_dense(units = 32, activation = 'relu', input_shape = c(784)) %>% 
+  layer_dense(units = 10, activation = 'softmax')
+

You need to be aware of this because it makes the Keras API a little different from most other pipelines you may have used, but it's necessary to match the data structures and behavior of the underlying Keras library.

+
+
+

+How can I save a Keras model?

+

You can use save_model_hdf5() to save a Keras model into a single HDF5 file which will contain:

+
    +
  • the architecture of the model, allowing you to re-create the model
  • +
  • the weights of the model
  • +
  • the training configuration (loss, optimizer)
  • +
  • the state of the optimizer, allowing you to resume training exactly where you left off.
  • +
+

You can then use load_model_hdf5() to reinstantiate your model. load_model_hdf5() will also take care of compiling the model using the saved training configuration (unless the model was never compiled in the first place).

+

Example:

+
save_model_hdf5(model, 'my_model.h5')
+model <- load_model_hdf5('my_model.h5')
+

If you only need to save the architecture of a model, and not its weights or its training configuration, you can do:

+
json_string <- model_to_json(model)
+yaml_string <- model_to_yaml(model)
+

The generated JSON / YAML files are human-readable and can be manually edited if needed.

+

You can then build a fresh model from this data:

+
model <- model_from_json(json_string)
+model <- model_from_yaml(yaml_string)
+

If you need to save the weights of a model, you can do so in HDF5 with the code below.

+
model %>% save_model_weights_hdf5('my_model_weights.h5')
+

Assuming you have code for instantiating your model, you can then load the weights you saved into a model with the same architecture:

+
model %>% load_model_weights_hdf5('my_model_weights.h5')
+

If you need to load weights into a different architecture (with some layers in common), for instance for fine-tuning or transfer-learning, you can load weights by layer name:

+
model %>% load_model_weights_hdf5('my_model_weights.h5', by_name = TRUE)
+

For example:

+
# assuming the original model looks like this:
+#   model <- keras_model_sequential()
+#   model %>% 
+#     layer_dense(units = 2, input_dim = 3, name = "dense_1") %>% 
+#     layer_dense(units = 3, name = "dense_3") %>% 
+#     ...
+#   save_model_weights_hdf5(model, fname)
+
+# new model
+model <- keras_model_sequential()
+model %>% 
+  layer_dense(units = 2, input_dim = 3, name = "dense_1") %>%  # will be loaded
+  layer_dense(units = 3, name = "dense_3")                     # will not be loaded
+
+# load weights from first model; will only affect the first layer, dense_1.
+model %>% load_model_weights_hdf5(fname, by_name = TRUE)
+
+
+

+Why is the training loss much higher than the testing loss?

+

A Keras model has two modes: training and testing. Regularization mechanisms, such as Dropout and L1/L2 weight regularization, are turned off at testing time.

+

Besides, the training loss is the average of the losses over each batch of training data. Because your model is changing over time, the loss over the first batches of an epoch is generally higher than over the last batches. On the other hand, the testing loss for an epoch is computed using the model as it is at the end of the epoch, resulting in a lower loss.

+
+
+

+How can I obtain the output of an intermediate layer?

+

One simple way is to create a new Model that will output the layers that you are interested in:

+
model <- ...  # create the original model
+
+layer_name <- 'my_layer'
+intermediate_layer_model <- keras_model(inputs = model$input,
+                                        outputs = get_layer(model, layer_name)$output)
+intermediate_output <- predict(intermediate_layer_model, data)
+
+
+

+How can I use Keras with datasets that don’t fit in memory?

+
+

+Generator Functions

+

To provide training or evaluation data incrementally, you can write an R generator function that yields batches of training data, then pass the function to fit_generator() (or the related functions evaluate_generator() and predict_generator()).

+

The output of generator functions must be a list of one of these forms:

+
    +
  • (inputs, targets)
  • +
  • (inputs, targets, sample_weights)
  • +
+

All arrays should contain the same number of samples. The generator is expected to loop over its data indefinitely. For example, here's a simple generator function that yields randomly sampled batches of data:

+
sampling_generator <- function(X_data, Y_data, batch_size) {
+  function() {
+    rows <- sample(1:nrow(X_data), batch_size, replace = TRUE)
+    list(X_data[rows,], Y_data[rows,])
+  }
+}
+
+model %>% 
+  fit_generator(sampling_generator(X_train, Y_train, batch_size = 128), 
+                steps_per_epoch = nrow(X_train) / 128, epochs = 10)
+

The steps_per_epoch parameter indicates the number of steps (batches of samples) to yield from the generator before declaring one epoch finished and starting the next epoch. It should typically be equal to the number of unique samples in your dataset divided by the batch size.

+
+
+

+External Data Generators

+

The above example doesn’t however address the use case of datasets that don’t fit in memory. Typically to do that you’ll write a generator that reads from another source (e.g. a sparse matrix or file(s) on disk) and maintains an offset into that data as it’s called repeatedly. For example, imagine you have a set of text files in a directory you want to read from:

+
data_files_generator <- function(dir) {
+  
+  files <- list.files(dir)
+  next_file <- 0
+  
+  function() {
+    
+    # move to the next file (note the <<- assignment operator)
+    next_file <<- next_file + 1
+    
+    # wrap back around to the first file once all files have been read
+    # (generator functions are expected to loop over their data indefinitely)
+    if (next_file > length(files))
+      next_file <<- 1
+    
+    # determine the file name
+    file <- files[[next_file]]
+    
+    # process and return the data in the file
+    file_to_training_data(file)
+  }
+}
+

The above function is an example of a stateful generator—the function maintains information across calls to keep track of which data to provide next. This is accomplished by defining shared state outside the generator function body and using the <<- operator to assign to it from within the generator.
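A generator like this is passed to fit_generator() in the same way as before. For instance, a minimal sketch (the directory name and the steps_per_epoch value here are hypothetical):

model %>% fit_generator(
  data_files_generator("data"),
  steps_per_epoch = 100,  # e.g. the number of files to read per epoch
  epochs = 10
)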

+
+
+

+Image Generators

+

You can also use the flow_images_from_directory() and flow_images_from_data() functions along with fit_generator() for training on sets of images stored on disk (with optional image augmentation/normalization via image_data_generator()).

+
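For instance, here is a minimal sketch (the images/train directory, image size, and step counts are hypothetical; flow_images_from_directory() expects one sub-directory per class):

train_generator <- flow_images_from_directory(
  "images/train",
  generator = image_data_generator(rescale = 1/255),
  target_size = c(150, 150),
  batch_size = 32,
  class_mode = "categorical"
)

model %>% fit_generator(train_generator, steps_per_epoch = 100, epochs = 10)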

You can see batch image training in action in our CIFAR10 example.

+
+
+

+Batch Functions

+

You can also do batch training using the train_on_batch() and test_on_batch() functions. These functions enable you to write a training loop that reads into memory only the data required for each batch.
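For example, a training loop along these lines (a sketch only; read_batch() stands in for whatever function you write to load a single batch from disk):

for (epoch in 1:epochs) {
  for (batch in 1:n_batches) {
    batch_data <- read_batch(batch)  # hypothetical loader returning list(x, y)
    model %>% train_on_batch(batch_data$x, batch_data$y)
  }
}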

+
+
+
+

+How can I interrupt training when the validation loss isn’t decreasing anymore?

+

You can use an early stopping callback:

+
early_stopping <- callback_early_stopping(monitor = 'val_loss', patience = 2)
+model %>% fit(X, y, validation_split = 0.2, callbacks = c(early_stopping))
+

Find out more in the callbacks documentation.

+
+
+

+How is the validation split computed?

+

If you set the validation_split argument in fit to e.g. 0.1, then the validation data used will be the last 10% of the data. If you set it to 0.25, it will be the last 25% of the data, etc. Note that the data isn’t shuffled before extracting the validation split, so the validation is literally just the last x% of samples in the input you passed.

+

The same validation set is used for all epochs (within the same call to fit).
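For example, with 1,000 rows of training data the following call holds out rows 751 through 1000 as the validation set:

model %>% fit(X, y, validation_split = 0.25)  # last 25% of rows used for validation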

+
+
+

+Is the data shuffled during training?

+

Yes, if the shuffle argument in fit is set to TRUE (which is the default), the training data will be randomly shuffled at each epoch.

+

Validation data is never shuffled.

+
+
+

+How can I record the training / validation loss / accuracy at each epoch?

+

The fit() method returns a history object, which has a history attribute containing the lists of successive losses and other metrics.

+
hist <- model %>% fit(X, y, validation_split=0.2)
+hist$history
+
+
+

+How can I “freeze” Keras layers?

+

To “freeze” a layer means to exclude it from training, i.e. its weights will never be updated. This is useful in the context of fine-tuning a model, or using fixed embeddings for a text input.

+

You can pass a trainable argument (boolean) to a layer constructor to set a layer to be non-trainable:

+
frozen_layer <- layer_dense(units = 32, trainable = FALSE)
+

Additionally, you can set the trainable property of a layer to TRUE or FALSE after instantiation. For this to take effect, you will need to call compile() on your model after modifying the trainable property. Here’s an example:

+
x <- layer_input(shape = c(32))
+layer <- layer_dense(units = 32)
+layer$trainable <- FALSE
+y <- x %>% layer
+
+frozen_model <- keras_model(x, y)
+# in the model below, the weights of `layer` will not be updated during training
+frozen_model %>% compile(optimizer = 'rmsprop', loss = 'mse')
+
+layer$trainable <- TRUE
+trainable_model <- keras_model(x, y)
+# with this model the weights of the layer will be updated during training
+# (which will also affect the above model since it uses the same layer instance)
+trainable_model %>% compile(optimizer = 'rmsprop', loss = 'mse')
+
+frozen_model %>% fit(data, labels)  # this does NOT update the weights of `layer`
+trainable_model %>% fit(data, labels)  # this updates the weights of `layer`
+
+
+

+How can I use stateful RNNs?

+

Making an RNN stateful means that the states for the samples of each batch will be reused as initial states for the samples in the next batch.

+

When using stateful RNNs, it is therefore assumed that:

+
    +
  • all batches have the same number of samples
  • +
  • if X1 and X2 are successive batches of samples, then X2[[i]] is the follow-up sequence to X1[[i]], for every i.
  • +
+

To use statefulness in RNNs, you need to:

+
    +
  • explicitly specify the batch size you are using, by passing a batch_size argument to the first layer in your model, e.g. batch_size = 32 for batches of 32 samples, each a sequence of 10 timesteps with 16 features per timestep.
  • +
  • set stateful=TRUE in your RNN layer(s).
  • +
  • specify shuffle=FALSE when calling fit().
  • +
+

To reset the states accumulated in either a single layer or an entire model, use the reset_states() function.
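
For example (model and layer here stand in for a Keras model and an individual layer object):

+
model %>% reset_states()   # reset every stateful layer in the model
+layer %>% reset_states()   # or reset the states of an individual layer
+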

+

Note that the methods predict(), fit(), train_on_batch(), predict_classes(), etc. will all update the states of the stateful layers in a model. This allows you to do not only stateful training, but also stateful prediction.

+
+
+

+How can I remove a layer from a Sequential model?

+

You can remove the last added layer in a Sequential model by calling pop_layer():

+
model <- keras_model_sequential()
+model %>% 
+  layer_dense(units = 32, activation = 'relu', input_shape = c(784)) %>% 
+  layer_dense(units = 32, activation = 'relu') %>% 
+  layer_dense(units = 32, activation = 'relu')
+
+length(model$layers)     # "3"
+model %>% pop_layer()
+length(model$layers)     # "2"
+
+
+

+How can I use pre-trained models in Keras?

+

Code and pre-trained weights are available for the following image classification models:

+ +

For example:

+
model <- application_vgg16(weights = 'imagenet', include_top = TRUE)
+

For a few simple usage examples, see the documentation for the Applications module.

+

The VGG16 model is also the basis for the Deep dream Keras example script.

+
+
+

+How can I use other Keras backends?

+

By default the Keras Python and R packages use the TensorFlow backend. Other available backends include Theano and CNTK. To learn more about using alternate backends, see the article on Keras backends.

+
+
+

+How can I run Keras on a GPU?

+

Note that installation and configuration of the GPU-based backends can take considerably more time and effort. So if you are just getting started with Keras you may want to stick with the CPU version initially, then install the appropriate GPU version once your training becomes more computationally demanding.

+

Below are instructions for installing and enabling GPU support for the various supported backends.

+
+

+TensorFlow

+

If your system has an NVIDIA® GPU and you have the GPU version of TensorFlow installed then your Keras code will automatically run on the GPU.

+

Additional details on GPU installation can be found here: https://tensorflow.rstudio.com/installation_gpu.html.

+
+
+

+Theano

+

If you are running on the Theano backend, you can set the THEANO_FLAGS environment variable to indicate you’d like to execute tensor operations on the GPU. For example:

+
Sys.setenv(KERAS_BACKEND = "theano")
+Sys.setenv(THEANO_FLAGS = "device=gpu,floatX=float32")
+library(keras)
+

The name ‘gpu’ might have to be changed depending on your device’s identifier (e.g. gpu0, gpu1, etc).

+
+
+

+CNTK

+

If you have the GPU version of CNTK installed then your Keras code will automatically run on the GPU.

+

Additional information on installing the GPU version of CNTK can be found here: https://docs.microsoft.com/en-us/cognitive-toolkit/setup-linux-python

+
+
+
+

+How can I use Keras in another R package?

+
+

+Testing on CRAN

+

The main consideration in using Keras within another R package is to ensure that your package can be tested in an environment where Keras is not available (e.g. the CRAN test servers). To do this, arrange for your tests to be skipped when Keras isn’t available using the is_keras_available() function.

+

For example, here’s a testthat utility function that can be used to skip a test when Keras isn’t available:

+
# testthat utility for skipping tests when Keras isn't available
+skip_if_no_keras <- function(version = NULL) {
+  if (!is_keras_available(version))
+    skip("Required keras version not available for testing")
+}
+
+# use the function within a test
+test_that("keras function works correctly", {
+  skip_if_no_keras()
+  # test code here
+})
+

You can pass the version argument to check for a specific version of Keras.
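
For example (the version number shown is illustrative):

+
test_that("keras 2.0 functionality works correctly", {
+  skip_if_no_keras("2.0.0")
+  # test code here
+})
+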

+
+
+

+Keras Module

+

Another consideration is gaining access to the underlying Keras python module. You might need to do this if you require lower-level access to Keras than is provided by the Keras R package.

+

Since the Keras R package can bind to multiple different implementations of Keras (either the original Keras or the TensorFlow implementation of Keras), you should use the keras::implementation() function to obtain access to the correct python module. You can use this function within the .onLoad function of a package to provide global access to the module within your package. For example:

+
# Keras python module
+keras <- NULL
+
+# Obtain a reference to the module from the keras R package
+.onLoad <- function(libname, pkgname) {
+  keras <<- keras::implementation() 
+}
+
+
+

+Custom Layers

+

If you create custom layers in R or import other Python packages which include custom Keras layers, be sure to wrap them using the create_layer() function so that they are composable using the magrittr pipe operator. See the documentation on layer wrapper functions for additional details.
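
For example, a minimal sketch of a layer wrapper function (CustomLayer stands for a hypothetical R6 class that implements the layer):

+
layer_custom <- function(object, output_dim, name = NULL, trainable = TRUE) {
+  create_layer(CustomLayer, object, list(
+    output_dim = as.integer(output_dim),
+    name = name,
+    trainable = trainable
+  ))
+}
+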

+
+
+
+

+How can I obtain reproducible results using Keras during development?

+

During development of a model, sometimes it is useful to be able to obtain reproducible results from run to run in order to determine if a change in performance is due to an actual model or data modification, or merely a result of a new random sample.

+

The below snippet of code provides an example of how to obtain reproducible results when using the TensorFlow backend. To do this we set the R session’s random seed, then manually construct a TensorFlow session (via the tensorflow package) and set its random seed, and then finally arrange for Keras to use this session within its backend.

+
library(keras)
+library(tensorflow)
+
+# Set R random seed
+set.seed(42L)
+
+# TensorFlow session configuration that uses only a single thread. Multiple threads are a 
+# potential source of non-reproducible results, see: https://stackoverflow.com/questions/42022950/which-seeds-have-to-be-set-where-to-realize-100-reproducibility-of-training-res
+session_conf <- tf$ConfigProto(intra_op_parallelism_threads = 1L, 
+                               inter_op_parallelism_threads = 1L)
+
+# Set TF random seed (see: https://www.tensorflow.org/api_docs/python/tf/set_random_seed)
+tf$set_random_seed(1042L)
+
+# Create the session using the custom configuration
+sess <- tf$Session(graph = tf$get_default_graph(), config = session_conf)
+
+# Instruct Keras to use this session
+K <- backend()
+K$set_session(sess)
+
+# Rest of code follows ...
+
+
+

+Where is the Keras configuration file stored?

+

The default directory where all Keras data is stored is:

+
$HOME/.keras/
+

Note that Windows users should replace $HOME with %USERPROFILE%. In case Keras cannot create the above directory (e.g. due to permission issues), /tmp/.keras/ is used as a backup.

+

The Keras configuration file is a JSON file stored at $HOME/.keras/keras.json. The default configuration file looks like this:

+
{
+    "image_data_format": "channels_last",
+    "epsilon": 1e-07,
+    "floatx": "float32",
+    "backend": "tensorflow"
+}
+

It contains the following fields:

+
    +
  • The image data format to be used as default by image processing layers and utilities (either channels_last or channels_first).
  • +
  • The epsilon numerical fuzz factor to be used to prevent division by zero in some operations.
  • +
  • The default float data type.
  • +
  • The default backend (this will always be “tensorflow” in the R interface to Keras)
  • +
+

Likewise, cached dataset files, such as those downloaded with get_file(), are stored by default in $HOME/.keras/datasets/.

+
+
+
+ + + +
+ + + +
+ + + diff --git a/website/articles/functional_api.html b/website/articles/functional_api.html new file mode 100644 index 000000000..a7e00a461 --- /dev/null +++ b/website/articles/functional_api.html @@ -0,0 +1,491 @@ + + + + + + + +Guide to the Functional API • keras + + + + + + + +
+
+ + + +
+
+ + + + +
+

The Keras functional API is the way to go for defining complex models, such as multi-output models, directed acyclic graphs, or models with shared layers.

+

This guide assumes that you are already familiar with the Sequential model.

+

Let’s start with something simple.

+
+

+First example: a densely-connected network

+

The Sequential model is probably a better choice to implement such a network, but it helps to start with something really simple.

+

To use the functional API, build your input and output layers and then pass them to the keras_model() function. This model can be trained just like Keras sequential models.

+
library(keras)
+
+# input layer
+inputs <- layer_input(shape = c(784))
+ 
+# outputs compose input + dense layers
+predictions <- inputs %>%
+  layer_dense(units = 64, activation = 'relu') %>% 
+  layer_dense(units = 64, activation = 'relu') %>% 
+  layer_dense(units = 10, activation = 'softmax')
+
+# create and compile model
+model <- keras_model(inputs = inputs, outputs = predictions)
+model %>% compile(
+  optimizer = 'rmsprop',
+  loss = 'categorical_crossentropy',
+  metrics = c('accuracy')
+)
+

Note that Keras objects are modified in place, which is why it’s not necessary to assign model back after it is compiled.

+
+
+

+All models are callable, just like layers

+

With the functional API, it is easy to re-use trained models: you can treat any model as if it were a layer. Note that you aren’t just re-using the architecture of the model, you are also re-using its weights.

+
x <- layer_input(shape = c(784))
+# This works, and returns the 10-way softmax we defined above.
+y <- x %>% model
+

This allows you, for instance, to quickly create models that process sequences of inputs. You could turn an image classification model into a video classification model in just one line:

+
# Input tensor for sequences of 20 timesteps,
+# each containing a 784-dimensional vector
+input_sequences <- layer_input(shape = c(20, 784))
+
+# This applies our previous model to the input sequence
+processed_sequences <- input_sequences %>%
+  time_distributed(model)
+
+
+

+Multi-input and multi-output models

+

Here’s a good use case for the functional API: models with multiple inputs and outputs. The functional API makes it easy to manipulate a large number of intertwined datastreams.

+

Let’s consider the following model. We seek to predict how many retweets and likes a news headline will receive on Twitter. The main input to the model will be the headline itself, as a sequence of words, but to spice things up, our model will also have an auxiliary input, receiving extra data such as the time of day when the headline was posted, etc.

+

The model will also be supervised via two loss functions. Using the main loss function earlier in a model is a good regularization mechanism for deep models.

+

Here’s what our model looks like:

+

multi-input-multi-output-graph

+

Let’s implement it with the functional API.

+

The main input will receive the headline, as a sequence of integers (each integer encodes a word). The integers will be between 1 and 10,000 (a vocabulary of 10,000 words) and the sequences will be 100 words long.

+

We’ll include an embedding layer that encodes the input sequence into a sequence of dense 512-dimensional vectors, followed by an LSTM layer that transforms the vector sequence into a single vector:

+
library(keras)
+
+main_input <- layer_input(shape = c(100), dtype = 'int32', name = 'main_input')
+
+lstm_out <- main_input %>% 
+  layer_embedding(input_dim = 10000, output_dim = 512, input_length = 100) %>% 
+  layer_lstm(units = 32)
+

Here we insert the auxiliary loss, allowing the LSTM and Embedding layer to be trained smoothly even though the main loss will be much higher in the model:

+
auxiliary_output <- lstm_out %>% 
+  layer_dense(units = 1, activation = 'sigmoid', name = 'aux_output')
+

At this point, we feed into the model our auxiliary input data by concatenating it with the LSTM output, stacking a deep densely-connected network on top and adding the main logistic regression layer:

+
auxiliary_input <- layer_input(shape = c(5), name = 'aux_input')
+
+main_output <- layer_concatenate(c(lstm_out, auxiliary_input)) %>%  
+  layer_dense(units = 64, activation = 'relu') %>% 
+  layer_dense(units = 64, activation = 'relu') %>% 
+  layer_dense(units = 64, activation = 'relu') %>% 
+  layer_dense(units = 1, activation = 'sigmoid', name = 'main_output')
+

This defines a model with two inputs and two outputs:

+
model <- keras_model(
+  inputs = c(main_input, auxiliary_input), 
+  outputs = c(main_output, auxiliary_output)
+)
+
summary(model)
+
Model
+__________________________________________________________________________________________
+Layer (type)                 Output Shape        Param #    Connected to                  
+==========================================================================================
+main_input (InputLayer)      (None, 100)         0                                        
+__________________________________________________________________________________________
+embedding_1 (Embedding)      (None, 100, 512)    5120000                                  
+__________________________________________________________________________________________
+lstm_1 (LSTM)                (None, 32)          69760                                    
+__________________________________________________________________________________________
+aux_input (InputLayer)       (None, 5)           0                                        
+__________________________________________________________________________________________
+concatenate_1 (Concatenate)  (None, 37)          0                                        
+__________________________________________________________________________________________
+dense_1 (Dense)              (None, 64)          2432                                     
+__________________________________________________________________________________________
+dense_2 (Dense)              (None, 64)          4160                                     
+__________________________________________________________________________________________
+dense_3 (Dense)              (None, 64)          4160                                     
+__________________________________________________________________________________________
+main_output (Dense)          (None, 1)           65                                       
+__________________________________________________________________________________________
+aux_output (Dense)           (None, 1)           33                                       
+==========================================================================================
+Total params: 5,200,610
+Trainable params: 5,200,610
+Non-trainable params: 0
+__________________________________________________________________________________________
+

We compile the model and assign a weight of 0.2 to the auxiliary loss. To specify a different loss_weights or loss for each output, you can use a list or a named list. Here we pass a single loss as the loss argument, so the same loss will be used on all outputs.

+
model %>% compile(
+  optimizer = 'rmsprop',
+  loss = 'binary_crossentropy',
+  loss_weights = c(1.0, 0.2)
+)
+

We can train the model by passing it lists of input arrays and target arrays:

+
model %>% fit(
+  x = list(headline_data, additional_data),
+  y = list(labels, labels),
+  epochs = 50,
+  batch_size = 32
+)
+

Since our inputs and outputs are named (we passed them a “name” argument), we could also have compiled the model via:

+
model %>% compile(
+  optimizer = 'rmsprop',
+  loss = list(main_output = 'binary_crossentropy', aux_output = 'binary_crossentropy'),
+  loss_weights = list(main_output = 1.0, aux_output = 0.2)
+)
+
+# And trained it via:
+model %>% fit(
+  x = list(main_input = headline_data, aux_input = additional_data),
+  y = list(main_output = labels, aux_output = labels),
+  epochs = 50,
+  batch_size = 32
+)
+
+
+

+Shared layers

+

Another good use of the functional API is models that use shared layers. Let’s take a look at shared layers.

+

Let’s consider a dataset of tweets. We want to build a model that can tell whether two tweets are from the same person or not (this can allow us to compare users by the similarity of their tweets, for instance).

+

One way to achieve this is to build a model that encodes two tweets into two vectors, concatenates the vectors, and adds a logistic regression on top, outputting a probability that the two tweets share the same author. The model would then be trained on positive tweet pairs and negative tweet pairs.

+

Because the problem is symmetric, the mechanism that encodes the first tweet should be reused (weights and all) to encode the second tweet. Here we use a shared LSTM layer to encode the tweets.

+

Let’s build this with the functional API. We will take as input for a tweet a binary matrix of shape (140, 256), i.e. a sequence of 140 vectors of size 256, where each dimension in the 256-dimensional vector encodes the presence/absence of a character (out of an alphabet of 256 frequent characters).

+
library(keras)
+
+tweet_a <- layer_input(shape = c(140, 256))
+tweet_b <- layer_input(shape = c(140, 256))
+

To share a layer across different inputs, simply instantiate the layer once, then call it on as many inputs as you want:

+
# This layer can take as input a matrix and will return a vector of size 64
+shared_lstm <- layer_lstm(units = 64)
+
+# When we reuse the same layer instance multiple times, the weights of the layer are also
+# being reused (it is effectively *the same* layer)
+encoded_a <- tweet_a %>% shared_lstm
+encoded_b <- tweet_b %>% shared_lstm
+
+# We can then concatenate the two vectors and add a logistic regression on top
+predictions <- layer_concatenate(c(encoded_a, encoded_b), axis=-1) %>% 
+  layer_dense(units = 1, activation = 'sigmoid')
+
+# We define a trainable model linking the tweet inputs to the predictions
+model <- keras_model(inputs = c(tweet_a, tweet_b), outputs = predictions)
+
+model %>% compile(
+  optimizer = 'rmsprop',
+  loss = 'binary_crossentropy',
+  metrics = c('accuracy')
+)
+
+model %>% fit(list(data_a, data_b), labels, epochs = 10)
+
+
+

+The concept of layer “node”

+

Whenever you are calling a layer on some input, you are creating a new tensor (the output of the layer), and you are adding a “node” to the layer, linking the input tensor to the output tensor. When you are calling the same layer multiple times, that layer owns multiple nodes indexed as 1, 2, 3…

+

You can obtain the output tensor of a layer via layer$output, or its output shape via layer$output_shape. But what if a layer is connected to multiple inputs?

+

As long as a layer is only connected to one input, there is no confusion, and $output will return the one output of the layer:

+
a <- layer_input(shape = c(140, 256))
+
+lstm <- layer_lstm(units = 32)
+
+encoded_a <- a %>% lstm
+
+lstm$output
+

Not so if the layer has multiple inputs:

+
a <- layer_input(shape = c(140, 256))
+b <- layer_input(shape = c(140, 256))
+
+lstm <- layer_lstm(units = 32)
+
+encoded_a <- a %>% lstm
+encoded_b <- b %>% lstm
+
+lstm$output
+
AttributeError: Layer lstm_4 has multiple inbound nodes, hence the notion of "layer output" is ill-defined. Use `get_output_at(node_index)` instead.
+

Okay then. The following works:

+
get_output_at(lstm, 1)
+get_output_at(lstm, 2)
+

Simple enough, right?

+

The same is true for the properties input_shape and output_shape: as long as the layer has only one node, or as long as all nodes have the same input/output shape, then the notion of “layer output/input shape” is well defined, and that one shape will be returned by layer$output_shape/layer$input_shape. But if, for instance, you apply the same layer_conv_2d() layer to an input of shape (32, 32, 3), and then to an input of shape (64, 64, 3), the layer will have multiple input/output shapes, and you will have to fetch them by specifying the index of the node they belong to:

+
a <- layer_input(shape = c(32, 32, 3))
+b <- layer_input(shape = c(64, 64, 3))
+
+conv <- layer_conv_2d(filters = 16, kernel_size = c(3,3), padding = 'same')
+
+conved_a <- a %>% conv
+
+# only one input so far, the following will work
+conv$input_shape
+
+conved_b <- b %>% conv
+# now the `$input_shape` property wouldn't work, but this does:
+get_input_shape_at(conv, 1)
+get_input_shape_at(conv, 2) 
+
+
+

+More examples

+

Code examples are still the best way to get started, so here are a few more.

+
+

+Inception module

+

For more information about the Inception architecture, see Going Deeper with Convolutions.

+
library(keras)
+
+input_img <- layer_input(shape = c(256, 256, 3))
+
+tower_1 <- input_img %>% 
+  layer_conv_2d(filters = 64, kernel_size = c(1, 1), padding='same', activation='relu') %>% 
+  layer_conv_2d(filters = 64, kernel_size = c(3, 3), padding='same', activation='relu')
+
+tower_2 <- input_img %>% 
+  layer_conv_2d(filters = 64, kernel_size = c(1, 1), padding='same', activation='relu') %>% 
+  layer_conv_2d(filters = 64, kernel_size = c(5, 5), padding='same', activation='relu')
+
+tower_3 <- input_img %>% 
+  layer_max_pooling_2d(pool_size = c(3, 3), strides = c(1, 1), padding = 'same') %>% 
+  layer_conv_2d(filters = 64, kernel_size = c(1, 1), padding='same', activation='relu')
+
+output <- layer_concatenate(c(tower_1, tower_2, tower_3), axis = 1)
+
+
+

+Residual connection on a convolution layer

+

For more information about residual networks, see Deep Residual Learning for Image Recognition.

+
# input tensor for a 3-channel 256x256 image
+x <- layer_input(shape = c(256, 256, 3))
+# 3x3 conv with 3 output channels (same as input channels)
+y <- x %>% layer_conv_2d(filters = 3, kernel_size =c(3, 3), padding = 'same')
+# this returns x + y.
+z <- layer_add(c(x, y))
+
+
+

+Shared vision model

+

This model re-uses the same image-processing module on two inputs, to classify whether two MNIST digits are the same digit or different digits.

+
# First, define the vision model
+digit_input <- layer_input(shape = c(27, 27, 1))
+out <- digit_input %>% 
+  layer_conv_2d(filters = 64, kernel_size = c(3, 3)) %>% 
+  layer_conv_2d(filters = 64, kernel_size = c(3, 3)) %>% 
+  layer_max_pooling_2d(pool_size = c(2, 2)) %>% 
+  layer_flatten()
+
+vision_model <- keras_model(digit_input, out)
+
+# Then define the tell-digits-apart model
+digit_a <- layer_input(shape = c(27, 27, 1))
+digit_b <- layer_input(shape = c(27, 27, 1))
+
+# The vision model will be shared, weights and all
+out_a <- digit_a %>% vision_model
+out_b <- digit_b %>% vision_model
+
+out <- layer_concatenate(c(out_a, out_b)) %>% 
+  layer_dense(units = 1, activation = 'sigmoid')
+
+classification_model <- keras_model(inputs = c(digit_a, digit_b), out)
+
+
+

+Visual question answering model

+

This model can select the correct one-word answer when asked a natural-language question about a picture.

+

It works by encoding the question into a vector, encoding the image into a vector, concatenating the two, and training on top a logistic regression over some vocabulary of potential answers.

+
# First, let's define a vision model using a Sequential model.
+# This model will encode an image into a vector.
+vision_model <- keras_model_sequential() 
+vision_model %>% 
+  layer_conv_2d(filters = 64, kernel_size = c(3, 3), activation = 'relu', padding = 'same',
+                input_shape = c(224, 224, 3)) %>% 
+  layer_conv_2d(filters = 64, kernel_size = c(3, 3), activation = 'relu') %>% 
+  layer_max_pooling_2d(pool_size = c(2, 2)) %>% 
+  layer_conv_2d(filters = 128, kernel_size = c(3, 3), activation = 'relu', padding = 'same') %>% 
+  layer_conv_2d(filters = 128, kernel_size = c(3, 3), activation = 'relu') %>% 
+  layer_max_pooling_2d(pool_size = c(2, 2)) %>% 
+  layer_conv_2d(filters = 256, kernel_size = c(3, 3), activation = 'relu', padding = 'same') %>% 
+  layer_conv_2d(filters = 256, kernel_size = c(3, 3), activation = 'relu') %>% 
+  layer_conv_2d(filters = 256, kernel_size = c(3, 3), activation = 'relu') %>% 
+  layer_max_pooling_2d(pool_size = c(2, 2)) %>% 
+  layer_flatten()
+
+# Now let's get a tensor with the output of our vision model:
+image_input <- layer_input(shape = c(224, 224, 3))
+encoded_image <- image_input %>% vision_model
+
+# Next, let's define a language model to encode the question into a vector.
+# Each question will be at most 100 words long,
+# and we will index words as integers from 1 to 9999.
+question_input <- layer_input(shape = c(100), dtype = 'int32')
+encoded_question <- question_input %>% 
+  layer_embedding(input_dim = 10000, output_dim = 256, input_length = 100) %>% 
+  layer_lstm(units = 256)
+
+# Let's concatenate the question vector and the image vector then
+# train a logistic regression over 1000 words on top
+output <- layer_concatenate(c(encoded_question, encoded_image)) %>% 
+  layer_dense(units = 1000, activation='softmax')
+
+# This is our final model:
+vqa_model <- keras_model(inputs = c(image_input, question_input), outputs = output)
+
+
+

+Video question answering model

+

Now that we have trained our image QA model, we can quickly turn it into a video QA model. With appropriate training, you will be able to show it a short video (e.g. 100-frame human action) and ask a natural language question about the video (e.g. “what sport is the boy playing?” -> “football”).

+
video_input <- layer_input(shape = c(100, 224, 224, 3))
+
+# This is our video encoded via the previously trained vision_model (weights are reused)
+encoded_video <- video_input %>% 
+  time_distributed(vision_model) %>% 
+  layer_lstm(units = 256)
+
+# This is a model-level representation of the question encoder, reusing the same weights as before:
+question_encoder <- keras_model(inputs = question_input, outputs = encoded_question)
+
+# Let's use it to encode the question:
+video_question_input <- layer_input(shape = c(100), dtype = 'int32')
+encoded_video_question <- video_question_input %>% question_encoder
+
+# And this is our video question answering model:
+output <- layer_concatenate(c(encoded_video, encoded_video_question)) %>% 
+  layer_dense(units = 1000, activation = 'softmax')
+
+video_qa_model <- keras_model(inputs= c(video_input, video_question_input), outputs = output)
+
+
+
+
+ + + +
+ + + +
+ + + diff --git a/website/articles/getting_started.html b/website/articles/getting_started.html new file mode 100644 index 000000000..32f782fbe --- /dev/null +++ b/website/articles/getting_started.html @@ -0,0 +1,269 @@ + + + + + + + +Getting Started with Keras • keras + + + + + + + +
+
+ + + +
+
+ + + + +
+
+

+Overview

+

Keras is a high-level neural networks API developed with a focus on enabling fast experimentation. Being able to go from idea to result with the least possible delay is key to doing good research. Keras has the following key features:

+
    +
  • Allows the same code to run on CPU or on GPU, seamlessly.

  • +
  • User-friendly API which makes it easy to quickly prototype deep learning models.

  • +
  • Built-in support for convolutional networks (for computer vision), recurrent networks (for sequence processing), and any combination of both.

  • +
  • Supports arbitrary network architectures: multi-input or multi-output models, layer sharing, model sharing, etc. This means that Keras is appropriate for building essentially any deep learning model, from a memory network to a neural Turing machine.

  • +
  • Is capable of running on top of multiple back-ends including TensorFlow, CNTK, or Theano.

  • +
+

This website provides documentation for the R interface to Keras. See the main Keras website at https://keras.io for additional information on the project.

+
+
+

+Installation

+

First, install the keras R package from CRAN as follows:

+
install.packages("keras")
+

The Keras R interface uses the TensorFlow backend engine by default. To install both the core Keras library as well as the TensorFlow backend, use the install_keras() function:

+
library(keras)
+install_keras()
+

This will provide you with default installations of Keras and TensorFlow. If you want to do a more customized installation of TensorFlow (including installing a version that takes advantage of NVIDIA GPUs if you have the correct CUDA libraries installed), see the documentation for install_keras().
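
For example, a GPU installation might look like this (assuming your system satisfies the CUDA prerequisites):

+
install_keras(tensorflow = "gpu")
+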

+
+
+

+MNIST Example

+

We can learn the basics of Keras by walking through a simple example: recognizing handwritten digits from the MNIST dataset. MNIST consists of 28 x 28 grayscale images of handwritten digits like these:

+

+

The dataset also includes labels for each image, telling us which digit it is. For example, the labels for the above images are 5, 0, 4, and 1.

+
+

+Preparing the Data

+

The MNIST dataset is included with Keras and can be accessed using the dataset_mnist() function. Here we load the dataset then create variables for our test and training data:

+
library(keras)
+mnist <- dataset_mnist()
+x_train <- mnist$train$x
+y_train <- mnist$train$y
+x_test <- mnist$test$x
+y_test <- mnist$test$y
+

The x data is a 3-d array (images, width, height) of grayscale values. To prepare the data for training we convert the 3-d arrays into matrices by reshaping width and height into a single dimension (28x28 images are flattened into length 784 vectors). Then, we convert the grayscale values from integers ranging between 0 and 255 into floating point values ranging between 0 and 1:

+
# reshape
+dim(x_train) <- c(nrow(x_train), 784)
+dim(x_test) <- c(nrow(x_test), 784)
+# rescale
+x_train <- x_train / 255
+x_test <- x_test / 255
+

The y data is an integer vector with values ranging from 0 to 9. To prepare this data for training we one-hot encode the vectors into binary class matrices using the Keras to_categorical() function:

+
y_train <- to_categorical(y_train, 10)
+y_test <- to_categorical(y_test, 10)
+
+
+

+Defining the Model

+

The core data structure of Keras is a model, a way to organize layers. The simplest type of model is the Sequential model, a linear stack of layers.

+

We begin by creating a sequential model and then adding layers using the pipe (%>%) operator:

+
library(keras)
+model <- keras_model_sequential() 
+model %>% 
+  layer_dense(units = 256, activation = 'relu', input_shape = c(784)) %>% 
+  layer_dropout(rate = 0.4) %>% 
+  layer_dense(units = 128, activation = 'relu') %>%
+  layer_dropout(rate = 0.3) %>%
+  layer_dense(units = 10, activation = 'softmax')
+

The input_shape argument to the first layer specifies the shape of the input data (a length 784 numeric vector representing a grayscale image). The final layer outputs a length 10 numeric vector (probabilities for each digit) using a softmax activation function.

+

Use the summary() function to print the details of the model:

+
summary(model)
+
Model
+________________________________________________________________________________
+Layer (type)                        Output Shape                    Param #     
+================================================================================
+dense_1 (Dense)                     (None, 256)                     200960      
+________________________________________________________________________________
+dropout_1 (Dropout)                 (None, 256)                     0           
+________________________________________________________________________________
+dense_2 (Dense)                     (None, 128)                     32896       
+________________________________________________________________________________
+dropout_2 (Dropout)                 (None, 128)                     0           
+________________________________________________________________________________
+dense_3 (Dense)                     (None, 10)                      1290        
+================================================================================
+Total params: 235,146
+Trainable params: 235,146
+Non-trainable params: 0
+________________________________________________________________________________
+

Next, compile the model with appropriate loss function, optimizer, and metrics:

+
model %>% compile(
+  loss = 'categorical_crossentropy',
+  optimizer = optimizer_rmsprop(),
+  metrics = c('accuracy')
+)
+
+
+

+Training and Evaluation

+

Use the fit() function to train the model for 30 epochs using batches of 128 images:

+
history <- model %>% fit(
+  x_train, y_train, 
+  epochs = 30, batch_size = 128, 
+  validation_split = 0.2
+)
+

The history object returned by fit() includes loss and accuracy metrics which we can plot:

+
plot(history)
+
+ +
+

Evaluate the model’s performance on the test data:

+
loss_and_metrics <- model %>% evaluate(x_test, y_test)
+

Generate predictions on new data:

+
classes <- model %>% predict_classes(x_test)
+

Keras provides a vocabulary for building deep learning models that is simple, elegant, and intuitive. Building a question answering system, an image classification model, a neural Turing machine, or any other model is just as straightforward.

+
+
+
+

+Learning More

+

To learn more about Keras, see these other package vignettes:

+ +

The examples demonstrate more advanced models including transfer learning, variational auto-encoding, question-answering with memory networks, text generation with stacked LSTMs, etc.

+

The function reference includes detailed information on all of the functions available in the package.

+
+
+
+ + + +
+ + + +
+ + + diff --git a/website/articles/images/MNIST.png b/website/articles/images/MNIST.png new file mode 100644 index 000000000..0a558b744 Binary files /dev/null and b/website/articles/images/MNIST.png differ diff --git a/website/articles/images/multi-input-multi-output-graph.png b/website/articles/images/multi-input-multi-output-graph.png new file mode 100644 index 000000000..179f21ed4 Binary files /dev/null and b/website/articles/images/multi-input-multi-output-graph.png differ diff --git a/website/articles/images/regular_stacked_lstm.png b/website/articles/images/regular_stacked_lstm.png new file mode 100644 index 000000000..c59ce194a Binary files /dev/null and b/website/articles/images/regular_stacked_lstm.png differ diff --git a/website/articles/images/tensorboard.png b/website/articles/images/tensorboard.png new file mode 100644 index 000000000..af2c3db85 Binary files /dev/null and b/website/articles/images/tensorboard.png differ diff --git a/website/articles/images/tensorboard_compare.png b/website/articles/images/tensorboard_compare.png new file mode 100644 index 000000000..7c7845858 Binary files /dev/null and b/website/articles/images/tensorboard_compare.png differ diff --git a/website/articles/images/training_history_ggplot2.png b/website/articles/images/training_history_ggplot2.png new file mode 100644 index 000000000..bd87f365d Binary files /dev/null and b/website/articles/images/training_history_ggplot2.png differ diff --git a/website/articles/sequential_model.html b/website/articles/sequential_model.html new file mode 100644 index 000000000..1e307996e --- /dev/null +++ b/website/articles/sequential_model.html @@ -0,0 +1,554 @@ + + + + + + + +Guide to the Sequential Model • keras + + + + + + + +
+
+ + + +
+
+ + + + +
+
+

+Defining a Model

+

The sequential model is a linear stack of layers.

+

You create a sequential model by calling the keras_model_sequential() function and then a series of layer functions:

+
library(keras)
+
+model <- keras_model_sequential() 
+model %>% 
+  layer_dense(units = 32, input_shape = c(784)) %>% 
+  layer_activation('relu') %>% 
+  layer_dense(units = 10) %>% 
+  layer_activation('softmax')
+

Note that Keras objects are modified in place, which is why it’s not necessary to assign model back after the layers are added.

+

Print a summary of the model’s structure using the summary() function:

+
summary(model)
+
Model
+________________________________________________________________________________
+Layer (type)                        Output Shape                    Param #     
+================================================================================
+dense_1 (Dense)                     (None, 32)                      25120       
+________________________________________________________________________________
+activation_1 (Activation)           (None, 32)                      0           
+________________________________________________________________________________
+dense_2 (Dense)                     (None, 10)                      330         
+________________________________________________________________________________
+activation_2 (Activation)           (None, 10)                      0           
+================================================================================
+Total params: 25,450
+Trainable params: 25,450
+Non-trainable params: 0
+________________________________________________________________________________
+
+

+Input Shapes

+

The model needs to know what input shape it should expect. For this reason, the first layer in a sequential model (and only the first, because following layers can do automatic shape inference) needs to receive information about its input shape.

+

As illustrated in the example above, this is done by passing an input_shape argument to the first layer. This is a list of integers or NULL entries, where NULL indicates that any positive integer may be expected. In input_shape, the batch dimension is not included.

+

If you ever need to specify a fixed batch size for your inputs (this is useful for stateful recurrent networks), you can pass a batch_size argument to a layer. If you pass both batch_size=32 and input_shape=c(6, 8) to a layer, it will then expect every batch of inputs to have the batch shape (32, 6, 8).
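
For example, a minimal sketch of a layer that expects every batch of inputs to have the batch shape (32, 6, 8):

+
model <- keras_model_sequential() 
+model %>% 
+  layer_dense(units = 32, batch_size = 32, input_shape = c(6, 8))
+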

+
+
+
+

+Compilation

+

Before training a model, you need to configure the learning process, which is done via the compile() function. It receives three arguments:

+
    +
  • An optimizer. This could be the string identifier of an existing optimizer (e.g. “rmsprop” or “adagrad”) or a call to an optimizer function (e.g. optimizer_sgd()).

  • +
  • A loss function. This is the objective that the model will try to minimize. It can be the string identifier of an existing loss function (e.g. “categorical_crossentropy” or “mse”) or a call to a loss function (e.g. loss_mean_squared_error()).

  • +
  • A list of metrics. For any classification problem you will want to set this to metrics = c('accuracy'). A metric could be the string identifier of an existing metric or a call to a metric function (e.g. metric_binary_crossentropy()).

  • +
+

Here’s the definition of a model along with the compilation step (the compile() function has arguments appropriate for a multi-class classification problem):

+
# For a multi-class classification problem
+model <- keras_model_sequential() 
+model %>% 
+  layer_dense(units = 32, input_shape = c(784)) %>% 
+  layer_activation('relu') %>% 
+  layer_dense(units = 10) %>% 
+  layer_activation('softmax')
+
+model %>% compile(
+  optimizer = 'rmsprop',
+  loss = 'categorical_crossentropy',
+  metrics = c('accuracy')
+)
+

Here’s what compilation might look like for a mean squared error regression problem:

+
model %>% compile(
+  optimizer = optimizer_rmsprop(lr = 0.002),
+  loss = 'mse'
+)
+

Here’s compilation for a binary classification problem:

+
model %>% compile( 
+  optimizer = optimizer_rmsprop(),
+  loss = loss_binary_crossentropy,
+  metrics = metric_binary_accuracy
+)
+

Here’s compilation with a custom metric:

+
# create metric using backend tensor functions
+K <- backend()
+metric_mean_pred <- function(y_true, y_pred) {
+  K$mean(y_pred) 
+}
+
+model %>% compile( 
+  optimizer = optimizer_rmsprop(),
+  loss = loss_binary_crossentropy,
+  metrics = c('accuracy', 
+              'mean_pred' = metric_mean_pred)
+)
+
+
+

+Training

+

Keras models are trained on R matrices or higher dimensional arrays of input data and labels. For training a model, you will typically use the fit() function.

+

Here’s a single-input model with 2 classes (binary classification):

+
# create model
+model <- keras_model_sequential()
+
+# add layers and compile the model
+model %>% 
+  layer_dense(units = 32, activation = 'relu', input_shape = c(100)) %>% 
+  layer_dense(units = 1, activation = 'sigmoid') %>% 
+  compile(
+    optimizer = 'rmsprop',
+    loss = 'binary_crossentropy',
+    metrics = c('accuracy')
+  )
+
+# Generate dummy data
+data <- matrix(runif(1000*100), nrow = 1000, ncol = 100)
+labels <- matrix(round(runif(1000, min = 0, max = 1)), nrow = 1000, ncol = 1)
+
+# Train the model, iterating on the data in batches of 32 samples
+model %>% fit(data, labels, epochs=10, batch_size=32)
+

Here’s a single-input model with 10 classes (categorical classification):

+
# create model
+model <- keras_model_sequential()
+
+# define and compile the model
+model %>% 
+  layer_dense(units = 32, activation = 'relu', input_shape = c(100)) %>% 
+  layer_dense(units = 10, activation = 'softmax') %>% 
+  compile(
+    optimizer = 'rmsprop',
+    loss = 'categorical_crossentropy',
+    metrics = c('accuracy')
+  )
+
+# Generate dummy data
+data <- matrix(runif(1000*100), nrow = 1000, ncol = 100)
+labels <- matrix(round(runif(1000, min = 0, max = 9)), nrow = 1000, ncol = 1)
+
+# Convert labels to categorical one-hot encoding
+one_hot_labels <- to_categorical(labels, num_classes = 10)
+
+# Train the model, iterating on the data in batches of 32 samples
+model %>% fit(data, one_hot_labels, epochs=10, batch_size=32)
+
+
+

+Examples

+

Here are a few examples to get you started!

+

On the examples page you will also find example models for real datasets:

+ +

Some additional examples are provided below.

+
+

+Multilayer Perceptron (MLP) for multi-class softmax classification

+
library(keras)
+
+# generate dummy data
+x_train <- matrix(runif(1000*20), nrow = 1000, ncol = 20)
+
+y_train <- runif(1000, min = 0, max = 9) %>% 
+  round() %>%
+  matrix(nrow = 1000, ncol = 1) %>% 
+  to_categorical(num_classes = 10)
+
+x_test  <- matrix(runif(100*20), nrow = 100, ncol = 20)
+
+y_test <- runif(100, min = 0, max = 9) %>% 
+  round() %>%
+  matrix(nrow = 100, ncol = 1) %>% 
+  to_categorical(num_classes = 10)
+
+# create model
+model <- keras_model_sequential()
+
+# define and compile the model
+model %>% 
+  layer_dense(units = 64, activation = 'relu', input_shape = c(20)) %>% 
+  layer_dropout(rate = 0.5) %>% 
+  layer_dense(units = 64, activation = 'relu') %>% 
+  layer_dropout(rate = 0.5) %>% 
+  layer_dense(units = 10, activation = 'softmax') %>% 
+  compile(
+    loss = 'categorical_crossentropy',
+    optimizer = optimizer_sgd(lr = 0.01, decay = 1e-6, momentum = 0.9, nesterov = TRUE),
+    metrics = c('accuracy')     
+  )
+
+# train
+model %>% fit(x_train, y_train, epochs = 20, batch_size = 128)
+
+# evaluate
+score <- model %>% evaluate(x_test, y_test, batch_size = 128)
+
+
+

+MLP for binary classification

+
library(keras)
+
+# generate dummy data
+x_train <- matrix(runif(1000*20), nrow = 1000, ncol = 20)
+y_train <- matrix(round(runif(1000, min = 0, max = 1)), nrow = 1000, ncol = 1)
+x_test <- matrix(runif(100*20), nrow = 100, ncol = 20)
+y_test <- matrix(round(runif(100, min = 0, max = 1)), nrow = 100, ncol = 1)
+
+# create model
+model <- keras_model_sequential()
+
+# define and compile the model
+model %>% 
+  layer_dense(units = 64, activation = 'relu', input_shape = c(20)) %>% 
+  layer_dropout(rate = 0.5) %>% 
+  layer_dense(units = 64, activation = 'relu') %>% 
+  layer_dropout(rate = 0.5) %>% 
+  layer_dense(units = 1, activation = 'sigmoid') %>% 
+  compile(
+    loss = 'binary_crossentropy',
+    optimizer = 'rmsprop',
+    metrics = c('accuracy')
+  )
+
+# train 
+model %>% fit(x_train, y_train, epochs = 20, batch_size = 128)
+
+# evaluate
+score = model %>% evaluate(x_test, y_test, batch_size=128)
+
+
+

+VGG-like convnet

+
library(keras)
+
+# generate dummy data
+x_train <- array(runif(100 * 100 * 100 * 3), dim = c(100, 100, 100, 3))
+
+y_train <- runif(100, min = 0, max = 9) %>% 
+  round() %>%
+  matrix(nrow = 100, ncol = 1) %>% 
+  to_categorical(num_classes = 10)
+
+x_test <- array(runif(20 * 100 * 100 * 3), dim = c(20, 100, 100, 3))
+
+y_test <- runif(20, min = 0, max = 9) %>% 
+  round() %>%
+  matrix(nrow = 20, ncol = 1) %>% 
+  to_categorical(num_classes = 10)
+
+# create model
+model <- keras_model_sequential()
+
+# define and compile model
+# input: 100x100 images with 3 channels -> (100, 100, 3) tensors.
+# this applies 32 convolution filters of size 3x3 each.
+model %>% 
+  layer_conv_2d(filters = 32, kernel_size = c(3,3), activation = 'relu', 
+                input_shape = c(100,100,3)) %>% 
+  layer_conv_2d(filters = 32, kernel_size = c(3,3), activation = 'relu') %>% 
+  layer_max_pooling_2d(pool_size = c(2,2)) %>% 
+  layer_dropout(rate = 0.25) %>% 
+  layer_conv_2d(filters = 64, kernel_size = c(3,3), activation = 'relu') %>% 
+  layer_conv_2d(filters = 64, kernel_size = c(3,3), activation = 'relu') %>% 
+  layer_max_pooling_2d(pool_size = c(2,2)) %>% 
+  layer_dropout(rate = 0.25) %>% 
+  layer_flatten() %>% 
+  layer_dense(units = 256, activation = 'relu') %>% 
+  layer_dropout(rate = 0.25) %>% 
+  layer_dense(units = 10, activation = 'softmax') %>% 
+  compile(
+    loss = 'categorical_crossentropy', 
+    optimizer = optimizer_sgd(lr = 0.01, decay = 1e-6, momentum = 0.9, nesterov = TRUE)
+  )
+  
+# train
+model %>% fit(x_train, y_train, batch_size = 32, epochs = 10)
+
+# evaluate
+score <- model %>% evaluate(x_test, y_test, batch_size = 32)
+
+
+

+Sequence classification with LSTM

+
model <- keras_model_sequential() 
+model %>% 
+  layer_embedding(input_dim = max_features, output_dim = 256) %>% 
+  layer_lstm(units = 128) %>% 
+  layer_dropout(rate = 0.5) %>% 
+  layer_dense(units = 1, activation = 'sigmoid') %>% 
+  compile(
+    loss = 'binary_crossentropy',
+    optimizer = 'rmsprop',
+    metrics = c('accuracy')
+  )
+
+model %>% fit(x_train, y_train, batch_size = 16, epochs = 10)
+score <- model %>% evaluate(x_test, y_test, batch_size = 16)
+
+
+

+Sequence classification with 1D convolutions

+
model <- keras_model_sequential()
+model %>% 
+  layer_conv_1d(filters = 64, kernel_size = 3, activation = 'relu',
+                input_shape = c(seq_length, 100)) %>% 
+  layer_conv_1d(filters = 64, kernel_size = 3, activation = 'relu') %>% 
+  layer_max_pooling_1d(pool_size = 3) %>% 
+  layer_conv_1d(filters = 128, kernel_size = 3, activation = 'relu') %>% 
+  layer_conv_1d(filters = 128, kernel_size = 3, activation = 'relu') %>% 
+  layer_global_average_pooling_1d() %>% 
+  layer_dropout(rate = 0.5) %>% 
+  layer_dense(units = 1, activation = 'sigmoid') %>% 
+  compile(
+    loss = 'binary_crossentropy',
+    optimizer = 'rmsprop',
+    metrics = c('accuracy')
+  )
+
+model %>% fit(x_train, y_train, batch_size = 16, epochs = 10)
+score <- model %>% evaluate(x_test, y_test, batch_size = 16)
+
+
+

+Stacked LSTM for sequence classification

+

In this model, we stack 3 LSTM layers on top of each other, making the model capable of learning higher-level temporal representations.

+

The first two LSTMs return their full output sequences, but the last one only returns the last step in its output sequence, thus dropping the temporal dimension (i.e. converting the input sequence into a single vector).

+

stacked LSTM

+
library(keras)
+
+# constants
+data_dim <- 16
+timesteps <- 8
+num_classes <- 10
+
+# define and compile model
+# expected input data shape: (batch_size, timesteps, data_dim)
+model <- keras_model_sequential() 
+model %>% 
+  layer_lstm(units = 32, return_sequences = TRUE, input_shape = c(timesteps, data_dim)) %>% 
+  layer_lstm(units = 32, return_sequences = TRUE) %>% 
+  layer_lstm(units = 32) %>% # return a single vector dimension 32
+  layer_dense(units = 10, activation = 'softmax') %>% 
+  compile(
+    loss = 'categorical_crossentropy',
+    optimizer = 'rmsprop',
+    metrics = c('accuracy')
+  )
+  
+# generate dummy training data
+x_train <- array(runif(1000 * timesteps * data_dim), dim = c(1000, timesteps, data_dim))
+y_train <- matrix(runif(1000 * num_classes), nrow = 1000, ncol = num_classes)
+
+# generate dummy validation data
+x_val <- array(runif(100 * timesteps * data_dim), dim = c(100, timesteps, data_dim))
+y_val <- matrix(runif(100 * num_classes), nrow = 100, ncol = num_classes)
+
+# train
+model %>% fit( 
+  x_train, y_train, batch_size = 64, epochs = 5, validation_data = list(x_val, y_val)
+)
+
+
+

+Same stacked LSTM model, rendered “stateful”

+

A stateful recurrent model is one for which the internal states (memories) obtained after processing a batch of samples are reused as initial states for the samples of the next batch. This allows the model to process longer sequences while keeping computational complexity manageable.

+

You can read more about stateful RNNs in the FAQ.

+
library(keras)
+
+# constants
+data_dim <- 16
+timesteps <- 8
+num_classes <- 10
+batch_size <- 32
+
+# define and compile model
+# Expected input batch shape: (batch_size, timesteps, data_dim)
+# Note that we have to provide the full batch_input_shape since the network is stateful.
+# the sample of index i in batch k is the follow-up for the sample i in batch k-1.
+model <- keras_model_sequential()
+model %>% 
+  layer_lstm(units = 32, return_sequences = TRUE, stateful = TRUE,
+             batch_input_shape = c(batch_size, timesteps, data_dim)) %>% 
+  layer_lstm(units = 32, return_sequences = TRUE, stateful = TRUE) %>% 
+  layer_lstm(units = 32, stateful = TRUE) %>% 
+  layer_dense(units = 10, activation = 'softmax') %>% 
+  compile(
+    loss = 'categorical_crossentropy',
+    optimizer = 'rmsprop',
+    metrics = c('accuracy')
+  )
+  
+# generate dummy training data
+x_train <- array(runif( (batch_size * 10) * timesteps * data_dim), 
+                 dim = c(batch_size * 10, timesteps, data_dim))
+y_train <- matrix(runif( (batch_size * 10) * num_classes), 
+                  nrow = batch_size * 10, ncol = num_classes)
+
+# generate dummy validation data
+x_val <- array(runif( (batch_size * 3) * timesteps * data_dim), 
+               dim = c(batch_size * 3, timesteps, data_dim))
+y_val <- matrix(runif( (batch_size * 3) * num_classes), 
+                nrow = batch_size * 3, ncol = num_classes)
+
+# train
+model %>% fit( 
+  x_train, 
+  y_train, 
+  batch_size = batch_size, 
+  epochs = 5, 
+  shuffle = FALSE,
+  validation_data = list(x_val, y_val)
+)
+
+
+
+
+ + + +
+ + + +
+ + + diff --git a/website/articles/training_callbacks.html b/website/articles/training_callbacks.html new file mode 100644 index 000000000..e88417a1a --- /dev/null +++ b/website/articles/training_callbacks.html @@ -0,0 +1,361 @@ + + + + + + + +Training Callbacks • keras + + + + + + + +
+
+ + + +
+
+ + + + +
+
+

+Overview

+

A callback is a set of functions to be applied at given stages of the training procedure. You can use callbacks to get a view on internal states and statistics of the model during training. You can pass a list of callbacks (as the keyword argument callbacks) to the fit() function. The relevant methods of the callbacks will then be called at each stage of the training.

+

For example:

+
library(keras)
+
+# generate dummy training data
+data <- matrix(rexp(1000*784), nrow = 1000, ncol = 784)
+labels <- matrix(round(runif(1000*10, min = 0, max = 9)), nrow = 1000, ncol = 10)
+
+# create model
+model <- keras_model_sequential() 
+
+# add layers and compile
+model %>%
+  layer_dense(32, input_shape = c(784)) %>%
+  layer_activation('relu') %>%
+  layer_dense(10) %>%
+  layer_activation('softmax') %>% 
+  compile(
+    loss='binary_crossentropy',
+    optimizer = optimizer_sgd(),
+    metrics='accuracy'
+  )
+  
+# fit with callbacks
+model %>% fit(data, labels, callbacks = list(
+  callback_model_checkpoint("checkpoints.h5"),
+  callback_reduce_lr_on_plateau(monitor = "val_loss", factor = 0.1)
+))
+
+
+

+Built-in Callbacks

+

The following built-in callbacks are available as part of Keras:

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+callback_progbar_logger() + +

+Callback that prints metrics to stdout. +

+
+callback_model_checkpoint() + +

+Save the model after every epoch. +

+
+callback_early_stopping() + +

+Stop training when a monitored quantity has stopped improving. +

+
+callback_remote_monitor() + +

+Callback used to stream events to a server. +

+
+callback_learning_rate_scheduler() + +

+Learning rate scheduler. +

+
+callback_tensorboard() + +

+TensorBoard basic visualizations +

+
+callback_reduce_lr_on_plateau() + +

+Reduce learning rate when a metric has stopped improving. +

+
+callback_csv_logger() + +

+Callback that streams epoch results to a csv file +

+
+callback_lambda() + +

+Create a custom callback +

+
+
+
+

+Custom Callbacks

+

You can create a custom callback by creating a new R6 class that inherits from the KerasCallback class.

+

Here’s a simple example saving a list of losses over each batch during training:

+
library(keras)
+
+# define custom callback class
+LossHistory <- R6::R6Class("LossHistory",
+  inherit = KerasCallback,
+  
+  public = list(
+    
+    losses = NULL,
+     
+    on_batch_end = function(batch, logs = list()) {
+      self$losses <- c(self$losses, logs[["loss"]])
+    }
+))
+
+# define model
+model <- keras_model_sequential() 
+
+# add layers and compile
+model %>% 
+  layer_dense(units = 10, input_shape = c(784)) %>% 
+  layer_activation(activation = 'softmax') %>% 
+  compile(
+    loss = 'categorical_crossentropy', 
+    optimizer = 'rmsprop'
+  )
+
+# create history callback object and use it during training
+history <- LossHistory$new()
+model %>% fit(
+  X_train, Y_train,
+  batch_size=128, epochs=20, verbose=0,
+  callbacks= list(history)
+)
+
+# print the accumulated losses
+history$losses
+
[1] 0.6604760 0.3547246 0.2595316 0.2590170 ...
+
+

+Fields

+

Custom callback objects have access to the current model and its training parameters via the following fields:

+
+
self$params
+
+

Named list with training parameters (e.g. verbosity, batch size, number of epochs…).

+
+
self$model
+
+

Reference to the Keras model being trained.

+
+
+
+
+

+Methods

+

Custom callback objects can implement one or more of the following methods:

+
+
on_epoch_begin(epoch, logs)
+
+

Called at the beginning of each epoch.

+
+
on_epoch_end(epoch, logs)
+
+

Called at the end of each epoch.

+
+
on_batch_begin(batch, logs)
+
+

Called at the beginning of each batch.

+
+
on_batch_end(batch, logs)
+
+

Called at the end of each batch.

+
+
on_train_begin(logs)
+
+

Called at the beginning of training.

+
+
on_train_end(logs)
+
+

Called at the end of training.

+
+
+
+
+
+
+ + + +
+ + + +
+ + + diff --git a/website/articles/training_visualization.html b/website/articles/training_visualization.html new file mode 100644 index 000000000..9b5505f40 --- /dev/null +++ b/website/articles/training_visualization.html @@ -0,0 +1,343 @@ + + + + + + + +Training Visualization • keras + + + + + + + +
+
+ + + +
+
+ + + + +
+
+

+Overview

+

There are a number of tools available for visualizing the training of Keras models, including:

+
    +
  1. A plot method for the Keras training history returned from fit().

  2. +
  3. Real time visualization of training metrics within the RStudio IDE.

  4. +
  5. Integration with the TensorBoard visualization tool included with TensorFlow. Beyond just training metrics, TensorBoard has a wide variety of other visualizations available including the underlying TensorFlow graph, gradient histograms, model weights, and more. TensorBoard also enables you to compare metrics across multiple training runs.

  6. +
+

Each of these tools is described in more detail below.

+
+
+

+Plotting History

+

The Keras fit() method returns an R object containing the training history, including the value of metrics at the end of each epoch. You can plot the training metrics by epoch using the plot() method.

+

For example, here we compile and fit a model with the “accuracy” metric:

+
model %>% compile(
+  loss = 'categorical_crossentropy',
+  optimizer = optimizer_rmsprop(),
+  metrics = c('accuracy')
+)
+
+history <- model %>% fit(
+  x_train, y_train, 
+  epochs = 30, batch_size = 128, 
+  validation_split = 0.2
+)
+

We can then plot the training history as follows:

+
plot(history)
+
+ +
+

The history will be plotted using ggplot2 if available (otherwise base graphics will be used). The plot includes all specified metrics as well as the loss, and draws a smoothing line if there are 10 or more epochs. You can customize all of this behavior via various options of the plot() method.

+

If you want to create a custom visualization you can call the as.data.frame() method on the history to obtain a data frame with factors for each metric as well as training vs. validation:

+
history_df <- as.data.frame(history)
+str(history_df)
+
'data.frame':   120 obs. of  4 variables:
+ $ epoch : int  1 2 3 4 5 6 7 8 9 10 ...
+ $ value : num  0.87 0.941 0.954 0.962 0.965 ...
+ $ metric: Factor w/ 2 levels "acc","loss": 1 1 1 1 1 1 1 1 1 1 ...
+ $ data  : Factor w/ 2 levels "training","validation": 1 1 1 1 1 1 1 1 1 1 ...
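For example, a minimal sketch of a custom plot built from this data frame (it assumes ggplot2 is installed):

library(ggplot2)

# plot each metric by epoch, colored by training vs. validation
history_df <- as.data.frame(history)
ggplot(history_df, aes(x = epoch, y = value, color = data)) +
  geom_line() +
  facet_wrap(~metric, scales = "free_y")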
+
+
+

+RStudio IDE

+

If you are training your model within the RStudio IDE then real time metrics are available within the Viewer pane:

+ +

The view_metrics argument of the fit() method controls whether real time metrics are displayed. By default metrics are automatically displayed if one or more metrics are specified in the call to compile() and there is more than one training epoch.

+

You can explicitly control whether metrics are displayed by specifying the view_metrics argument. You can also set a global session default using the keras.view_metrics option:

+
# don't show metrics during this run
+history <- model %>% fit(
+  x_train, y_train, 
+  epochs = 30, batch_size = 128, 
+  view_metrics = FALSE,
+  validation_split = 0.2
+)
+
+# set global default to never show metrics
+options(keras.view_metrics = FALSE)
+

Note that when view_metrics is TRUE, metrics will be displayed even when not running within RStudio (in that case metrics will be displayed in an external web browser).

+
+
+

+TensorBoard

+

TensorBoard is a visualization tool included with TensorFlow that enables you to visualize dynamic graphs of your Keras training and test metrics, as well as activation histograms for the different layers in your model.

+

For example, here’s a TensorBoard display for Keras accuracy and loss metrics:

+
+ +
+
+

+Recording Data

+

To record data that can be visualized with TensorBoard, you add a TensorBoard callback to the fit() function. For example:

+
history <- model %>% fit(
+  x_train, y_train,
+  batch_size = batch_size,
+  epochs = epochs,
+  verbose = 1,
+  callbacks = callback_tensorboard("logs/run_a"),
+  validation_split = 0.2
+)
+

See the documentation on the callback_tensorboard() function for the various available options. The most important option is the log_dir, which determines which directory logs are written to for a given training run.

+

You should either use a distinct log directory for each training run or remove the log directory between runs.
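For example, a minimal sketch of both approaches (the paths are illustrative):

# use a distinct log directory for each run
callback_tensorboard(log_dir = "logs/run_b")

# or remove the previous run's logs before training again
unlink("logs/run_a", recursive = TRUE)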

+
+
+

+Viewing Data

+

To view TensorBoard data for a given set of runs you use the tensorboard() function, pointing it to the previously specified log_dir:

+
tensorboard("logs/run_a")
+

It’s often useful to run TensorBoard while you are training a model. To do this, simply launch tensorboard within the training directory right before you begin training:

+
# launch TensorBoard (data won't show up until after the first epoch)
+tensorboard("logs/run_a")
+
+# fit the model with the TensorBoard callback
+history <- model %>% fit(
+  x_train, y_train,
+  batch_size = batch_size,
+  epochs = epochs,
+  verbose = 1,
+  callbacks = callback_tensorboard("logs/run_a"),
+  validation_split = 0.2
+)
+

Keras writes TensorBoard data at the end of each epoch so you won't see any data in TensorBoard until 10-20 seconds after the end of the first epoch (TensorBoard automatically refreshes its display every 30 seconds during training).

+
+
+

+Comparing Runs

+

TensorBoard will automatically include all runs logged within the sub-directories of the specified log_dir. For example, if you logged another run using:

+
callback_tensorboard(log_dir = "logs/run_b")
+

and then called tensorboard as follows:

+
tensorboard("logs")
+

the TensorBoard visualization would look like this:

+
+ +
+

You can also pass multiple log directories. For example:

+
tensorboard(c("logs/run_a", "logs/run_b"))
+
+
+

+Customization

+
+

+Metrics

+

In the above examples TensorBoard metrics are logged for loss and accuracy. The TensorBoard callback will log data for any metrics which are specified in the metrics parameter of the compile() function. For example, in the following code:

+
model %>% compile(
+  loss = 'mean_squared_error',
+  optimizer = 'sgd',
+  metrics= c('mae', 'acc')
+)
+

TensorBoard data series will be created for the loss (mean squared error) as well as for the mean absolute error and accuracy metrics.

+
+
+

+Options

+

The callback_tensorboard() function includes a number of other options that control logging during training:

+
callback_tensorboard(log_dir = "logs", histogram_freq = 0,
+  write_graph = TRUE, write_images = FALSE, embeddings_freq = 0,
+  embeddings_layer_names = NULL, embeddings_metadata = NULL)
+ ++++ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
log_dir: Path of the directory to save the log files to be parsed by TensorBoard.
histogram_freq: Frequency (in epochs) at which to compute activation histograms for the layers of the model. If set to 0 (the default), histograms won't be computed.
write_graph: Whether to visualize the graph in TensorBoard. The log file can become quite large when write_graph is set to TRUE.
write_images: Whether to write model weights to visualize as an image in TensorBoard.
embeddings_freq: Frequency (in epochs) at which selected embedding layers will be saved.
embeddings_layer_names: A list of names of layers to keep an eye on. If NULL or an empty list, all the embedding layers will be watched.
embeddings_metadata: A named list which maps layer names to file names in which metadata for each embedding layer is saved. See the details about the metadata file format. If the same metadata file is used for all embedding layers, a single string can be passed.
+
+
+
+
+
+ + + +
+ + + +
+ + + diff --git a/website/authors.html b/website/authors.html new file mode 100644 index 000000000..393f4b767 --- /dev/null +++ b/website/authors.html @@ -0,0 +1,183 @@ + + + + + + + + +Authors • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+
+ + + +
+ +
+
+ + +
    +
  • +

    JJ Allaire. Author, maintainer. +

    +
  • +
  • +

    François Chollet. Author, copyright holder. +

    +
  • +
  • +

    RStudio. Copyright holder, funder. +

    +
  • +
  • +

    Google. Contributor, copyright holder, funder. +

    +
  • +
  • +

    Yuan Tang. Contributor, copyright holder. +

    +
  • +
  • +

    Daniel Falbel. Contributor, copyright holder. +

    +
  • +
  • +

    Wouter Van Der Bijl. Contributor, copyright holder. +

    +
  • +
  • +

    Martin Studer. Contributor, copyright holder. +

    +
  • +
+ +
+ +
+ + + +
+ + + diff --git a/website/extra.css b/website/extra.css new file mode 100644 index 000000000..8f5d096dc --- /dev/null +++ b/website/extra.css @@ -0,0 +1,64 @@ + +h4.date, +h4.author { + display: none; +} + +h2.hasAnchor { + font-weight: 350; +} + +.ref-index tbody { + margin-bottom: 60px; +} + +pre:not([class]) { + background-color: white; +} + +.contents a { + text-decoration: none; +} + +blockquote { + font-size: inherit; +} + +.examples .page-header { + border-bottom: none; + margin: 0; + padding-bottom: 0; +} + +.examples .sourceCode { + margin-top: 25px; +} + +#sidebar .nav>li>a { + padding-top: 1px; + padding-bottom: 2px; +} + +#installation .sourceCode { + font-size: 13px; +} + +.r-plot { + margin-top: 15px; + margin-bottom: 20px; + border: solid 1px #cccccc; +} + +.screenshot { + margin-bottom: 20px; + border: solid 1px #cccccc; +} + +.source-ref { + margin-bottom: 20px; +} + +.source-ref .caption { + display: none; +} + diff --git a/website/extra.js b/website/extra.js new file mode 100644 index 000000000..813c7f97f --- /dev/null +++ b/website/extra.js @@ -0,0 +1,16 @@ + +$(document).ready(function() { + + // turn functions section into ref-table + $('#functions').find('table').attr('class', 'ref-index'); + + // are we in examples? + var examples = window.location.href.match("/articles/examples/") !== null; + if (examples) { + $('.template-vignette').addClass('examples'); + + // remove right column + $(".col-md-9").removeClass("col-md-9").addClass('col-md-10'); + $(".col-md-3").remove(); + } +}); diff --git a/website/images/training_history_ggplot2.png b/website/images/training_history_ggplot2.png new file mode 100644 index 000000000..bd87f365d Binary files /dev/null and b/website/images/training_history_ggplot2.png differ diff --git a/website/jquery.sticky-kit.min.js b/website/jquery.sticky-kit.min.js new file mode 100644 index 000000000..e2a3c6de9 --- /dev/null +++ b/website/jquery.sticky-kit.min.js @@ -0,0 +1,9 @@ +/* + Sticky-kit v1.1.2 | WTFPL | Leaf Corcoran 2015 | http://leafo.net +*/ +(function(){var b,f;b=this.jQuery||window.jQuery;f=b(window);b.fn.stick_in_parent=function(d){var A,w,J,n,B,K,p,q,k,E,t;null==d&&(d={});t=d.sticky_class;B=d.inner_scrolling;E=d.recalc_every;k=d.parent;q=d.offset_top;p=d.spacer;w=d.bottoming;null==q&&(q=0);null==k&&(k=void 0);null==B&&(B=!0);null==t&&(t="is_stuck");A=b(document);null==w&&(w=!0);J=function(a,d,n,C,F,u,r,G){var v,H,m,D,I,c,g,x,y,z,h,l;if(!a.data("sticky_kit")){a.data("sticky_kit",!0);I=A.height();g=a.parent();null!=k&&(g=g.closest(k)); +if(!g.length)throw"failed to find stick parent";v=m=!1;(h=null!=p?p&&a.closest(p):b("
"))&&h.css("position",a.css("position"));x=function(){var c,f,e;if(!G&&(I=A.height(),c=parseInt(g.css("border-top-width"),10),f=parseInt(g.css("padding-top"),10),d=parseInt(g.css("padding-bottom"),10),n=g.offset().top+c+f,C=g.height(),m&&(v=m=!1,null==p&&(a.insertAfter(h),h.detach()),a.css({position:"",top:"",width:"",bottom:""}).removeClass(t),e=!0),F=a.offset().top-(parseInt(a.css("margin-top"),10)||0)-q, +u=a.outerHeight(!0),r=a.css("float"),h&&h.css({width:a.outerWidth(!0),height:u,display:a.css("display"),"vertical-align":a.css("vertical-align"),"float":r}),e))return l()};x();if(u!==C)return D=void 0,c=q,z=E,l=function(){var b,l,e,k;if(!G&&(e=!1,null!=z&&(--z,0>=z&&(z=E,x(),e=!0)),e||A.height()===I||x(),e=f.scrollTop(),null!=D&&(l=e-D),D=e,m?(w&&(k=e+u+c>C+n,v&&!k&&(v=!1,a.css({position:"fixed",bottom:"",top:c}).trigger("sticky_kit:unbottom"))),eb&&!v&&(c-=l,c=Math.max(b-u,c),c=Math.min(q,c),m&&a.css({top:c+"px"})))):e>F&&(m=!0,b={position:"fixed",top:c},b.width="border-box"===a.css("box-sizing")?a.outerWidth()+"px":a.width()+"px",a.css(b).addClass(t),null==p&&(a.after(h),"left"!==r&&"right"!==r||h.append(a)),a.trigger("sticky_kit:stick")),m&&w&&(null==k&&(k=e+u+c>C+n),!v&&k)))return v=!0,"static"===g.css("position")&&g.css({position:"relative"}), +a.css({position:"absolute",bottom:d,top:"auto"}).trigger("sticky_kit:bottom")},y=function(){x();return l()},H=function(){G=!0;f.off("touchmove",l);f.off("scroll",l);f.off("resize",y);b(document.body).off("sticky_kit:recalc",y);a.off("sticky_kit:detach",H);a.removeData("sticky_kit");a.css({position:"",bottom:"",top:"",width:""});g.position("position","");if(m)return null==p&&("left"!==r&&"right"!==r||a.insertAfter(h),h.remove()),a.removeClass(t)},f.on("touchmove",l),f.on("scroll",l),f.on("resize", +y),b(document.body).on("sticky_kit:recalc",y),a.on("sticky_kit:detach",H),setTimeout(l,0)}};n=0;for(K=this.length;n + + + + + diff --git a/website/pkgdown.css b/website/pkgdown.css new file mode 100644 index 000000000..209ce57fe --- /dev/null +++ b/website/pkgdown.css @@ -0,0 +1,163 @@ +/* Sticker footer */ +body > .container { + display: flex; + padding-top: 60px; + min-height: calc(100vh); + flex-direction: column; +} + +body > .container .row { + flex: 1; +} + +footer { + margin-top: 45px; + padding: 35px 0 36px; + border-top: 1px solid #e5e5e5; + color: #666; + display: flex; +} +footer p { + margin-bottom: 0; +} +footer div { + flex: 1; +} +footer .pkgdown { + text-align: right; +} +footer p { + margin-bottom: 0; +} + +img.icon { + float: right; +} + +img { + max-width: 100%; +} + +/* Section anchors ---------------------------------*/ + +a.anchor { + margin-left: -30px; + display:inline-block; + width: 30px; + height: 30px; + visibility: hidden; + + background-image: url(./link.svg); + background-repeat: no-repeat; + background-size: 20px 20px; + background-position: center center; +} + +.hasAnchor:hover a.anchor { + visibility: visible; +} + +@media (max-width: 767px) { + .hasAnchor:hover a.anchor { + visibility: hidden; + } +} + + +/* Fixes for fixed navbar --------------------------*/ + +.contents h1, .contents h2, .contents h3, .contents h4 { + padding-top: 60px; + margin-top: -60px; +} + +/* Static header placement on mobile devices */ +@media (max-width: 767px) { + .navbar-fixed-top { + position: absolute; + } + .navbar { + padding: 0; + } +} + + +/* Sidebar --------------------------*/ + +#sidebar { + margin-top: 30px; +} +#sidebar h2 { + font-size: 1.5em; + margin-top: 1em; +} + +#sidebar h2:first-child { + margin-top: 0; +} + 
+#sidebar .list-unstyled li { + margin-bottom: 0.5em; +} + +/* Reference index & topics ----------------------------------------------- */ + +.ref-index th {font-weight: normal;} +.ref-index h2 {font-size: 20px;} + +.ref-index td {vertical-align: top;} +.ref-index .alias {width: 40%;} +.ref-index .title {width: 60%;} + +.ref-index .alias {width: 40%;} +.ref-index .title {width: 60%;} + +.ref-arguments th {text-align: right; padding-right: 10px;} +.ref-arguments th, .ref-arguments td {vertical-align: top;} +.ref-arguments .name {width: 20%;} +.ref-arguments .desc {width: 80%;} + +/* Nice scrolling for wide elements --------------------------------------- */ + +table { + display: block; + overflow: auto; +} + +/* Syntax highlighting ---------------------------------------------------- */ + +pre { + word-wrap: normal; + word-break: normal; + border: 1px solid #eee; +} + +pre, code { + background-color: #f8f8f8; + color: #333; +} + +pre .img { + margin: 5px 0; +} + +pre .img img { + background-color: #fff; + display: block; + height: auto; +} + +code a, pre a { + color: #375f84; +} + +.fl {color: #1514b5;} +.fu {color: #000000;} /* function */ +.ch,.st {color: #036a07;} /* string */ +.kw {color: #264D66;} /* keyword */ +.co {color: #888888;} /* comment */ + +.message { color: black; font-weight: bolder;} +.error { color: orange; font-weight: bolder;} +.warning { color: #6A0366; font-weight: bolder;} + diff --git a/website/pkgdown.js b/website/pkgdown.js new file mode 100644 index 000000000..4b8171328 --- /dev/null +++ b/website/pkgdown.js @@ -0,0 +1,45 @@ +$(function() { + $("#sidebar").stick_in_parent({offset_top: 40}); + $('body').scrollspy({ + target: '#sidebar', + offset: 60 + }); + + var cur_path = paths(location.pathname); + $("#navbar ul li a").each(function(index, value) { + if (value.text == "Home") + return; + if (value.getAttribute("href") === "#") + return; + + var path = paths(value.pathname); + if (is_prefix(cur_path, path)) { + // Add class to parent
  • , and enclosing
  • if in dropdown + var menu_anchor = $(value); + menu_anchor.parent().addClass("active"); + menu_anchor.closest("li.dropdown").addClass("active"); + } + }); +}); + +function paths(pathname) { + var pieces = pathname.split("/"); + pieces.shift(); // always starts with / + + var end = pieces[pieces.length - 1]; + if (end === "index.html" || end === "") + pieces.pop(); + return(pieces); +} + +function is_prefix(needle, haystack) { + if (needle.length > haystack.lengh) + return(false); + + for (var i = 0; i < haystack.length; i++) { + if (needle[i] != haystack[i]) + return(false); + } + + return(true); +} diff --git a/website/reference/KerasCallback.html b/website/reference/KerasCallback.html new file mode 100644 index 000000000..206767fac --- /dev/null +++ b/website/reference/KerasCallback.html @@ -0,0 +1,227 @@ + + + + + + + + +Base R6 class for Keras callbacks — KerasCallback • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Base R6 class for Keras callbacks

    + + +
    KerasCallback
    + +

    Format

    + +

    An R6Class generator object

    + +

    Value

    + +

    KerasCallback.

    + +

    Details

    + +

The logs named list that callback methods take as an argument will contain keys for quantities relevant to the current batch or epoch.

    +

    Currently, the fit() method for sequential models will include the following quantities in the logs that +it passes to its callbacks:

      +
    • on_epoch_end: logs include acc and loss, and optionally include val_loss (if validation is enabled in fit), and val_acc (if validation and accuracy monitoring are enabled).

    • +
    • on_batch_begin: logs include size, the number of samples in the current batch.

    • +
    • on_batch_end: logs include loss, and optionally acc (if accuracy monitoring is enabled).

    • +
    + +
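For example, a minimal sketch of a callback that reads val_loss from the logs at the end of each epoch (the class name is illustrative):

ValLossHistory <- R6::R6Class("ValLossHistory",
  inherit = KerasCallback,
  public = list(
    val_losses = NULL,
    on_epoch_end = function(epoch, logs = list()) {
      # logs[["val_loss"]] is present when validation is enabled in fit()
      self$val_losses <- c(self$val_losses, logs[["val_loss"]])
    }
  )
)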

    Fields

    + + +
    +
    params

Named list with training parameters (e.g. verbosity, batch size, number of epochs...).

    +
    model

    Reference to the Keras model being trained.

    +
    + +

    Methods

    + + +
    +
    on_epoch_begin(epoch, logs)

    Called at the beginning of each epoch.

    +
    on_epoch_end(epoch, logs)

    Called at the end of each epoch.

    +
    on_batch_begin(batch, logs)

    Called at the beginning of each batch.

    +
    on_batch_end(batch, logs)

    Called at the end of each batch.

    +
    on_train_begin(logs)

    Called at the beginning of training.

    +
    on_train_end(logs)

    Called at the end of training.

    +
    + + +

    Examples

    +
    # NOT RUN {
    +library(keras)
    +
    +LossHistory <- R6::R6Class("LossHistory",
    +  inherit = KerasCallback,
    +
    +  public = list(
    +
    +    losses = NULL,
    +
    +    on_batch_end = function(batch, logs = list()) {
    +      self$losses <- c(self$losses, logs[["loss"]])
    +    }
    +  )
    +)
    +# }
    +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/KerasLayer.html b/website/reference/KerasLayer.html new file mode 100644 index 000000000..abf11adcf --- /dev/null +++ b/website/reference/KerasLayer.html @@ -0,0 +1,184 @@ + + + + + + + + +Base R6 class for Keras layers — KerasLayer • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Base R6 class for Keras layers

    + + +
    KerasLayer
    + +

    Format

    + +

An R6Class generator object

    + +

    Value

    + +

    KerasLayer.

    + +

    Methods

    + +

    +
    build(input_shape)

    Creates the +layer weights (must be implemented by all layers that have weights)

    +
    call(inputs,mask)

    Call the layer on an input tensor.

    +
    compute_output_shape(input_shape)

    Compute the output shape +for the layer.

    +
    add_weight(name,shape,dtype,initializer,regularizer,trainable,constraint)

    Adds +a weight variable to the layer.
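For example, a minimal sketch of a layer implementing these methods as a simple dense transform (the class name is illustrative):

library(keras)

MyDenseLayer <- R6::R6Class("MyDenseLayer",
  inherit = KerasLayer,
  public = list(
    output_dim = NULL,
    kernel = NULL,
    initialize = function(output_dim) {
      self$output_dim <- output_dim
    },
    build = function(input_shape) {
      # create a trainable weight matrix of shape (input features, output_dim)
      self$kernel <- self$add_weight(
        name = "kernel",
        shape = list(input_shape[[2]], self$output_dim),
        initializer = initializer_random_normal(),
        trainable = TRUE
      )
    },
    call = function(x, mask = NULL) {
      # multiply the inputs by the kernel using the backend tensor engine
      backend()$dot(x, self$kernel)
    },
    compute_output_shape = function(input_shape) {
      list(input_shape[[1]], self$output_dim)
    }
  )
)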

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/activation_relu.html b/website/reference/activation_relu.html new file mode 100644 index 000000000..8cf1374aa --- /dev/null +++ b/website/reference/activation_relu.html @@ -0,0 +1,209 @@ + + + + + + + + +Activation functions — activation_relu • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

Activation functions can either be used through layer_activation(), or through the activation argument supported by all forward layers.

    + + +
    activation_relu(x, alpha = 0, max_value = NULL)
    +
    +activation_elu(x, alpha = 1)
    +
    +activation_selu(x)
    +
    +activation_hard_sigmoid(x)
    +
    +activation_linear(x)
    +
    +activation_sigmoid(x)
    +
    +activation_softmax(x, axis = -1)
    +
    +activation_softplus(x)
    +
    +activation_softsign(x)
    +
    +activation_tanh(x)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + +
    x

    Tensor

    alpha

    Alpha value

    max_value

    Max value

    axis

    Integer, axis along which the softmax normalization is applied
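For example, a minimal sketch of the two equivalent forms described above:

library(keras)

# as a standalone activation layer
model <- keras_model_sequential() %>%
  layer_dense(units = 32, input_shape = c(784)) %>%
  layer_activation('relu')

# or via the activation argument of the layer itself
model <- keras_model_sequential() %>%
  layer_dense(units = 32, input_shape = c(784), activation = 'relu')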

    + +

    References

    + + + + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/application_inception_v3.html b/website/reference/application_inception_v3.html new file mode 100644 index 000000000..9176cda0d --- /dev/null +++ b/website/reference/application_inception_v3.html @@ -0,0 +1,237 @@ + + + + + + + + +Inception V3 model, with weights pre-trained on ImageNet. — application_inception_v3 • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Inception V3 model, with weights pre-trained on ImageNet.

    + + +
    application_inception_v3(include_top = TRUE, weights = "imagenet",
    +  input_tensor = NULL, input_shape = NULL, pooling = NULL,
    +  classes = 1000)
    +
    +inception_v3_preprocess_input(x)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    include_top

    whether to include the fully-connected layer at the top of +the network.

    weights

    one of NULL (random initialization) or "imagenet" +(pre-training on ImageNet).

    input_tensor

    optional Keras tensor to use as image input for the +model.

    input_shape

optional shape list, only to be specified if include_top is FALSE (otherwise the input shape has to be (299, 299, 3)). It should have exactly 3 input channels, and width and height should be no smaller than 71. E.g. (150, 150, 3) would be one valid value.

    pooling

    Optional pooling mode for feature extraction when +include_top is FALSE.

      +
    • NULL means that the output of the model will be the 4D tensor output +of the last convolutional layer.

    • +
    • avg means that global average pooling will be applied to the output of +the last convolutional layer, and thus the output of the model will be +a 2D tensor.

    • +
    • max means that global max pooling will be applied.

    • +
    classes

    optional number of classes to classify images into, only to be +specified if include_top is TRUE, and if no weights argument is +specified.

    x

    Input tensor for preprocessing

    + +

    Value

    + +

    A Keras model instance.

    + +

    Details

    + +

    Do note that the input image format for this model is different than for +the VGG16 and ResNet models (299x299 instead of 224x224).

    +

    The inception_v3_preprocess_input() function should be used for image +preprocessing.
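For example, a minimal sketch of classifying an image, mirroring the ResNet50 example elsewhere on this site ("elephant.jpg" is a placeholder path):

library(keras)

# instantiate the model
model <- application_inception_v3(weights = 'imagenet')

# load and preprocess the image (note the 299x299 input size)
img <- image_load("elephant.jpg", target_size = c(299, 299))
x <- image_to_array(img)
dim(x) <- c(1, dim(x))
x <- inception_v3_preprocess_input(x)

# make predictions, then decode and print them
preds <- model %>% predict(x)
imagenet_decode_predictions(preds, top = 3)[[1]]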

    + +

    Reference

    + + + + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/application_mobilenet.html b/website/reference/application_mobilenet.html new file mode 100644 index 000000000..e7bea0182 --- /dev/null +++ b/website/reference/application_mobilenet.html @@ -0,0 +1,277 @@ + + + + + + + + +MobileNet model architecture. — application_mobilenet • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    MobileNet model architecture.

    + + +
    application_mobilenet(input_shape = NULL, alpha = 1, depth_multiplier = 1,
    +  dropout = 0.001, include_top = TRUE, weights = "imagenet",
    +  input_tensor = NULL, pooling = NULL, classes = 1000)
    +
    +mobilenet_preprocess_input(x)
    +
    +mobilenet_decode_predictions(preds, top = 5)
    +
    +mobilenet_load_model_hdf5(filepath)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    input_shape

optional shape list, only to be specified if include_top is FALSE (otherwise the input shape has to be (224, 224, 3) (with channels_last data format) or (3, 224, 224) (with channels_first data format)). It should have exactly 3 input channels, and width and height should be no smaller than 32. E.g. (200, 200, 3) would be one valid value.

    alpha

    controls the width of the network.

      +
    • If alpha < 1.0, proportionally decreases the number of filters in each layer.

    • +
    • If alpha > 1.0, proportionally increases the number of filters in each layer.

    • +
• If alpha = 1, the default number of filters from the paper is used at each layer.

    • +
    depth_multiplier

    depth multiplier for depthwise convolution (also +called the resolution multiplier)

    dropout

    dropout rate

    include_top

    whether to include the fully-connected layer at the top of +the network.

    weights

    NULL (random initialization) or imagenet (ImageNet +weights)

    input_tensor

    optional Keras tensor (i.e. output of layers.Input()) +to use as image input for the model.

    pooling

    Optional pooling mode for feature extraction when +include_top is FALSE. +- NULL means that the output of the model will be the 4D tensor output +of the last convolutional layer. +- avg means that global average pooling will be applied to the output +of the last convolutional layer, and thus the output of the model will +be a 2D tensor. +- max means that global max pooling will be applied.

    classes

    optional number of classes to classify images into, only to be +specified if include_top is TRUE, and if no weights argument is +specified.

    x

    input tensor, 4D

    preds

    Tensor encoding a batch of predictions.

    top

    integer, how many top-guesses to return.

    filepath

    File path

    + +

    Value

    + +

    application_mobilenet() and mobilenet_load_model_hdf5() return a +Keras model instance. mobilenet_preprocess_input() returns image input +suitable for feeding into a mobilenet model. mobilenet_decode_predictions() +returns a list of data frames with variables class_name, class_description, +and score (one data frame per sample in batch input).

    + +

    Details

    + +

    The mobilenet_preprocess_input() function should be used for image +preprocessing. To load a saved instance of a MobileNet model use +the mobilenet_load_model_hdf5() function. To prepare image input +for MobileNet use mobilenet_preprocess_input(). To decode +predictions use mobilenet_decode_predictions().

    +

    MobileNet is currently only supported with the TensorFlow backend.
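For example, a minimal sketch of the full load/preprocess/predict/decode cycle ("elephant.jpg" is a placeholder path):

library(keras)

# instantiate the model
model <- application_mobilenet(weights = 'imagenet')

# load and preprocess the image
img <- image_load("elephant.jpg", target_size = c(224, 224))
x <- image_to_array(img)
dim(x) <- c(1, dim(x))
x <- mobilenet_preprocess_input(x)

# predict, then decode with the MobileNet-specific helper
preds <- model %>% predict(x)
mobilenet_decode_predictions(preds, top = 3)[[1]]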

    + +

    Reference

    + + + + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/application_resnet50.html b/website/reference/application_resnet50.html new file mode 100644 index 000000000..5085616c2 --- /dev/null +++ b/website/reference/application_resnet50.html @@ -0,0 +1,250 @@ + + + + + + + + +ResNet50 model for Keras. — application_resnet50 • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    ResNet50 model for Keras.

    + + +
    application_resnet50(include_top = TRUE, weights = "imagenet",
    +  input_tensor = NULL, input_shape = NULL, pooling = NULL,
    +  classes = 1000)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + +
    include_top

    whether to include the fully-connected layer at the top of +the network.

    weights

    one of NULL (random initialization) or "imagenet" +(pre-training on ImageNet).

    input_tensor

    optional Keras tensor to use as image input for the +model.

    input_shape

optional shape list, only to be specified if include_top is FALSE (otherwise the input shape has to be (224, 224, 3)). It should have exactly 3 input channels, and width and height should be no smaller than 197. E.g. (200, 200, 3) would be one valid value.

    pooling

    Optional pooling mode for feature extraction when +include_top is FALSE.

      +
    • NULL means that the output of the model will be the 4D tensor output +of the last convolutional layer.

    • +
    • avg means that global average pooling will be applied to the output of +the last convolutional layer, and thus the output of the model will be +a 2D tensor.

    • +
    • max means that global max pooling will be applied.

    • +
    classes

    optional number of classes to classify images into, only to be +specified if include_top is TRUE, and if no weights argument is +specified.

    + +

    Value

    + +

    A Keras model instance.

    + +

    Details

    + +

    Optionally loads weights pre-trained on ImageNet.

    +

    The imagenet_preprocess_input() function should be used for image +preprocessing.

    + +

    Reference

    + +

- Deep Residual Learning for Image Recognition

    + + +

    Examples

    +
    # NOT RUN {
    +library(keras)
    +
    +# instantiate the model
    +model <- application_resnet50(weights = 'imagenet')
    +
    +# load the image
    +img_path <- "elephant.jpg"
    +img <- image_load(img_path, target_size = c(224,224))
    +x <- image_to_array(img)
    +
+# ensure we have a 4d tensor with a single element in the batch dimension,
+# then preprocess the input for prediction using resnet50
    +dim(x) <- c(1, dim(x))
    +x <- imagenet_preprocess_input(x)
    +
    +# make predictions then decode and print them
    +preds <- model %>% predict(x)
    +imagenet_decode_predictions(preds, top = 3)[[1]]
    +# }
    +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/application_vgg.html b/website/reference/application_vgg.html new file mode 100644 index 000000000..417607a24 --- /dev/null +++ b/website/reference/application_vgg.html @@ -0,0 +1,246 @@ + + + + + + + + +VGG16 and VGG19 models for Keras. — application_vgg • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    VGG16 and VGG19 models for Keras.

    + + +
    application_vgg16(include_top = TRUE, weights = "imagenet",
    +  input_tensor = NULL, input_shape = NULL, pooling = NULL,
    +  classes = 1000)
    +
    +application_vgg19(include_top = TRUE, weights = "imagenet",
    +  input_tensor = NULL, input_shape = NULL, pooling = NULL,
    +  classes = 1000)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + +
    include_top

    whether to include the 3 fully-connected layers at the top +of the network.

    weights

    one of NULL (random initialization) or "imagenet" +(pre-training on ImageNet).

    input_tensor

    optional Keras tensor to use as image input for the +model.

    input_shape

optional shape list, only to be specified if include_top is FALSE (otherwise the input shape has to be (224, 224, 3)). It should have exactly 3 input channels, and width and height should be no smaller than 48. E.g. (200, 200, 3) would be one valid value.

    pooling

    Optional pooling mode for feature extraction when +include_top is FALSE.

      +
    • NULL means that the output of the model will be the 4D tensor output +of the last convolutional layer.

    • +
    • avg means that global average pooling will be applied to the output of +the last convolutional layer, and thus the output of the model will be +a 2D tensor.

    • +
    • max means that global max pooling will be applied.

    • +
    classes

    optional number of classes to classify images into, only to be +specified if include_top is TRUE, and if no weights argument is +specified.

    + +

    Value

    + +

    Keras model instance.

    + +

    Details

    + +

    Optionally loads weights pre-trained on ImageNet.

    +

    The imagenet_preprocess_input() function should be used for image preprocessing.

    + +

    Reference

    + +

- Very Deep Convolutional Networks for Large-Scale Image Recognition

    + + +

    Examples

    +
    # NOT RUN {
    +library(keras)
    +
    +model <- application_vgg16(weights = 'imagenet', include_top = FALSE)
    +
    +img_path <- "elephant.jpg"
    +img <- image_load(img_path, target_size = c(224,224))
    +x <- image_to_array(img)
    +dim(x) <- c(1, dim(x))
    +x <- imagenet_preprocess_input(x)
    +
    +features <- model %>% predict(x)
    +# }
    +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/application_xception.html b/website/reference/application_xception.html new file mode 100644 index 000000000..9db60c83f --- /dev/null +++ b/website/reference/application_xception.html @@ -0,0 +1,240 @@ + + + + + + + + +Xception V1 model for Keras. — application_xception • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Xception V1 model for Keras.

    + + +
    application_xception(include_top = TRUE, weights = "imagenet",
    +  input_tensor = NULL, input_shape = NULL, pooling = NULL,
    +  classes = 1000)
    +
    +xception_preprocess_input(x)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    include_top

    whether to include the fully-connected layer at the top of +the network.

    weights

    one of NULL (random initialization) or "imagenet" +(pre-training on ImageNet).

    input_tensor

    optional Keras tensor to use as image input for the +model.

    input_shape

optional shape list, only to be specified if include_top is FALSE (otherwise the input shape has to be (299, 299, 3)). It should have exactly 3 input channels, and width and height should be no smaller than 71. E.g. (150, 150, 3) would be one valid value.

    pooling

    Optional pooling mode for feature extraction when +include_top is FALSE.

      +
    • NULL means that the output of the model will be the 4D tensor output +of the last convolutional layer.

    • +
    • avg means that global average pooling will be applied to the output of +the last convolutional layer, and thus the output of the model will be +a 2D tensor.

    • +
    • max means that global max pooling will be applied.

    • +
    classes

    optional number of classes to classify images into, only to be +specified if include_top is TRUE, and if no weights argument is +specified.

    x

    Input tensor for preprocessing

    + +

    Value

    + +

    A Keras model instance.

    + +

    Details

    + +

    On ImageNet, this model gets to a top-1 validation accuracy of 0.790 +and a top-5 validation accuracy of 0.945.

    +

    Do note that the input image format for this model is different than for +the VGG16 and ResNet models (299x299 instead of 224x224).

    +

    The xception_preprocess_input() function should be used for image +preprocessing.

    +

    This application is only available when using the TensorFlow back-end.

    + +

    Reference

    + + + + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/backend.html b/website/reference/backend.html new file mode 100644 index 000000000..82deb7693 --- /dev/null +++ b/website/reference/backend.html @@ -0,0 +1,185 @@ + + + + + + + + +Keras backend tensor engine — backend • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Obtain a reference to the keras.backend Python module used to implement +tensor operations.

    + + +
    backend(convert = TRUE)
    + +

    Arguments

    + + + + + + +
    convert

    TRUE to automatically convert Python objects to their R +equivalent. If you pass FALSE you can do manual conversion using the +py_to_r() function.

    + +

    Value

    + +

    Reference to Keras backend python module.

    + +

    Note

    + +

See the documentation at https://keras.io/backend/ for additional details on the available functions.
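For example, a minimal sketch of calling into the backend (constant(), transpose(), and eval() are standard Keras backend functions):

library(keras)

# obtain the backend module and evaluate a simple tensor op
K <- backend()
x <- K$constant(matrix(1:4, nrow = 2))
K$eval(K$transpose(x))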

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/bidirectional.html b/website/reference/bidirectional.html new file mode 100644 index 000000000..b3764f9ab --- /dev/null +++ b/website/reference/bidirectional.html @@ -0,0 +1,223 @@ + + + + + + + + +Bidirectional wrapper for RNNs. — bidirectional • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Bidirectional wrapper for RNNs.

    + + +
    bidirectional(object, layer, merge_mode = "concat", input_shape = NULL,
    +  batch_input_shape = NULL, batch_size = NULL, dtype = NULL,
    +  name = NULL, trainable = NULL, weights = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    object

    Model or layer object

    layer

    Recurrent instance.

    merge_mode

Mode by which outputs of the forward and backward RNNs will be combined. One of 'sum', 'mul', 'concat', 'ave', or NULL. If NULL, the outputs will not be combined; they will be returned as a list.

    input_shape

    Dimensionality of the input (integer) not including the +samples axis. This argument is required when using this layer as the first +layer in a model.

    batch_input_shape

Shape of the input, including the batch size. For instance, batch_input_shape=c(10, 32) indicates that the expected input will be batches of 10 32-dimensional vectors. batch_input_shape=list(NULL, 32) indicates batches of an arbitrary number of 32-dimensional vectors.

    batch_size

    Fixed batch size for layer

    dtype

    The data type expected by the input, as a string (float32, +float64, int32...)

    name

    An optional name string for the layer. Should be unique in a +model (do not reuse the same name twice). It will be autogenerated if it +isn't provided.

    trainable

    Whether the layer weights will be updated during training.

    weights

    Initial weights for layer.
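For example, a minimal sketch of wrapping an LSTM for binary sequence classification (the layer sizes are illustrative):

library(keras)

model <- keras_model_sequential() %>%
  layer_embedding(input_dim = 20000, output_dim = 128, input_length = 100) %>%
  bidirectional(layer_lstm(units = 64)) %>%
  layer_dense(units = 1, activation = 'sigmoid')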

    + +

    See also

    + +

    Other layer wrappers: time_distributed

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/callback_csv_logger.html b/website/reference/callback_csv_logger.html new file mode 100644 index 000000000..86a921189 --- /dev/null +++ b/website/reference/callback_csv_logger.html @@ -0,0 +1,192 @@ + + + + + + + + +Callback that streams epoch results to a csv file — callback_csv_logger • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

Supports all values that can be represented as a string.

    + + +
    callback_csv_logger(filename, separator = ",", append = FALSE)
    + +

    Arguments

    + + + + + + + + + + + + + + +
    filename

    filename of the csv file, e.g. 'run/log.csv'.

    separator

    string used to separate elements in the csv file.

    append

TRUE: append if file exists (useful for continuing training). FALSE: overwrite existing file.

    + +

    See also

    + +

    Other callbacks: callback_early_stopping, + callback_lambda, + callback_learning_rate_scheduler, + callback_model_checkpoint, + callback_progbar_logger, + callback_reduce_lr_on_plateau, + callback_remote_monitor, + callback_tensorboard, + callback_terminate_on_naan

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/callback_early_stopping.html b/website/reference/callback_early_stopping.html new file mode 100644 index 000000000..bc5df8749 --- /dev/null +++ b/website/reference/callback_early_stopping.html @@ -0,0 +1,207 @@ + + + + + + + + +Stop training when a monitored quantity has stopped improving. — callback_early_stopping • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Stop training when a monitored quantity has stopped improving.

    + + +
    callback_early_stopping(monitor = "val_loss", min_delta = 0, patience = 0,
    +  verbose = 0, mode = c("auto", "min", "max"))
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + +
    monitor

    quantity to be monitored.

    min_delta

    minimum change in the monitored quantity to qualify as an +improvement, i.e. an absolute change of less than min_delta, will count as +no improvement.

    patience

    number of epochs with no improvement after which training +will be stopped.

    verbose

    verbosity mode, 0 or 1.

    mode

    one of "auto", "min", "max". In min mode, training will stop when +the quantity monitored has stopped decreasing; in max mode it will stop +when the quantity monitored has stopped increasing; in auto mode, the +direction is automatically inferred from the name of the monitored +quantity.

    + +

    See also

    + +

    Other callbacks: callback_csv_logger, + callback_lambda, + callback_learning_rate_scheduler, + callback_model_checkpoint, + callback_progbar_logger, + callback_reduce_lr_on_plateau, + callback_remote_monitor, + callback_tensorboard, + callback_terminate_on_naan

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/callback_lambda.html b/website/reference/callback_lambda.html new file mode 100644 index 000000000..b5ba5c6de --- /dev/null +++ b/website/reference/callback_lambda.html @@ -0,0 +1,211 @@ + + + + + + + + +Create a custom callback — callback_lambda • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

This callback is constructed with anonymous functions that will be called at the appropriate time. Note that the callbacks expect positional arguments, as follows:

      +
    • on_epoch_begin and on_epoch_end expect two positional arguments: epoch, logs

    • +
    • on_batch_begin and on_batch_end expect two positional arguments: batch, logs

    • +
    • on_train_begin and on_train_end expect one positional argument: logs

    • +
    + + +
    callback_lambda(on_epoch_begin = NULL, on_epoch_end = NULL,
    +  on_batch_begin = NULL, on_batch_end = NULL, on_train_begin = NULL,
    +  on_train_end = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + +
    on_epoch_begin

    called at the beginning of every epoch.

    on_epoch_end

    called at the end of every epoch.

    on_batch_begin

    called at the beginning of every batch.

    on_batch_end

    called at the end of every batch.

    on_train_begin

    called at the beginning of model training.

    on_train_end

    called at the end of model training.
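For example, a minimal sketch that logs progress from the anonymous functions:

callback <- callback_lambda(
  on_train_begin = function(logs) {
    cat("Training begins\n")
  },
  on_epoch_end = function(epoch, logs) {
    cat("Epoch", epoch, "loss:", logs$loss, "\n")
  }
)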

    + +

    See also

    + +

    Other callbacks: callback_csv_logger, + callback_early_stopping, + callback_learning_rate_scheduler, + callback_model_checkpoint, + callback_progbar_logger, + callback_reduce_lr_on_plateau, + callback_remote_monitor, + callback_tensorboard, + callback_terminate_on_naan

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/callback_learning_rate_scheduler.html b/website/reference/callback_learning_rate_scheduler.html new file mode 100644 index 000000000..d903ad6eb --- /dev/null +++ b/website/reference/callback_learning_rate_scheduler.html @@ -0,0 +1,184 @@ + + + + + + + + +Learning rate scheduler. — callback_learning_rate_scheduler • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Learning rate scheduler.

    + + +
    callback_learning_rate_scheduler(schedule)
    + +

    Arguments

    + + + + + + +
    schedule

    a function that takes an epoch index as input (integer, +indexed from 0) and returns a new learning rate as output (float).
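For example, a minimal sketch of a step-decay schedule (the decay constants are illustrative):

# halve the learning rate every 10 epochs, starting from 0.1
lr_schedule <- function(epoch) {
  0.1 * 0.5 ^ (epoch %/% 10)
}

callback_learning_rate_scheduler(lr_schedule)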

    + +

    See also

    + +

    Other callbacks: callback_csv_logger, + callback_early_stopping, + callback_lambda, + callback_model_checkpoint, + callback_progbar_logger, + callback_reduce_lr_on_plateau, + callback_remote_monitor, + callback_tensorboard, + callback_terminate_on_naan

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/callback_model_checkpoint.html b/website/reference/callback_model_checkpoint.html new file mode 100644 index 000000000..151aeed8f --- /dev/null +++ b/website/reference/callback_model_checkpoint.html @@ -0,0 +1,228 @@ + + + + + + + + +Save the model after every epoch. — callback_model_checkpoint • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

filepath can contain named formatting options, which will be filled with the value of epoch and keys in logs (passed in on_epoch_end). For example: if filepath is weights.{epoch:02d}-{val_loss:.2f}.hdf5, then the model checkpoints will be saved with the epoch number and the validation loss in the filename.
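For example, a minimal sketch using that filepath pattern:

callback_model_checkpoint(
  filepath = "weights.{epoch:02d}-{val_loss:.2f}.hdf5",
  monitor = "val_loss",
  save_best_only = TRUE
)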

    + + +
    callback_model_checkpoint(filepath, monitor = "val_loss", verbose = 0,
    +  save_best_only = FALSE, save_weights_only = FALSE, mode = c("auto",
    +  "min", "max"), period = 1)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    filepath

    string, path to save the model file.

    monitor

    quantity to monitor.

    verbose

    verbosity mode, 0 or 1.

    save_best_only

    if save_best_only=TRUE, the latest best model +according to the quantity monitored will not be overwritten.

    save_weights_only

    if TRUE, then only the model's weights will be +saved (save_model_weights_hdf5(filepath)), else the full model is saved +(save_model_hdf5(filepath)).

    mode

    one of "auto", "min", "max". If save_best_only=TRUE, the decision to +overwrite the current save file is made based on either the maximization or +the minimization of the monitored quantity. For val_acc, this should be +max, for val_loss this should be min, etc. In auto mode, the direction is +automatically inferred from the name of the monitored quantity.

    period

    Interval (number of epochs) between checkpoints.

    + +

    For example

    + +

if filepath is weights.{epoch:02d}-{val_loss:.2f}.hdf5, then the model checkpoints will be saved with the epoch number and the validation loss in the filename.

    + +

    See also

    + +

    Other callbacks: callback_csv_logger, + callback_early_stopping, + callback_lambda, + callback_learning_rate_scheduler, + callback_progbar_logger, + callback_reduce_lr_on_plateau, + callback_remote_monitor, + callback_tensorboard, + callback_terminate_on_naan

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/callback_progbar_logger.html b/website/reference/callback_progbar_logger.html new file mode 100644 index 000000000..2774b56f5 --- /dev/null +++ b/website/reference/callback_progbar_logger.html @@ -0,0 +1,184 @@ + + + + + + + + +Callback that prints metrics to stdout. — callback_progbar_logger • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Callback that prints metrics to stdout.

    + + +
    callback_progbar_logger(count_mode = "samples")
    + +

    Arguments

    + + + + + + +
    count_mode

    One of "steps" or "samples". Whether the progress bar +should count samples seens or steps (batches) seen.

    + +

    See also

    + +

    Other callbacks: callback_csv_logger, + callback_early_stopping, + callback_lambda, + callback_learning_rate_scheduler, + callback_model_checkpoint, + callback_reduce_lr_on_plateau, + callback_remote_monitor, + callback_tensorboard, + callback_terminate_on_naan

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/callback_reduce_lr_on_plateau.html b/website/reference/callback_reduce_lr_on_plateau.html new file mode 100644 index 000000000..5991c08fa --- /dev/null +++ b/website/reference/callback_reduce_lr_on_plateau.html @@ -0,0 +1,224 @@ + + + + + + + + +Reduce learning rate when a metric has stopped improving. — callback_reduce_lr_on_plateau • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Models often benefit from reducing the learning rate by a factor of 2-10 once +learning stagnates. This callback monitors a quantity and if no improvement +is seen for a 'patience' number of epochs, the learning rate is reduced.

    + + +
    callback_reduce_lr_on_plateau(monitor = "val_loss", factor = 0.1,
    +  patience = 10, verbose = 0, mode = c("auto", "min", "max"),
    +  epsilon = 1e-04, cooldown = 0, min_lr = 0)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    monitor

    quantity to be monitored.

    factor

factor by which the learning rate will be reduced. new_lr = lr * factor

    patience

    number of epochs with no improvement after which learning +rate will be reduced.

    verbose

    int. 0: quiet, 1: update messages.

    mode

    one of "auto", "min", "max". In min mode, lr will be reduced when +the quantity monitored has stopped decreasing; in max mode it will be +reduced when the quantity monitored has stopped increasing; in auto mode, +the direction is automatically inferred from the name of the monitored +quantity.

    epsilon

    threshold for measuring the new optimum, to only focus on +significant changes.

    cooldown

    number of epochs to wait before resuming normal operation +after lr has been reduced.

    min_lr

    lower bound on the learning rate.
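For example, a minimal sketch (the factor and patience values are illustrative):

# reduce the learning rate 5x when val_loss plateaus for 3 epochs
callback_reduce_lr_on_plateau(
  monitor = "val_loss",
  factor = 0.2,
  patience = 3,
  min_lr = 0.001
)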

    + +

    See also

    + +

    Other callbacks: callback_csv_logger, + callback_early_stopping, + callback_lambda, + callback_learning_rate_scheduler, + callback_model_checkpoint, + callback_progbar_logger, + callback_remote_monitor, + callback_tensorboard, + callback_terminate_on_naan

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/callback_remote_monitor.html b/website/reference/callback_remote_monitor.html new file mode 100644 index 000000000..854f8cf10 --- /dev/null +++ b/website/reference/callback_remote_monitor.html @@ -0,0 +1,197 @@ + + + + + + + + +Callback used to stream events to a server. — callback_remote_monitor • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Callback used to stream events to a server.

    + + +
    callback_remote_monitor(root = "http://localhost:9000",
    +  path = "/publish/epoch/end/", field = "data", headers = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + +
    root

    root url of the target server.

    path

    path relative to root to which the events will be sent.

    field

    JSON field under which the data will be stored.

    headers

Optional named list of custom HTTP headers. Defaults to: list(Accept = "application/json", "Content-Type" = "application/json").

    + +

    See also

    + +

    Other callbacks: callback_csv_logger, + callback_early_stopping, + callback_lambda, + callback_learning_rate_scheduler, + callback_model_checkpoint, + callback_progbar_logger, + callback_reduce_lr_on_plateau, + callback_tensorboard, + callback_terminate_on_naan

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/callback_tensorboard.html b/website/reference/callback_tensorboard.html new file mode 100644 index 000000000..1aa8f58b1 --- /dev/null +++ b/website/reference/callback_tensorboard.html @@ -0,0 +1,242 @@ + + + + + + + + +TensorBoard basic visualizations — callback_tensorboard • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    This callback writes a log for TensorBoard, which allows you to visualize +dynamic graphs of your training and test metrics, as well as activation +histograms for the different layers in your model.

    + + +
    callback_tensorboard(log_dir = NULL, histogram_freq = 0, batch_size = 32,
    +  write_graph = TRUE, write_grads = FALSE, write_images = FALSE,
    +  embeddings_freq = 0, embeddings_layer_names = NULL,
    +  embeddings_metadata = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    log_dir

    The path of the directory where to save the log files to be +parsed by Tensorboard. The default is NULL, which will use the active +run directory (if available) and otherwise will use "logs".

    histogram_freq

    frequency (in epochs) at which to compute activation +histograms for the layers of the model. If set to 0, histograms won't be +computed.

    batch_size

size of the batch of inputs to feed to the network for histogram computations.

    write_graph

whether to visualize the graph in TensorBoard. The log file can become quite large when write_graph is set to TRUE.

    write_grads

    whether to visualize gradient histograms in TensorBoard. +histogram_freq must be greater than 0.

    write_images

    whether to write model weights to visualize as image in +Tensorboard.

    embeddings_freq

    frequency (in epochs) at which selected embedding +layers will be saved.

    embeddings_layer_names

a list of names of layers to keep an eye on. If NULL or an empty list, all the embedding layers will be watched.

    embeddings_metadata

a named list which maps layer names to file names in which metadata for each embedding layer is saved. See the details about the metadata file format. If the same metadata file is used for all embedding layers, a single string can be passed.

    + +

    Details

    + +

    TensorBoard is a visualization tool provided with TensorFlow.

    +

    You can find more information about TensorBoard +here.

    + +

    See also

    + +

    Other callbacks: callback_csv_logger, + callback_early_stopping, + callback_lambda, + callback_learning_rate_scheduler, + callback_model_checkpoint, + callback_progbar_logger, + callback_reduce_lr_on_plateau, + callback_remote_monitor, + callback_terminate_on_naan

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/callback_terminate_on_naan.html b/website/reference/callback_terminate_on_naan.html new file mode 100644 index 000000000..9d713f225 --- /dev/null +++ b/website/reference/callback_terminate_on_naan.html @@ -0,0 +1,173 @@ + + + + + + + + +Callback that terminates training when a NaN loss is encountered. — callback_terminate_on_naan • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Callback that terminates training when a NaN loss is encountered.

    + + +
    callback_terminate_on_naan()
    + +

    See also

    + +

    Other callbacks: callback_csv_logger, + callback_early_stopping, + callback_lambda, + callback_learning_rate_scheduler, + callback_model_checkpoint, + callback_progbar_logger, + callback_reduce_lr_on_plateau, + callback_remote_monitor, + callback_tensorboard

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/compile.html b/website/reference/compile.html new file mode 100644 index 000000000..8c2b45dcd --- /dev/null +++ b/website/reference/compile.html @@ -0,0 +1,223 @@ + + + + + + + + +Configure a Keras model for training — compile • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Configure a Keras model for training

    + + +
    compile(object, optimizer, loss, metrics = NULL, loss_weights = NULL,
    +  sample_weight_mode = NULL, ...)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    object

    Model object to compile.

    optimizer

    Name of optimizer or optimizer object.

    loss

Name of objective function or objective function. If the model has multiple outputs, you can use a different loss on each output by passing a named list or a list of objectives. The loss value that will be minimized by the model will then be the sum of all individual losses.

    metrics

    List of metrics to be evaluated by the model during training +and testing. Typically you will use metrics='accuracy'. To specify +different metrics for different outputs of a multi-output model, you could +also pass a named list such as metrics=list(output_a = 'accuracy').

    loss_weights

Optional list specifying scalar coefficients to weight the loss contributions of different model outputs. The loss value that will be minimized by the model will then be the weighted sum of all individual losses, weighted by the loss_weights coefficients.

    sample_weight_mode

    If you need to do timestep-wise sample weighting +(2D weights), set this to "temporal". NULL defaults to sample-wise +weights (1D). If the model has multiple outputs, you can use a different +sample_weight_mode on each output by passing a list of modes.

    ...

    Additional named arguments passed to tf$Session$run.

    + +

    See also

    + +

    Other model functions: evaluate_generator, + evaluate, fit_generator, + fit, get_config, + get_layer, + keras_model_sequential, + keras_model, pop_layer, + predict.keras.engine.training.Model, + predict_generator, + predict_on_batch, + predict_proba, + summary.keras.engine.training.Model, + train_on_batch
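
Examples

For instance, a sketch of compiling a model for multi-class classification (the optimizer settings are illustrative):

model %>% compile(
  optimizer = optimizer_rmsprop(lr = 0.001),
  loss = "categorical_crossentropy",
  metrics = c("accuracy")
)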

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/constraint_maxnorm.html b/website/reference/constraint_maxnorm.html new file mode 100644 index 000000000..d928ce210 --- /dev/null +++ b/website/reference/constraint_maxnorm.html @@ -0,0 +1,188 @@ + + + + + + + + +MaxNorm weight constraint — constraint_maxnorm • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Constrains the weights incident to each hidden unit to have a norm less than +or equal to a desired value.

    + + +
    constraint_maxnorm(max_value = 2, axis = 0)
    + +

    Arguments

    + + + + + + + + + + +
    max_value

    The maximum norm for the incoming weights.

    axis

The axis along which to calculate weight norms. For instance, in a dense layer the weight matrix has shape (input_dim, output_dim); set axis to 0 to constrain each weight vector of length input_dim. In a convolution 2D layer with dim_ordering="tf", the weight tensor has shape (rows, cols, input_depth, output_depth); set axis to c(0, 1, 2) to constrain the weights of each filter tensor of size (rows, cols, input_depth).

    + +

    See also

    + +

Dropout: A Simple Way to Prevent Neural Networks from Overfitting (Srivastava, Hinton, et al., 2014)

    +

    Other constraints: constraint_minmaxnorm, + constraint_nonneg, + constraint_unitnorm
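
Examples

A sketch of applying the constraint to the kernel of a dense layer (the layer size is illustrative):

model %>% layer_dense(
  units = 64,
  kernel_constraint = constraint_maxnorm(max_value = 2)
)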

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/constraint_minmaxnorm.html b/website/reference/constraint_minmaxnorm.html new file mode 100644 index 000000000..3a7dd4a18 --- /dev/null +++ b/website/reference/constraint_minmaxnorm.html @@ -0,0 +1,199 @@ + + + + + + + + +MinMaxNorm weight constraint — constraint_minmaxnorm • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Constrains the weights incident to each hidden unit to have the norm between +a lower bound and an upper bound.

    + + +
    constraint_minmaxnorm(min_value = 0, max_value = 1, rate = 1, axis = 0)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + +
    min_value

    The minimum norm for the incoming weights.

    max_value

    The maximum norm for the incoming weights.

    rate

    The rate for enforcing the constraint: weights will be rescaled to +yield (1 - rate) * norm + rate * norm.clip(low, high). Effectively, this +means that rate=1.0 stands for strict enforcement of the constraint, while +rate<1.0 means that weights will be rescaled at each step to slowly move +towards a value inside the desired interval.

    axis

The axis along which to calculate weight norms. For instance, in a dense layer the weight matrix has shape (input_dim, output_dim); set axis to 0 to constrain each weight vector of length input_dim. In a convolution 2D layer with dim_ordering="tf", the weight tensor has shape (rows, cols, input_depth, output_depth); set axis to c(0, 1, 2) to constrain the weights of each filter tensor of size (rows, cols, input_depth).

    + +

    See also

    + +

    Other constraints: constraint_maxnorm, + constraint_nonneg, + constraint_unitnorm

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/constraint_nonneg.html b/website/reference/constraint_nonneg.html new file mode 100644 index 000000000..3c9a9978c --- /dev/null +++ b/website/reference/constraint_nonneg.html @@ -0,0 +1,167 @@ + + + + + + + + +NonNeg weight constraint — constraint_nonneg • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Constrains the weights to be non-negative.

    + + +
    constraint_nonneg()
    + +

    See also

    + +

    Other constraints: constraint_maxnorm, + constraint_minmaxnorm, + constraint_unitnorm

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/constraint_unitnorm.html b/website/reference/constraint_unitnorm.html new file mode 100644 index 000000000..52f498c5f --- /dev/null +++ b/website/reference/constraint_unitnorm.html @@ -0,0 +1,182 @@ + + + + + + + + +UnitNorm weight constraint — constraint_unitnorm • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Constrains the weights incident to each hidden unit to have unit norm.

    + + +
    constraint_unitnorm(axis = 0)
    + +

    Arguments

    + + + + + + +
    axis

The axis along which to calculate weight norms. For instance, in a dense layer the weight matrix has shape (input_dim, output_dim); set axis to 0 to constrain each weight vector of length input_dim. In a convolution 2D layer with dim_ordering="tf", the weight tensor has shape (rows, cols, input_depth, output_depth); set axis to c(0, 1, 2) to constrain the weights of each filter tensor of size (rows, cols, input_depth).

    + +

    See also

    + +

    Other constraints: constraint_maxnorm, + constraint_minmaxnorm, + constraint_nonneg

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/count_params.html b/website/reference/count_params.html new file mode 100644 index 000000000..edeab8e65 --- /dev/null +++ b/website/reference/count_params.html @@ -0,0 +1,183 @@ + + + + + + + + +Count the total number of scalars composing the weights. — count_params • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Count the total number of scalars composing the weights.

    + + +
    count_params(object)
    + +

    Arguments

    + + + + + + +
    object

    Layer or model object

    + +

    Value

    + +

    An integer count

    + +

    See also

    + +

    Other layer methods: get_config, + get_input_at, get_weights, + reset_states
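
Examples

For example, counting the parameters of a one-layer model:

model <- keras_model_sequential()
model %>% layer_dense(units = 32, input_shape = c(784))
count_params(model)  # 784 * 32 weights + 32 biases = 25120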

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/create_layer.html b/website/reference/create_layer.html new file mode 100644 index 000000000..920e32f8f --- /dev/null +++ b/website/reference/create_layer.html @@ -0,0 +1,192 @@ + + + + + + + + +Create a Keras Layer — create_layer • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Create a Keras Layer

    + + +
    create_layer(layer_class, object, args = list())
    + +

    Arguments

    + + + + + + + + + + + + + + +
    layer_class

    Python layer class or R6 class of type KerasLayer

    object

    Object to compose layer with. This is either a +keras_model_sequential() to add the layer to, or another Layer which +this layer will call.

    args

    List of arguments to layer constructor function

    + +

    Value

    + +

    A Keras layer

    + +

    Note

    + +

    The object parameter can be missing, in which case the +layer is created without a connection to an existing graph.

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/dataset_boston_housing.html b/website/reference/dataset_boston_housing.html new file mode 100644 index 000000000..5a11d6f41 --- /dev/null +++ b/website/reference/dataset_boston_housing.html @@ -0,0 +1,199 @@ + + + + + + + + +Boston housing price regression dataset — dataset_boston_housing • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Dataset taken from the StatLib library which is maintained at Carnegie Mellon +University.

    + + +
    dataset_boston_housing(path = "boston_housing.npz", seed = 113L,
    +  test_split = 0.2)
    + +

    Arguments

    + + + + + + + + + + + + + + +
    path

    Path where to cache the dataset locally (relative to +~/.keras/datasets).

    seed

    Random seed for shuffling the data before computing the test +split.

    test_split

    fraction of the data to reserve as test set.

    + +

    Value

    + +

    Lists of training and test data: train$x, train$y, test$x, test$y.

    +

    Samples contain 13 attributes of houses at different locations around +the Boston suburbs in the late 1970s. Targets are the median values of the +houses at a location (in k$).

    + +

    See also

    + +

    Other datasets: dataset_cifar100, + dataset_cifar10, + dataset_imdb, dataset_mnist, + dataset_reuters
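
Examples

A sketch of loading the dataset and unpacking the pieces:

boston <- dataset_boston_housing()
x_train <- boston$train$x  # matrix with 13 columns of predictors
y_train <- boston$train$y  # median home values in k$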

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/dataset_cifar10.html b/website/reference/dataset_cifar10.html new file mode 100644 index 000000000..d34c4c2b3 --- /dev/null +++ b/website/reference/dataset_cifar10.html @@ -0,0 +1,179 @@ + + + + + + + + +CIFAR10 small image classification — dataset_cifar10 • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Dataset of 50,000 32x32 color training images, labeled over 10 categories, +and 10,000 test images.

    + + +
    dataset_cifar10()
    + +

    Value

    + +

    Lists of training and test data: train$x, train$y, test$x, test$y.

    +

    The x data is an array of RGB image data with shape (num_samples, 3, 32, +32).

    +

    The y data is an array of category labels (integers in range 0-9) with +shape (num_samples).

    + +

    See also

    + +

    Other datasets: dataset_boston_housing, + dataset_cifar100, + dataset_imdb, dataset_mnist, + dataset_reuters

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/dataset_cifar100.html b/website/reference/dataset_cifar100.html new file mode 100644 index 000000000..1bbcd43b5 --- /dev/null +++ b/website/reference/dataset_cifar100.html @@ -0,0 +1,187 @@ + + + + + + + + +CIFAR100 small image classification — dataset_cifar100 • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Dataset of 50,000 32x32 color training images, labeled over 100 categories, +and 10,000 test images.

    + + +
    dataset_cifar100(label_mode = c("fine", "coarse"))
    + +

    Arguments

    + + + + + + +
    label_mode

    one of "fine", "coarse".

    + +

    Value

    + +

    Lists of training and test data: train$x, train$y, test$x, test$y.

    +

    The x data is an array of RGB image data with shape (num_samples, 3, 32, 32).

    +

    The y data is an array of category labels with shape (num_samples).

    + +

    See also

    + +

    Other datasets: dataset_boston_housing, + dataset_cifar10, + dataset_imdb, dataset_mnist, + dataset_reuters

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/dataset_imdb.html b/website/reference/dataset_imdb.html new file mode 100644 index 000000000..b68c49056 --- /dev/null +++ b/website/reference/dataset_imdb.html @@ -0,0 +1,236 @@ + + + + + + + + +IMDB Movie reviews sentiment classification — dataset_imdb • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

Dataset of 25,000 movie reviews from IMDB, labeled by sentiment (positive/negative). Reviews have been preprocessed, and each review is encoded as a sequence of word indexes (integers). For convenience, words are indexed by overall frequency in the dataset, so that for instance the integer "3" encodes the 3rd most frequent word in the data. This allows for quick filtering operations such as: "only consider the top 10,000 most common words, but eliminate the top 20 most common words".

    + + +
    dataset_imdb(path = "imdb.npz", num_words = NULL, skip_top = 0L,
    +  maxlen = NULL, seed = 113L, start_char = 1L, oov_char = 2L,
    +  index_from = 3L)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    path

Where to cache the data (relative to ~/.keras/datasets).

    num_words

Max number of words to include. Words are ranked by how often they occur (in the training set) and only the most frequent words are kept.

    skip_top

Skip the top N most frequently occurring words (which may not be informative).

    maxlen

    Truncate sequences after this length.

    seed

    random seed for sample shuffling.

    start_char

    The start of a sequence will be marked with this character. +Set to 1 because 0 is usually the padding character.

    oov_char

    Words that were cut out because of the num_words or +skip_top limit will be replaced with this character.

    index_from

    Index actual words with this index and higher.

    + +

    Value

    + +

    Lists of training and test data: train$x, train$y, test$x, test$y.

    +

The x data includes integer sequences. If the num_words argument was specified, the maximum possible index value is num_words - 1. If the maxlen argument was specified, the largest possible sequence length is maxlen.

    +

    The y data includes a set of integer labels (0 or 1).

    + +

    Details

    + +

    As a convention, "0" does not stand for a specific word, but instead is used +to encode any unknown word.

    + +

    See also

    + +

    Other datasets: dataset_boston_housing, + dataset_cifar100, + dataset_cifar10, + dataset_mnist, + dataset_reuters
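
Examples

A sketch restricting the vocabulary and padding the reviews to a common length with pad_sequences():

imdb <- dataset_imdb(num_words = 10000)
x_train <- pad_sequences(imdb$train$x, maxlen = 80)
y_train <- imdb$train$y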

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/dataset_mnist.html b/website/reference/dataset_mnist.html new file mode 100644 index 000000000..284f7ac10 --- /dev/null +++ b/website/reference/dataset_mnist.html @@ -0,0 +1,186 @@ + + + + + + + + +MNIST database of handwritten digits — dataset_mnist • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Dataset of 60,000 28x28 grayscale images of the 10 digits, along with a test set of 10,000 images.

    + + +
    dataset_mnist(path = "mnist.npz")
    + +

    Arguments

    + + + + + + +
    path

    Path where to cache the dataset locally (relative to ~/.keras/datasets).

    + +

    Value

    + +

    Lists of training and test data: train$x, train$y, test$x, test$y, where +x is an array of grayscale image data with shape (num_samples, 28, 28) and y +is an array of digit labels (integers in range 0-9) with shape (num_samples).

    + +

    See also

    + +

    Other datasets: dataset_boston_housing, + dataset_cifar100, + dataset_cifar10, + dataset_imdb, dataset_reuters
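
Examples

For example, loading the data and flattening each image into a length-784 vector:

mnist <- dataset_mnist()
x_train <- mnist$train$x / 255                # scale pixel values to [0, 1]
dim(x_train) <- c(nrow(x_train), 784)         # flatten the 28x28 images
y_train <- to_categorical(mnist$train$y, 10)  # one-hot encode the digit labels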

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/dataset_reuters.html b/website/reference/dataset_reuters.html new file mode 100644 index 000000000..19781f226 --- /dev/null +++ b/website/reference/dataset_reuters.html @@ -0,0 +1,231 @@ + + + + + + + + +Reuters newswire topics classification — dataset_reuters • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

Dataset of 11,228 newswires from Reuters, labeled over 46 topics. As with dataset_imdb(), each wire is encoded as a sequence of word indexes (same conventions).

    + + +
    dataset_reuters(path = "reuters.npz", num_words = NULL, skip_top = 0L,
    +  maxlen = NULL, test_split = 0.2, seed = 113L, start_char = 1L,
    +  oov_char = 2L, index_from = 3L)
    +
    +dataset_reuters_word_index(path = "reuters_word_index.pkl")
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    path

Where to cache the data (relative to ~/.keras/datasets).

    num_words

Max number of words to include. Words are ranked by how often they occur (in the training set) and only the most frequent words are kept.

    skip_top

Skip the top N most frequently occurring words (which may not be informative).

    maxlen

    Truncate sequences after this length.

    test_split

    Fraction of the dataset to be used as test data.

    seed

    Random seed for sample shuffling.

    start_char

    The start of a sequence will be marked with this character. +Set to 1 because 0 is usually the padding character.

    oov_char

Words that were cut out because of the num_words or skip_top limit will be replaced with this character.

    index_from

Index actual words with this index and higher.

    + +

    Value

    + +

Lists of training and test data: train$x, train$y, test$x, test$y with the same format as dataset_imdb(). The dataset_reuters_word_index() function returns a list where the names are words and the values are integers, e.g. word_index[["giraffe"]] might return 1234.

    +

    [["giraffe"]: R:[

    + +

    See also

    + +

    Other datasets: dataset_boston_housing, + dataset_cifar100, + dataset_cifar10, + dataset_imdb, dataset_mnist

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/evaluate.html b/website/reference/evaluate.html new file mode 100644 index 000000000..6120d34d4 --- /dev/null +++ b/website/reference/evaluate.html @@ -0,0 +1,223 @@ + + + + + + + + +Evaluate a Keras model — evaluate • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Evaluate a Keras model

    + + +
    evaluate(object, x, y, batch_size = 32, verbose = 1, sample_weight = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + +
    object

    Model object to evaluate

    x

    Vector, matrix, or array of training data (or list if the model has +multiple inputs). If all inputs in the model are named, you can also pass a +list mapping input names to data.

    y

    Vector, matrix, or array of target data (or list if the model has +multiple outputs). If all outputs in the model are named, you can also pass +a list mapping output names to data.

    batch_size

    Number of samples per gradient update.

    verbose

    Verbosity mode (0 = silent, 1 = verbose, 2 = one log line per +epoch).

    sample_weight

    Optional array of the same length as x, containing +weights to apply to the model's loss for each sample. In the case of +temporal data, you can pass a 2D array with shape (samples, +sequence_length), to apply a different weight to every timestep of every +sample. In this case you should make sure to specify +sample_weight_mode="temporal" in compile().

    + +

    Value

    + +

    Named list of model test loss (or losses for models with multiple outputs) +and model metrics.

    + +

    See also

    + +

    Other model functions: compile, + evaluate_generator, + fit_generator, fit, + get_config, get_layer, + keras_model_sequential, + keras_model, pop_layer, + predict.keras.engine.training.Model, + predict_generator, + predict_on_batch, + predict_proba, + summary.keras.engine.training.Model, + train_on_batch
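
Examples

For example (assuming a compiled model and held-out test arrays):

scores <- model %>% evaluate(x_test, y_test, batch_size = 128)
scores  # named list, e.g. scores$loss and scores$acc for metrics = c("accuracy")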

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/evaluate_generator.html b/website/reference/evaluate_generator.html new file mode 100644 index 000000000..8d5663a40 --- /dev/null +++ b/website/reference/evaluate_generator.html @@ -0,0 +1,208 @@ + + + + + + + + +Evaluates the model on a data generator. — evaluate_generator • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    The generator should return the same kind of data as accepted by +test_on_batch().

    + + +
    evaluate_generator(object, generator, steps, max_queue_size = 10)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + +
    object

    Model object to evaluate

    generator

    Generator yielding lists (inputs, targets) or (inputs, +targets, sample_weights)

    steps

    Total number of steps (batches of samples) to yield from +generator before stopping.

    max_queue_size

    maximum size for the generator queue

    + +

    Value

    + +

    Named list of model test loss (or losses for models with multiple outputs) +and model metrics.

    + +

    See also

    + +

    Other model functions: compile, + evaluate, fit_generator, + fit, get_config, + get_layer, + keras_model_sequential, + keras_model, pop_layer, + predict.keras.engine.training.Model, + predict_generator, + predict_on_batch, + predict_proba, + summary.keras.engine.training.Model, + train_on_batch

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/fit.html b/website/reference/fit.html new file mode 100644 index 000000000..bfa8ae3d4 --- /dev/null +++ b/website/reference/fit.html @@ -0,0 +1,269 @@ + + + + + + + + +Train a Keras model — fit • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Trains the model for a fixed number of epochs (iterations on a dataset).

    + + +
    fit(object, x, y, batch_size = 32, epochs = 10, verbose = 1,
    +  callbacks = NULL, view_metrics = getOption("keras.view_metrics", default =
    +  "auto"), validation_split = 0, validation_data = NULL, shuffle = TRUE,
    +  class_weight = NULL, sample_weight = NULL, initial_epoch = 0, ...)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    object

    Model to train.

    x

    Vector, matrix, or array of training data (or list if the model has +multiple inputs). If all inputs in the model are named, you can also pass a +list mapping input names to data.

    y

    Vector, matrix, or array of target data (or list if the model has +multiple outputs). If all outputs in the model are named, you can also pass +a list mapping output names to data.

    batch_size

    Number of samples per gradient update.

    epochs

    Number of times to iterate over the training data arrays.

    verbose

    Verbosity mode (0 = silent, 1 = verbose, 2 = one log line per +epoch).

    callbacks

    List of callbacks to be called during training.

    view_metrics

View realtime plot of training metrics (by epoch). The default ("auto") will display the plot when running within RStudio, when metrics were specified during model compile(), and when epochs > 1 and verbose > 0. Use the global keras.view_metrics option to establish a different default.

    validation_split

    Float between 0 and 1: fraction of the training data +to be used as validation data. The model will set apart this fraction of +the training data, will not train on it, and will evaluate the loss and any +model metrics on this data at the end of each epoch.

    validation_data

    Data on which to evaluate the loss and any model +metrics at the end of each epoch. The model will not be trained on this +data. This could be a list (x_val, y_val) or a list (x_val, y_val, +val_sample_weights).

    shuffle

    TRUE to shuffle the training data before each epoch.

    class_weight

    Optional named list mapping indices (integers) to a +weight (float) to apply to the model's loss for the samples from this class +during training. This can be useful to tell the model to "pay more +attention" to samples from an under-represented class.

    sample_weight

    Optional array of the same length as x, containing +weights to apply to the model's loss for each sample. In the case of +temporal data, you can pass a 2D array with shape (samples, +sequence_length), to apply a different weight to every timestep of every +sample. In this case you should make sure to specify +sample_weight_mode="temporal" in compile().

    initial_epoch

    epoch at which to start training (useful for resuming a +previous training run).

    ...

    Unused

    + +

    See also

    + +

    Other model functions: compile, + evaluate_generator, evaluate, + fit_generator, get_config, + get_layer, + keras_model_sequential, + keras_model, pop_layer, + predict.keras.engine.training.Model, + predict_generator, + predict_on_batch, + predict_proba, + summary.keras.engine.training.Model, + train_on_batch
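
Examples

A sketch of a typical training call with a held-out validation split (hyperparameters are illustrative):

history <- model %>% fit(
  x_train, y_train,
  epochs = 30,
  batch_size = 128,
  validation_split = 0.2
)
plot(history)  # plot loss and metrics by epoch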

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/fit_generator.html b/website/reference/fit_generator.html new file mode 100644 index 000000000..2a9e59d21 --- /dev/null +++ b/website/reference/fit_generator.html @@ -0,0 +1,268 @@ + + + + + + + + +Fits the model on data yielded batch-by-batch by a generator. — fit_generator • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    The generator is run in parallel to the model, for efficiency. For instance, +this allows you to do real-time data augmentation on images on CPU in +parallel to training your model on GPU.

    + + +
    fit_generator(object, generator, steps_per_epoch, epochs = 1, verbose = 1,
    +  callbacks = NULL, view_metrics = getOption("keras.view_metrics", default =
    +  "auto"), validation_data = NULL, validation_steps = NULL,
    +  class_weight = NULL, max_queue_size = 10, initial_epoch = 0)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    object

    Keras model object

    generator

    A generator (e.g. like the one provided by +flow_images_from_directory() or a custom R generator function).

    +

    The output of the generator must be a list of one of these forms:

- (inputs, targets)
- (inputs, targets, sample_weights)
    +
    + +

    Note that the generator should call the to_numpy_array() function on its +results prior to returning them (this ensures that arrays are provided in +'C' order and using the default floating point type for the backend.)

    +

    All arrays should contain the same number of samples. The generator is expected +to loop over its data indefinitely. An epoch finishes when steps_per_epoch +batches have been seen by the model.

    steps_per_epoch

Total number of steps (batches of samples) to yield from generator before declaring one epoch finished and starting the next epoch. It should typically be equal to the number of unique samples in your dataset divided by the batch size.

    epochs

    integer, total number of iterations on the data.

    verbose

    Verbosity mode (0 = silent, 1 = verbose, 2 = one log line per +epoch).

    callbacks

    list of callbacks to be called during training.

    view_metrics

View realtime plot of training metrics (by epoch). The default ("auto") will display the plot when running within RStudio, when metrics were specified during model compile(), and when epochs > 1 and verbose > 0. Use the global keras.view_metrics option to establish a different default.

    validation_data

    this can be either:

      +
• a generator for the validation data
• a list (inputs, targets)
• a list (inputs, targets, sample_weights)
    validation_steps

    Only relevant if validation_data is a generator. +Total number of steps (batches of samples) to yield from generator before +stopping.

    class_weight

    dictionary mapping class indices to a weight for the +class.

    max_queue_size

    maximum size for the generator queue

    initial_epoch

    epoch at which to start training (useful for resuming a +previous training run)

    + +

    Value

    + +

    Training history object (invisibly)

    + +

    See also

    + +

    Other model functions: compile, + evaluate_generator, evaluate, + fit, get_config, + get_layer, + keras_model_sequential, + keras_model, pop_layer, + predict.keras.engine.training.Model, + predict_generator, + predict_on_batch, + predict_proba, + summary.keras.engine.training.Model, + train_on_batch
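
Examples

A minimal sketch of a custom R generator over in-memory arrays (batch_gen is a hypothetical helper; x_train and y_train are assumed to exist):

batch_gen <- function(x, y, batch_size = 32) {
  function() {
    idx <- sample(nrow(x), batch_size)
    list(x[idx, , drop = FALSE], y[idx])
  }
}
model %>% fit_generator(
  batch_gen(x_train, y_train),
  steps_per_epoch = nrow(x_train) %/% 32,
  epochs = 5
)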

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/fit_image_data_generator.html b/website/reference/fit_image_data_generator.html new file mode 100644 index 000000000..16e6b806d --- /dev/null +++ b/website/reference/fit_image_data_generator.html @@ -0,0 +1,200 @@ + + + + + + + + +Fit image data generator internal statistics to some sample data. — fit_image_data_generator • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Required for featurewise_center, featurewise_std_normalization +and zca_whitening.

    + + +
    fit_image_data_generator(object, x, augment = FALSE, rounds = 1,
    +  seed = NULL, ...)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + +
    object

    image_data_generator()

    x

    array, the data to fit on (should have rank 4). In case of grayscale data, +the channels axis should have value 1, and in case of RGB data, it should have value 3.

    augment

    Whether to fit on randomly augmented samples

    rounds

    If augment, how many augmentation passes to do over the data

    seed

    random seed.

    ...

    Unused

    + +

    See also

    + +

    Other image preprocessing: flow_images_from_data, + flow_images_from_directory, + image_load, image_to_array
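
Examples

For example, a sketch of computing featurewise statistics from sample data (x_train is assumed to be a rank-4 image array):

datagen <- image_data_generator(
  featurewise_center = TRUE,
  featurewise_std_normalization = TRUE
)
fit_image_data_generator(datagen, x_train)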

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/fit_text_tokenizer.html b/website/reference/fit_text_tokenizer.html new file mode 100644 index 000000000..4d9172474 --- /dev/null +++ b/website/reference/fit_text_tokenizer.html @@ -0,0 +1,199 @@ + + + + + + + + +Update tokenizer internal vocabulary based on a list of texts or list of +sequences. — fit_text_tokenizer • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Update tokenizer internal vocabulary based on a list of texts or list of +sequences.

    + + +
    fit_text_tokenizer(object, x, ...)
    + +

    Arguments

    + + + + + + + + + + + + + + +
    object

    Tokenizer returned by text_tokenizer()

    x

Vector/list of strings, or a generator of strings (for memory efficiency). Alternatively, a list of "sequences" (a sequence is a list of integer word indices).

    ...

    Unused

    + +

    Note

    + +

    Required before using texts_to_sequences(), texts_to_matrix(), or +sequences_to_matrix().

    + +

    See also

    + +

    Other text tokenization: sequences_to_matrix, + text_tokenizer, + texts_to_matrix, + texts_to_sequences_generator, + texts_to_sequences
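
Examples

For example, a sketch fitting a tokenizer on a character vector of texts (texts is hypothetical):

tokenizer <- text_tokenizer(num_words = 1000)
tokenizer %>% fit_text_tokenizer(texts)
sequences <- texts_to_sequences(tokenizer, texts)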

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/flow_images_from_data.html b/website/reference/flow_images_from_data.html new file mode 100644 index 000000000..dc5fb2109 --- /dev/null +++ b/website/reference/flow_images_from_data.html @@ -0,0 +1,230 @@ + + + + + + + + +Generates batches of augmented/normalized data from image data and labels — flow_images_from_data • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Generates batches of augmented/normalized data from image data and labels

    + + +
    flow_images_from_data(x, y = NULL, generator = image_data_generator(),
    +  batch_size = 32, shuffle = TRUE, seed = NULL, save_to_dir = NULL,
    +  save_prefix = "", save_format = "png")
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    x

    data. Should have rank 4. In case of grayscale data, the channels +axis should have value 1, and in case of RGB data, it should have value 3.

    y

    labels (can be NULL if no labels are required)

    generator

    Image data generator to use for augmenting/normalizing image +data.

    batch_size

    int (default: 32).

    shuffle

boolean (default: TRUE).

    seed

    int (default: NULL).

    save_to_dir

NULL or str (default: NULL). This allows you to optionally specify a directory to which to save the augmented pictures being generated (useful for visualizing what you are doing).

    save_prefix

    str (default: ''). Prefix to use for filenames of saved +pictures (only relevant if save_to_dir is set).

    save_format

    one of "png", "jpeg" (only relevant if save_to_dir is +set). Default: "png".

    + +

    Details

    + +

    Yields batches indefinitely, in an infinite loop.

    + +

    Yields

    + +

(x, y) where x is an array of image data and y is an array of corresponding labels. The generator loops indefinitely.

    + +

    See also

    + +

    Other image preprocessing: fit_image_data_generator, + flow_images_from_directory, + image_load, image_to_array

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/flow_images_from_directory.html b/website/reference/flow_images_from_directory.html new file mode 100644 index 000000000..5f371279d --- /dev/null +++ b/website/reference/flow_images_from_directory.html @@ -0,0 +1,263 @@ + + + + + + + + +Generates batches of data from images in a directory (with optional +augmented/normalized data) — flow_images_from_directory • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Generates batches of data from images in a directory (with optional +augmented/normalized data)

    + + +
    flow_images_from_directory(directory, generator = image_data_generator(),
    +  target_size = c(256, 256), color_mode = "rgb", classes = NULL,
    +  class_mode = "categorical", batch_size = 32, shuffle = TRUE,
    +  seed = NULL, save_to_dir = NULL, save_prefix = "",
    +  save_format = "png", follow_links = FALSE)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    directory

path to the target directory. It should contain one subdirectory per class. Any PNG, JPG, or BMP images inside each of the subdirectories will be included in the generator. See this script for more details.

    generator

    Image data generator (default generator does no data +augmentation/normalization transformations)

    target_size

integer vector, default: c(256, 256). The dimensions to which all images found will be resized.

    color_mode

    one of "grayscale", "rbg". Default: "rgb". Whether the +images will be converted to have 1 or 3 color channels.

    classes

optional list of class subdirectories (e.g. c('dogs', 'cats')). Default: NULL. If not provided, the list of classes will be automatically inferred (and the order of the classes, which will map to the label indices, will be alphanumeric).

    class_mode

    one of "categorical", "binary", "sparse" or NULL. +Default: "categorical". Determines the type of label arrays that are +returned: "categorical" will be 2D one-hot encoded labels, "binary" will be +1D binary labels, "sparse" will be 1D integer labels. If NULL, no labels +are returned (the generator will only yield batches of image data, which is +useful to use predict_generator(), evaluate_generator(), etc.).

    batch_size

    int (default: 32).

    shuffle

boolean (default: TRUE).

    seed

    int (default: NULL).

    save_to_dir

NULL or str (default: NULL). This allows you to optionally specify a directory to which to save the augmented pictures being generated (useful for visualizing what you are doing).

    save_prefix

    str (default: ''). Prefix to use for filenames of saved +pictures (only relevant if save_to_dir is set).

    save_format

    one of "png", "jpeg" (only relevant if save_to_dir is +set). Default: "png".

    follow_links

    whether to follow symlinks inside class subdirectories +(default: FALSE)

    + +

    Details

    + +

    Yields batches indefinitely, in an infinite loop.

    + +

    Yields

    + +

(x, y) where x is an array of image data and y is an array of corresponding labels. The generator loops indefinitely.

    + +

    See also

    + +

    Other image preprocessing: fit_image_data_generator, + flow_images_from_data, + image_load, image_to_array
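
Examples

A sketch pairing the generator with fit_generator() (the directory layout and step counts are hypothetical):

datagen <- image_data_generator(rescale = 1/255, horizontal_flip = TRUE)
train_gen <- flow_images_from_directory(
  "data/train",                 # one subdirectory per class
  generator = datagen,
  target_size = c(150, 150),
  batch_size = 32,
  class_mode = "binary"
)
model %>% fit_generator(train_gen, steps_per_epoch = 100, epochs = 10)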

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/get_config.html b/website/reference/get_config.html new file mode 100644 index 000000000..3347aa9ae --- /dev/null +++ b/website/reference/get_config.html @@ -0,0 +1,216 @@ + + + + + + + + +Layer/Model configuration — get_config • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    A layer config is an object returned from get_config() that contains the +configuration of a layer or model. The same layer or model can be +reinstantiated later (without its trained weights) from this configuration +using from_config(). The config does not include connectivity information, +nor the class name (those are handled externally).

    + + +
    get_config(object)
    +
    +from_config(config)
    + +

    Arguments

    + + + + + + + + + + +
    object

    Layer or model object

    config

    Object with layer or model configuration

    + +

    Value

    + +

get_config() returns an object with the configuration, from_config() returns a re-instantiation of the object.

    + +

    Note

    + +

    Objects returned from get_config() are not serializable. Therefore, +if you want to save and restore a model across sessions, you can use the +model_to_json() or model_to_yaml() functions (for model configuration +only, not weights) or the save_model_hdf5() function to save the model +configuration and weights to a file.

    + +

    See also

    + +

    Other model functions: compile, + evaluate_generator, evaluate, + fit_generator, fit, + get_layer, + keras_model_sequential, + keras_model, pop_layer, + predict.keras.engine.training.Model, + predict_generator, + predict_on_batch, + predict_proba, + summary.keras.engine.training.Model, + train_on_batch

    +

    Other layer methods: count_params, + get_input_at, get_weights, + reset_states
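
Examples

For instance, a sketch of cloning a model's architecture (the trained weights are not copied):

config <- get_config(model)
fresh_model <- from_config(config)  # same architecture, newly initialized weights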

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/get_file.html b/website/reference/get_file.html new file mode 100644 index 000000000..b7a904b0d --- /dev/null +++ b/website/reference/get_file.html @@ -0,0 +1,216 @@ + + + + + + + + +Downloads a file from a URL if it not already in the cache. — get_file • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

Passing the MD5 hash will verify the file after download, as well as when it is already present in the cache.

    + + +
    get_file(fname, origin, file_hash = NULL, cache_subdir = "datasets",
    +  hash_algorithm = "auto", extract = FALSE, archive_format = "auto",
    +  cache_dir = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    fname

    Name of the file. If an absolute path /path/to/file.txt is +specified the file will be saved at that location.

    origin

    Original URL of the file.

    file_hash

    The expected hash string of the file after download. The +sha256 and md5 hash algorithms are both supported.

    cache_subdir

    Subdirectory under the Keras cache dir where the file is +saved. If an absolute path /path/to/folder is specified the file will be +saved at that location.

    hash_algorithm

    Select the hash algorithm to verify the file. options +are 'md5', 'sha256', and 'auto'. The default 'auto' detects the hash +algorithm in use.

    extract

TRUE tries extracting the file as an archive, like tar or zip.

    archive_format

Archive format to try for extracting the file. Options are 'auto', 'tar', 'zip', and NULL. 'tar' includes tar, tar.gz, and tar.bz files. The default 'auto' is c('tar', 'zip'). NULL or an empty list will return no matches found.

    cache_dir

    Location to store cached files, when NULL it defaults to +the Keras configuration directory.

    + +

    Value

    + +

    Path to the downloaded file
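
Examples

A sketch of downloading and caching a remote archive (the file name and URL are hypothetical):

path <- get_file(
  "pets.zip",
  origin = "https://example.com/datasets/pets.zip",
  extract = TRUE
)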

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/get_input_at.html b/website/reference/get_input_at.html new file mode 100644 index 000000000..e8b8fc59f --- /dev/null +++ b/website/reference/get_input_at.html @@ -0,0 +1,204 @@ + + + + + + + + +Retrieve tensors for layers with multiple nodes — get_input_at • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

Whenever you are calling a layer on some input, you are creating a new tensor (the output of the layer), and you are adding a "node" to the layer, linking the input tensor to the output tensor. When you are calling the same layer multiple times, that layer owns multiple nodes indexed as 1, 2, 3. These functions enable you to retrieve various tensor properties of layers with multiple nodes.

    + + +
    get_input_at(object, node_index)
    +
    +get_output_at(object, node_index)
    +
    +get_input_shape_at(object, node_index)
    +
    +get_output_shape_at(object, node_index)
    +
    +get_input_mask_at(object, node_index)
    +
    +get_output_mask_at(object, node_index)
    + +

    Arguments

    + + + + + + + + + + +
    object

    Layer or model object

    node_index

    Integer, index of the node from which to retrieve the +attribute. E.g. node_index = 1 will correspond to the first time the +layer was called.

    + +

    Value

    + +

    A tensor (or list of tensors if the layer has multiple inputs/outputs).

    + +

    See also

    + +

    Other layer methods: count_params, + get_config, get_weights, + reset_states
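
Examples

A sketch of a layer shared across two inputs, which creates two nodes (assumes the standalone layer can be called on tensors directly; sizes are illustrative):

input_a <- layer_input(shape = c(128))
input_b <- layer_input(shape = c(128))
shared_dense <- layer_dense(units = 32)  # standalone layer, no graph connection yet
output_a <- shared_dense(input_a)        # node 1
output_b <- shared_dense(input_b)        # node 2
get_input_at(shared_dense, 1)            # the tensor input_a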

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/get_layer.html b/website/reference/get_layer.html new file mode 100644 index 000000000..09bbaaf88 --- /dev/null +++ b/website/reference/get_layer.html @@ -0,0 +1,201 @@ + + + + + + + + +Retrieves a layer based on either its name (unique) or index. — get_layer • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Indices are based on order of horizontal graph traversal (bottom-up) and +are 0-based.

    + + +
    get_layer(object, name = NULL, index = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + +
    object

    Keras model object

    name

    String, name of layer.

    index

    Integer, index of layer (0-based)

    + +

    Value

    + +

    A layer instance.

    + +

    See also

    + +

    Other model functions: compile, + evaluate_generator, evaluate, + fit_generator, fit, + get_config, + keras_model_sequential, + keras_model, pop_layer, + predict.keras.engine.training.Model, + predict_generator, + predict_on_batch, + predict_proba, + summary.keras.engine.training.Model, + train_on_batch
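
Examples

For example (assuming model contains a layer named "dense_1"):

dense <- get_layer(model, name = "dense_1")
get_weights(dense)  # inspect the retrieved layer's weights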

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/get_weights.html b/website/reference/get_weights.html new file mode 100644 index 000000000..d970eed34 --- /dev/null +++ b/website/reference/get_weights.html @@ -0,0 +1,188 @@ + + + + + + + + +Layer/Model weights as R arrays — get_weights • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Layer/Model weights as R arrays

    + + +
    get_weights(object)
    +
    +set_weights(object, weights)
    + +

    Arguments

    + + + + + + + + + + +
    object

    Layer or model object

    weights

    Weights as R array

    + +

    See also

    + +

    Other model persistence: model_to_json, + model_to_yaml, + save_model_hdf5, + save_model_weights_hdf5, + serialize_model

    +

    Other layer methods: count_params, + get_config, get_input_at, + reset_states

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/hdf5_matrix.html b/website/reference/hdf5_matrix.html new file mode 100644 index 000000000..4530a60f5 --- /dev/null +++ b/website/reference/hdf5_matrix.html @@ -0,0 +1,199 @@ + + + + + + + + +Representation of HDF5 dataset to be used instead of an R array — hdf5_matrix • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Representation of HDF5 dataset to be used instead of an R array

    + + +
    hdf5_matrix(datapath, dataset, start = 0, end = NULL, normalizer = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + +
    datapath

string, path to an HDF5 file

    dataset

    string, name of the HDF5 dataset in the file specified in datapath

    start

    int, start of desired slice of the specified dataset

    end

    int, end of desired slice of the specified dataset

    normalizer

    function to be called on data when retrieved

    + +

    Value

    + +

    An array-like HDF5 dataset.

    + +

    Details

    + +

    Providing start and end allows use of a slice of the dataset.

    +

    Optionally, a normalizer function (or lambda) can be given. This will +be called on every slice of data retrieved.
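
Examples

For example, a sketch that trains directly from an on-disk slice (the file and dataset names are hypothetical):

x_train <- hdf5_matrix("data.h5", "features", start = 0, end = 10000)
y_train <- hdf5_matrix("data.h5", "labels", start = 0, end = 10000)
# rows are read sequentially from disk, so avoid full shuffling
model %>% fit(x_train, y_train, batch_size = 32, shuffle = FALSE)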

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/image_data_generator.html b/website/reference/image_data_generator.html new file mode 100644 index 000000000..73c50f1a8 --- /dev/null +++ b/website/reference/image_data_generator.html @@ -0,0 +1,262 @@ + + + + + + + + +Generate minibatches of image data with real-time data augmentation. — image_data_generator • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Generate minibatches of image data with real-time data augmentation.

    + + +
    image_data_generator(featurewise_center = FALSE, samplewise_center = FALSE,
    +  featurewise_std_normalization = FALSE,
    +  samplewise_std_normalization = FALSE, zca_whitening = FALSE,
    +  zca_epsilon = 1e-06, rotation_range = 0, width_shift_range = 0,
    +  height_shift_range = 0, shear_range = 0, zoom_range = 0,
    +  channel_shift_range = 0, fill_mode = "nearest", cval = 0,
    +  horizontal_flip = FALSE, vertical_flip = FALSE, rescale = NULL,
    +  preprocessing_function = NULL, data_format = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    featurewise_center

    set input mean to 0 over the dataset.

    samplewise_center

    set each sample mean to 0.

    featurewise_std_normalization

    divide inputs by std of the dataset.

    samplewise_std_normalization

    divide each input by its std.

    zca_whitening

    apply ZCA whitening.

    zca_epsilon

    Epsilon for ZCA whitening. Default is 1e-6.

    rotation_range

    degrees (0 to 180).

    width_shift_range

    fraction of total width.

    height_shift_range

    fraction of total height.

    shear_range

    shear intensity (shear angle in radians).

    zoom_range

    amount of zoom. if scalar z, zoom will be randomly picked +in the range [1-z, 1+z]. A sequence of two can be passed instead to select +this range.

    channel_shift_range

shift range for each channel.

    fill_mode

    points outside the boundaries are filled according to the +given mode ('constant', 'nearest', 'reflect' or 'wrap'). Default is +'nearest'.

    cval

    value used for points outside the boundaries when fill_mode is +'constant'. Default is 0.

    horizontal_flip

    whether to randomly flip images horizontally.

    vertical_flip

    whether to randomly flip images vertically.

    rescale

    rescaling factor. If NULL or 0, no rescaling is applied, +otherwise we multiply the data by the value provided (before applying any +other transformation).

    preprocessing_function

function that will be applied on each input. The function will run before any other modification on it. The function should take one argument: one image (tensor with rank 3), and should output a tensor with the same shape.

    data_format

    'channels_first' or 'channels_last'. In 'channels_first' +mode, the channels dimension (the depth) is at index 1, in 'channels_last' +mode it is at index 3. It defaults to the image_data_format value found +in your Keras config file at ~/.keras/keras.json. If you never set it, +then it will be "channels_last".

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/image_load.html b/website/reference/image_load.html new file mode 100644 index 000000000..f9e792aab --- /dev/null +++ b/website/reference/image_load.html @@ -0,0 +1,192 @@ + + + + + + + + +Loads an image into PIL format. — image_load • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Loads an image into PIL format.

    + + +
    image_load(path, grayscale = FALSE, target_size = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + +
    path

    Path to image file

    grayscale

    Boolean, whether to load the image as grayscale.

    target_size

Either NULL (defaults to original size) or an integer vector c(img_height, img_width).

    + +

    Value

    + +

    A PIL Image instance.

    + +

    See also

    + +

    Other image preprocessing: fit_image_data_generator, + flow_images_from_data, + flow_images_from_directory, + image_to_array

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/image_to_array.html b/website/reference/image_to_array.html new file mode 100644 index 000000000..657d3a661 --- /dev/null +++ b/website/reference/image_to_array.html @@ -0,0 +1,188 @@ + + + + + + + + +Converts a PIL Image instance to a 3d-array. — image_to_array • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

Converts a PIL Image instance to a 3D array.

    + + +
    image_to_array(img, data_format = c("channels_last", "channels_first"))
    + +

    Arguments

    + + + + + + + + + + +
    img

    PIL Image instance.

    data_format

    Image data format ("channels_last" or "channels_first")

    + +

    Value

    + +

    A 3D array.

    + +

    See also

    + +

    Other image preprocessing: fit_image_data_generator, + flow_images_from_data, + flow_images_from_directory, + image_load

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/imagenet_decode_predictions.html b/website/reference/imagenet_decode_predictions.html new file mode 100644 index 000000000..bb2e74969 --- /dev/null +++ b/website/reference/imagenet_decode_predictions.html @@ -0,0 +1,180 @@ + + + + + + + + +Decodes the prediction of an ImageNet model. — imagenet_decode_predictions • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Decodes the prediction of an ImageNet model.

    + + +
    imagenet_decode_predictions(preds, top = 5)
    + +

    Arguments

    + + + + + + + + + + +
    preds

    Tensor encoding a batch of predictions.

    top

    integer, how many top-guesses to return.

    + +

    Value

    + +

    List of data frames with variables class_name, class_description, +and score (one data frame per sample in batch input).

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/imagenet_preprocess_input.html b/website/reference/imagenet_preprocess_input.html new file mode 100644 index 000000000..864c2c114 --- /dev/null +++ b/website/reference/imagenet_preprocess_input.html @@ -0,0 +1,175 @@ + + + + + + + + +Preprocesses a tensor encoding a batch of images. — imagenet_preprocess_input • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Preprocesses a tensor encoding a batch of images.

    + + +
    imagenet_preprocess_input(x)
    + +

    Arguments

    + + + + + + +
    x

    input tensor, 4D

    + +

    Value

    + +

    Preprocessed tensor
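
Examples

Taken together with image_load() and image_to_array(), a sketch of an end-to-end ImageNet prediction (the image path is hypothetical; the pretrained weights are downloaded on first use):

model <- application_resnet50(weights = "imagenet")
img <- image_load("elephant.jpg", target_size = c(224, 224))
x <- image_to_array(img)
dim(x) <- c(1, dim(x))  # add a batch dimension
x <- imagenet_preprocess_input(x)
preds <- model %>% predict(x)
imagenet_decode_predictions(preds, top = 3)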

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/implementation.html b/website/reference/implementation.html new file mode 100644 index 000000000..86b49537e --- /dev/null +++ b/website/reference/implementation.html @@ -0,0 +1,178 @@ + + + + + + + + +Keras implementation — implementation • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Obtain a reference to the Python module used for the implementation of Keras.

    + + +
    implementation()
    + +

    Value

    + +

    Reference to the Python module used for the implementation of Keras.

    + +

    Details

    + +

    There are currently two Python modules which implement Keras:

      +
    • keras ("keras")

    • +
    • tensorflow.contrib.keras ("tensorflow")

    • +
    +

    This function returns a reference to the implementation being currently +used by the keras package. The default implementation is "keras". +You can override this by setting the KERAS_IMPLEMENTATION environment +variable to "tensorflow".
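
Examples

For example (the environment variable must be set before the package initializes Keras):

Sys.setenv(KERAS_IMPLEMENTATION = "tensorflow")
library(keras)
implementation()  # reference to the tensorflow.contrib.keras module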

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/initializer_constant.html b/website/reference/initializer_constant.html new file mode 100644 index 000000000..ddec522a4 --- /dev/null +++ b/website/reference/initializer_constant.html @@ -0,0 +1,188 @@ + + + + + + + + +Initializer that generates tensors initialized to a constant value. — initializer_constant • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Initializer that generates tensors initialized to a constant value.

    + + +
    initializer_constant(value = 0)
    + +

    Arguments

    + + + + + + +
    value

    float; the value of the generator tensors.

    + +

    See also

    + +

    Other initializers: initializer_glorot_normal, + initializer_glorot_uniform, + initializer_he_normal, + initializer_he_uniform, + initializer_identity, + initializer_lecun_normal, + initializer_lecun_uniform, + initializer_ones, + initializer_orthogonal, + initializer_random_normal, + initializer_random_uniform, + initializer_truncated_normal, + initializer_variance_scaling, + initializer_zeros

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/initializer_glorot_normal.html b/website/reference/initializer_glorot_normal.html new file mode 100644 index 000000000..1c40c4d91 --- /dev/null +++ b/website/reference/initializer_glorot_normal.html @@ -0,0 +1,198 @@ + + + + + + + + +Glorot normal initializer, also called Xavier normal initializer. — initializer_glorot_normal • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    It draws samples from a truncated normal distribution centered on 0 +with stddev = sqrt(2 / (fan_in + fan_out)) +where fan_in is the number of input units in the weight tensor +and fan_out is the number of output units in the weight tensor.

    + + +
    initializer_glorot_normal(seed = NULL)
    + +

    Arguments

    + + + + + + +
    seed

    Integer used to seed the random generator.

    + +

    References

    + + +

    Glorot & Bengio, AISTATS 2010 http://jmlr.org/proceedings/papers/v9/glorot10a/glorot10a.pdf

    + +

    See also

    + +

    Other initializers: initializer_constant, + initializer_glorot_uniform, + initializer_he_normal, + initializer_he_uniform, + initializer_identity, + initializer_lecun_normal, + initializer_lecun_uniform, + initializer_ones, + initializer_orthogonal, + initializer_random_normal, + initializer_random_uniform, + initializer_truncated_normal, + initializer_variance_scaling, + initializer_zeros

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/initializer_glorot_uniform.html b/website/reference/initializer_glorot_uniform.html new file mode 100644 index 000000000..f0f08a909 --- /dev/null +++ b/website/reference/initializer_glorot_uniform.html @@ -0,0 +1,198 @@ + + + + + + + + +Glorot uniform initializer, also called Xavier uniform initializer. — initializer_glorot_uniform • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

It draws samples from a uniform distribution within [-limit, limit] where limit is sqrt(6 / (fan_in + fan_out)), fan_in is the number of input units in the weight tensor, and fan_out is the number of output units in the weight tensor.

    + + +
    initializer_glorot_uniform(seed = NULL)
    + +

    Arguments

    + + + + + + +
    seed

    Integer used to seed the random generator.

    + +

    References

    + + +

    Glorot & Bengio, AISTATS 2010 http://jmlr.org/proceedings/papers/v9/glorot10a/glorot10a.pdf

    + +

    See also

    + +

    Other initializers: initializer_constant, + initializer_glorot_normal, + initializer_he_normal, + initializer_he_uniform, + initializer_identity, + initializer_lecun_normal, + initializer_lecun_uniform, + initializer_ones, + initializer_orthogonal, + initializer_random_normal, + initializer_random_uniform, + initializer_truncated_normal, + initializer_variance_scaling, + initializer_zeros

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/initializer_he_normal.html b/website/reference/initializer_he_normal.html new file mode 100644 index 000000000..9d13b0419 --- /dev/null +++ b/website/reference/initializer_he_normal.html @@ -0,0 +1,196 @@ + + + + + + + + +He normal initializer. — initializer_he_normal • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    It draws samples from a truncated normal distribution centered on 0 with +stddev = sqrt(2 / fan_in) where fan_in is the number of input units in +the weight tensor.

    + + +
    initializer_he_normal(seed = NULL)
    + +

    Arguments

    + + + + + + +
    seed

    Integer used to seed the random generator.

    + +

    References

    + +

    He et al., http://arxiv.org/abs/1502.01852

    + +

    See also

    + +

    Other initializers: initializer_constant, + initializer_glorot_normal, + initializer_glorot_uniform, + initializer_he_uniform, + initializer_identity, + initializer_lecun_normal, + initializer_lecun_uniform, + initializer_ones, + initializer_orthogonal, + initializer_random_normal, + initializer_random_uniform, + initializer_truncated_normal, + initializer_variance_scaling, + initializer_zeros

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/initializer_he_uniform.html b/website/reference/initializer_he_uniform.html new file mode 100644 index 000000000..72dac0d1c --- /dev/null +++ b/website/reference/initializer_he_uniform.html @@ -0,0 +1,196 @@ + + + + + + + + +He uniform variance scaling initializer. — initializer_he_uniform • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

It draws samples from a uniform distribution within [-limit, limit] where limit is sqrt(6 / fan_in) and fan_in is the number of input units in the weight tensor.

    + + +
    initializer_he_uniform(seed = NULL)
    + +

    Arguments

    + + + + + + +
    seed

    Integer used to seed the random generator.

    + +

    References

    + +

    He et al., http://arxiv.org/abs/1502.01852

    + +

    See also

    + +

    Other initializers: initializer_constant, + initializer_glorot_normal, + initializer_glorot_uniform, + initializer_he_normal, + initializer_identity, + initializer_lecun_normal, + initializer_lecun_uniform, + initializer_ones, + initializer_orthogonal, + initializer_random_normal, + initializer_random_uniform, + initializer_truncated_normal, + initializer_variance_scaling, + initializer_zeros

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/initializer_identity.html b/website/reference/initializer_identity.html new file mode 100644 index 000000000..ef64c3307 --- /dev/null +++ b/website/reference/initializer_identity.html @@ -0,0 +1,188 @@ + + + + + + + + +Initializer that generates the identity matrix. — initializer_identity • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Only use for square 2D matrices.

    + + +
    initializer_identity(gain = 1)
    + +

    Arguments

    + + + + + + +
    gain

    Multiplicative factor to apply to the identity matrix

    + +

    See also

    + +

    Other initializers: initializer_constant, + initializer_glorot_normal, + initializer_glorot_uniform, + initializer_he_normal, + initializer_he_uniform, + initializer_lecun_normal, + initializer_lecun_uniform, + initializer_ones, + initializer_orthogonal, + initializer_random_normal, + initializer_random_uniform, + initializer_truncated_normal, + initializer_variance_scaling, + initializer_zeros

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/initializer_lecun_normal.html b/website/reference/initializer_lecun_normal.html new file mode 100644 index 000000000..e87395801 --- /dev/null +++ b/website/reference/initializer_lecun_normal.html @@ -0,0 +1,200 @@ + + + + + + + + +LeCun normal initializer. — initializer_lecun_normal • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

It draws samples from a truncated normal distribution centered on 0 with stddev <- sqrt(1 / fan_in) where fan_in is the number of input units in the weight tensor.

    + + +
    initializer_lecun_normal(seed = NULL)
    + +

    Arguments

    + + + + + + +
    seed

Integer used to seed the random generator.

    + +

    References

Klambauer et al., Self-Normalizing Neural Networks https://arxiv.org/abs/1706.02515

LeCun 98, Efficient Backprop, http://yann.lecun.com/exdb/publis/pdf/lecun-98b.pdf

    See also

    + +

    Other initializers: initializer_constant, + initializer_glorot_normal, + initializer_glorot_uniform, + initializer_he_normal, + initializer_he_uniform, + initializer_identity, + initializer_lecun_uniform, + initializer_ones, + initializer_orthogonal, + initializer_random_normal, + initializer_random_uniform, + initializer_truncated_normal, + initializer_variance_scaling, + initializer_zeros

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/initializer_lecun_uniform.html b/website/reference/initializer_lecun_uniform.html new file mode 100644 index 000000000..8148c96d6 --- /dev/null +++ b/website/reference/initializer_lecun_uniform.html @@ -0,0 +1,197 @@ + + + + + + + + +LeCun uniform initializer. — initializer_lecun_uniform • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

It draws samples from a uniform distribution within [-limit, limit] where limit is sqrt(3 / fan_in) and fan_in is the number of input units in the weight tensor.

    + + +
    initializer_lecun_uniform(seed = NULL)
    + +

    Arguments

    + + + + + + +
    seed

    Integer used to seed the random generator.

    + +

    References

    + +

    LeCun 98, Efficient Backprop, +http://yann.lecun.com/exdb/publis/pdf/lecun-98b.pdf

    + +

    See also

    + +

    Other initializers: initializer_constant, + initializer_glorot_normal, + initializer_glorot_uniform, + initializer_he_normal, + initializer_he_uniform, + initializer_identity, + initializer_lecun_normal, + initializer_ones, + initializer_orthogonal, + initializer_random_normal, + initializer_random_uniform, + initializer_truncated_normal, + initializer_variance_scaling, + initializer_zeros

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/initializer_ones.html b/website/reference/initializer_ones.html new file mode 100644 index 000000000..a82733ac1 --- /dev/null +++ b/website/reference/initializer_ones.html @@ -0,0 +1,178 @@ + + + + + + + + +Initializer that generates tensors initialized to 1. — initializer_ones • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Initializer that generates tensors initialized to 1.

    + + +
    initializer_ones()
    + +

    See also

    + +

    Other initializers: initializer_constant, + initializer_glorot_normal, + initializer_glorot_uniform, + initializer_he_normal, + initializer_he_uniform, + initializer_identity, + initializer_lecun_normal, + initializer_lecun_uniform, + initializer_orthogonal, + initializer_random_normal, + initializer_random_uniform, + initializer_truncated_normal, + initializer_variance_scaling, + initializer_zeros

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/initializer_orthogonal.html b/website/reference/initializer_orthogonal.html new file mode 100644 index 000000000..61fba3baf --- /dev/null +++ b/website/reference/initializer_orthogonal.html @@ -0,0 +1,199 @@ + + + + + + + + +Initializer that generates a random orthogonal matrix. — initializer_orthogonal • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Initializer that generates a random orthogonal matrix.

    + + +
    initializer_orthogonal(gain = 1, seed = NULL)
    + +

    Arguments

    + + + + + + + + + + +
    gain

    Multiplicative factor to apply to the orthogonal matrix.

    seed

    Integer used to seed the random generator.

    + +

    References

    + + +

    Saxe et al., http://arxiv.org/abs/1312.6120

    + +

    See also

    + +

    Other initializers: initializer_constant, + initializer_glorot_normal, + initializer_glorot_uniform, + initializer_he_normal, + initializer_he_uniform, + initializer_identity, + initializer_lecun_normal, + initializer_lecun_uniform, + initializer_ones, + initializer_random_normal, + initializer_random_uniform, + initializer_truncated_normal, + initializer_variance_scaling, + initializer_zeros

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/initializer_random_normal.html b/website/reference/initializer_random_normal.html new file mode 100644 index 000000000..b06f4abc7 --- /dev/null +++ b/website/reference/initializer_random_normal.html @@ -0,0 +1,196 @@ + + + + + + + + +Initializer that generates tensors with a normal distribution. — initializer_random_normal • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Initializer that generates tensors with a normal distribution.

    + + +
    initializer_random_normal(mean = 0, stddev = 0.05, seed = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + +
    mean

    Mean of the random values to generate.

    stddev

    Standard deviation of the random values to generate.

    seed

    Integer used to seed the random generator.

    + +

    See also

    + +

    Other initializers: initializer_constant, + initializer_glorot_normal, + initializer_glorot_uniform, + initializer_he_normal, + initializer_he_uniform, + initializer_identity, + initializer_lecun_normal, + initializer_lecun_uniform, + initializer_ones, + initializer_orthogonal, + initializer_random_uniform, + initializer_truncated_normal, + initializer_variance_scaling, + initializer_zeros

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/initializer_random_uniform.html b/website/reference/initializer_random_uniform.html new file mode 100644 index 000000000..42780b5c5 --- /dev/null +++ b/website/reference/initializer_random_uniform.html @@ -0,0 +1,196 @@ + + + + + + + + +Initializer that generates tensors with a uniform distribution. — initializer_random_uniform • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Initializer that generates tensors with a uniform distribution.

    + + +
    initializer_random_uniform(minval = -0.05, maxval = 0.05, seed = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + +
    minval

    Lower bound of the range of random values to generate.

    maxval

Upper bound of the range of random values to generate.

    seed

Integer used to seed the random generator.

    + +

    See also

    + +

    Other initializers: initializer_constant, + initializer_glorot_normal, + initializer_glorot_uniform, + initializer_he_normal, + initializer_he_uniform, + initializer_identity, + initializer_lecun_normal, + initializer_lecun_uniform, + initializer_ones, + initializer_orthogonal, + initializer_random_normal, + initializer_truncated_normal, + initializer_variance_scaling, + initializer_zeros

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/initializer_truncated_normal.html b/website/reference/initializer_truncated_normal.html new file mode 100644 index 000000000..ba509ca4a --- /dev/null +++ b/website/reference/initializer_truncated_normal.html @@ -0,0 +1,199 @@ + + + + + + + + +Initializer that generates a truncated normal distribution. — initializer_truncated_normal • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    These values are similar to values from an initializer_random_normal() +except that values more than two standard deviations from the mean +are discarded and re-drawn. This is the recommended initializer for +neural network weights and filters.

    + + +
    initializer_truncated_normal(mean = 0, stddev = 0.05, seed = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + +
    mean

    Mean of the random values to generate.

    stddev

    Standard deviation of the random values to generate.

    seed

    Integer used to seed the random generator.

    + +

    See also

    + +

    Other initializers: initializer_constant, + initializer_glorot_normal, + initializer_glorot_uniform, + initializer_he_normal, + initializer_he_uniform, + initializer_identity, + initializer_lecun_normal, + initializer_lecun_uniform, + initializer_ones, + initializer_orthogonal, + initializer_random_normal, + initializer_random_uniform, + initializer_variance_scaling, + initializer_zeros

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/initializer_variance_scaling.html b/website/reference/initializer_variance_scaling.html new file mode 100644 index 000000000..5bc578e9f --- /dev/null +++ b/website/reference/initializer_variance_scaling.html @@ -0,0 +1,213 @@ + + + + + + + + +Initializer capable of adapting its scale to the shape of weights. — initializer_variance_scaling • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    With distribution="normal", samples are drawn from a truncated normal +distribution centered on zero, with stddev = sqrt(scale / n) where n is:

      +
    • number of input units in the weight tensor, if mode = "fan_in"

    • +
    • number of output units, if mode = "fan_out"

    • +
    • average of the numbers of input and output units, if mode = "fan_avg"

    • +
    + + +
    initializer_variance_scaling(scale = 1, mode = c("fan_in", "fan_out",
    +  "fan_avg"), distribution = c("normal", "uniform"), seed = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + +
    scale

    Scaling factor (positive float).

    mode

    One of "fan_in", "fan_out", "fan_avg".

    distribution

    One of "normal", "uniform"

    seed

    Integer used to seed the random generator.

    + +

    Details

    + +

    With distribution="uniform", samples are drawn from a uniform distribution +within -limit, limit, with limit = sqrt(3 * scale / n).

    + +

    See also

    + +

    Other initializers: initializer_constant, + initializer_glorot_normal, + initializer_glorot_uniform, + initializer_he_normal, + initializer_he_uniform, + initializer_identity, + initializer_lecun_normal, + initializer_lecun_uniform, + initializer_ones, + initializer_orthogonal, + initializer_random_normal, + initializer_random_uniform, + initializer_truncated_normal, + initializer_zeros

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/initializer_zeros.html b/website/reference/initializer_zeros.html new file mode 100644 index 000000000..559e23ced --- /dev/null +++ b/website/reference/initializer_zeros.html @@ -0,0 +1,178 @@ + + + + + + + + +Initializer that generates tensors initialized to 0. — initializer_zeros • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Initializer that generates tensors initialized to 0.

    + + +
    initializer_zeros()
    + +

    See also

    + +

    Other initializers: initializer_constant, + initializer_glorot_normal, + initializer_glorot_uniform, + initializer_he_normal, + initializer_he_uniform, + initializer_identity, + initializer_lecun_normal, + initializer_lecun_uniform, + initializer_ones, + initializer_orthogonal, + initializer_random_normal, + initializer_random_uniform, + initializer_truncated_normal, + initializer_variance_scaling

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/install_keras.html b/website/reference/install_keras.html new file mode 100644 index 000000000..6d73c66a7 --- /dev/null +++ b/website/reference/install_keras.html @@ -0,0 +1,269 @@ + + + + + + + + +Install Keras and the TensorFlow backend — install_keras • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Keras and TensorFlow will be installed into an "r-tensorflow" virtual or conda +environment. Note that "virtualenv" is not available on Windows (as this isn't +supported by TensorFlow).

    + + +
    install_keras(method = c("virtualenv", "conda"), conda = "auto",
    +  tensorflow = "default", extra_packages = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + +
    method

    Installation method ("virtualenv" or "conda")

    conda

    Path to conda executable (or "auto" to find conda using the PATH +and other conventional install locations).

    tensorflow

    TensorFlow version to install. Specify "default" to install +the CPU version of the latest release. Specify "gpu" to install the GPU +version of the latest release.

    +

    You can also provide a full major.minor.patch specification (e.g. "1.1.0"), +appending "-gpu" if you want the GPU version (e.g. "1.1.0-gpu").

    +

    Alternatively, you can provide the full URL to an installer binary (e.g. +for a nightly binary).

    extra_packages

    Additional PyPI packages to install along with +Keras and TensorFlow.

    + +

    GPU Installation

    + + +

Keras and TensorFlow can be configured to run on either CPUs or GPUs. The CPU version is much easier to install and configure, so it is the best starting place, especially when you are first learning how to use Keras. Here's the guidance on CPU vs. GPU versions from the TensorFlow website:

      +
• TensorFlow with CPU support only. If your system does not have an NVIDIA® GPU, you must install this version. Note that this version of TensorFlow is typically much easier to install, so even if you have an NVIDIA® GPU, we recommend installing this version first.

    • +
• TensorFlow with GPU support. TensorFlow programs typically run significantly faster on a GPU than on a CPU. Therefore, if your system has an NVIDIA® GPU meeting all prerequisites and you need to run performance-critical applications, you should ultimately install this version.

    • +
    +

    To install the GPU version:

      +
1. Ensure that you have met all installation prerequisites, including installation of the CUDA and cuDNN libraries, as described in TensorFlow GPU Prerequisites.

    2. +
    3. Pass tensorflow = "gpu" to install_keras(). For example:

        install_keras(tensorflow = "gpu")
      +
    4. +
    + +

    Windows Installation

    + + +

    The only supported installation method on Windows is "conda". This means that you +should install Anaconda 3.x for Windows prior to installing Keras.

    + +

    Custom Installation

    + + +

    Installing Keras and TensorFlow using install_keras() isn't required +to use the Keras R package. You can do a custom installation of Keras (and +desired backend) as described on the Keras website +and the Keras R package will find and use that version.

    +

See the documentation on custom installations for additional information on how versions of Keras and TensorFlow are located by the Keras package.

    + +

    Additional Packages

    + + +

    If you wish to add additional PyPI packages to your Keras / TensorFlow environment you +can either specify the packages in the extra_packages argument of install_keras(), +or alternatively install them into an existing environment using the +install_tensorflow_extras() function.

    + + +

    Examples

    +
    # NOT RUN {
    +# default installation
    +library(keras)
    +install_keras()
    +
    +# install using a conda environment (default is virtualenv)
    +install_keras(method = "conda")
    +
    +# install with GPU version of TensorFlow
    +# (NOTE: only do this if you have an NVIDIA GPU + CUDA!)
    +install_keras(tensorflow = "gpu")
    +
    +# install a specific version of TensorFlow
    +install_keras(tensorflow = "1.2.1")
    +install_keras(tensorflow = "1.2.1-gpu")
    +
    +# }
    +
    +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/is_keras_available.html b/website/reference/is_keras_available.html new file mode 100644 index 000000000..00e3966a7 --- /dev/null +++ b/website/reference/is_keras_available.html @@ -0,0 +1,195 @@ + + + + + + + + +Check if Keras is Available — is_keras_available • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

Probe to see whether the Keras Python package is available in the current system environment.

    + + +
    is_keras_available(version = NULL)
    + +

    Arguments

    + + + + + + +
    version

    Minimum required version of Keras (defaults to NULL, no +required version).

    + +

    Value

    + +

    Logical indicating whether Keras (or the specified minimum version of +Keras) is available.

    + + +

    Examples

    +
    # NOT RUN {
+# testthat utility for skipping tests when Keras isn't available
    +skip_if_no_keras <- function(version = NULL) {
    +  if (!is_keras_available(version))
    +    skip("Required keras version not available for testing")
    +}
    +
    +# use the function within a test
    +test_that("keras function works correctly", {
    +  skip_if_no_keras()
    +  # test code here
    +})
    +# }
    +
    +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/keras_model.html b/website/reference/keras_model.html new file mode 100644 index 000000000..6041aec89 --- /dev/null +++ b/website/reference/keras_model.html @@ -0,0 +1,213 @@ + + + + + + + + +Keras Model — keras_model • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    A model is a directed acyclic graph of layers.

    + + +
    keras_model(inputs, outputs = NULL)
    + +

    Arguments

    + + + + + + + + + + +
    inputs

    Input layer

    outputs

    Output layer

    + +

    See also

    + +

    Other model functions: compile, + evaluate_generator, evaluate, + fit_generator, fit, + get_config, get_layer, + keras_model_sequential, + pop_layer, + predict.keras.engine.training.Model, + predict_generator, + predict_on_batch, + predict_proba, + summary.keras.engine.training.Model, + train_on_batch

    + + +

    Examples

    +
    # NOT RUN {
    +library(keras)
    +
    +# input layer
    +inputs <- layer_input(shape = c(784))
    +
    +# outputs compose input + dense layers
    +predictions <- inputs %>%
    +  layer_dense(units = 64, activation = 'relu') %>%
    +  layer_dense(units = 64, activation = 'relu') %>%
    +  layer_dense(units = 10, activation = 'softmax')
    +
    +# create and compile model
    +model <- keras_model(inputs = inputs, outputs = predictions)
    +model %>% compile(
    +  optimizer = 'rmsprop',
    +  loss = 'categorical_crossentropy',
    +  metrics = c('accuracy')
    +)
    +# }
    +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/keras_model_sequential.html b/website/reference/keras_model_sequential.html new file mode 100644 index 000000000..b0c8738ea --- /dev/null +++ b/website/reference/keras_model_sequential.html @@ -0,0 +1,218 @@ + + + + + + + + +Keras Model composed of a linear stack of layers — keras_model_sequential • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Keras Model composed of a linear stack of layers

    + + +
    keras_model_sequential(layers = NULL, name = NULL)
    + +

    Arguments

    + + + + + + + + + + +
    layers

    List of layers to add to the model

    name

    Name of model

    + +

    Note

    + +

The first layer passed to a Sequential model should have a defined input shape. That means it should have received an input_shape or batch_input_shape argument, or, for some types of layers (recurrent, Dense...), an input_dim argument.

    + +

    See also

    + +

    Other model functions: compile, + evaluate_generator, evaluate, + fit_generator, fit, + get_config, get_layer, + keras_model, pop_layer, + predict.keras.engine.training.Model, + predict_generator, + predict_on_batch, + predict_proba, + summary.keras.engine.training.Model, + train_on_batch

    + + +

    Examples

    +
    # NOT RUN {
    +
    +library(keras)
    +
    +model <- keras_model_sequential()
    +model %>%
    +  layer_dense(units = 32, input_shape = c(784)) %>%
    +  layer_activation('relu') %>%
    +  layer_dense(units = 10) %>%
    +  layer_activation('softmax')
    +
    +model %>% compile(
    +  optimizer = 'rmsprop',
    +  loss = 'categorical_crossentropy',
    +  metrics = c('accuracy')
    +)
    +# }
    +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/layer_activation.html b/website/reference/layer_activation.html new file mode 100644 index 000000000..1701a07c7 --- /dev/null +++ b/website/reference/layer_activation.html @@ -0,0 +1,228 @@ + + + + + + + + +Apply an activation function to an output. — layer_activation • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Apply an activation function to an output.

    + + +
    layer_activation(object, activation, input_shape = NULL,
    +  batch_input_shape = NULL, batch_size = NULL, dtype = NULL,
    +  name = NULL, trainable = NULL, weights = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    object

    Model or layer object

    activation

    Name of activation function to use. If you don't specify +anything, no activation is applied (ie. "linear" activation: a(x) = x).

    input_shape

    Input shape (list of integers, does not include the +samples axis) which is required when using this layer as the first layer in +a model.

    batch_input_shape

    Shapes, including the batch size. For instance, +batch_input_shape=c(10, 32) indicates that the expected input will be +batches of 10 32-dimensional vectors. batch_input_shape=list(NULL, 32) +indicates batches of an arbitrary number of 32-dimensional vectors.

    batch_size

    Fixed batch size for layer

    dtype

    The data type expected by the input, as a string (float32, +float64, int32...)

    name

    An optional name string for the layer. Should be unique in a +model (do not reuse the same name twice). It will be autogenerated if it +isn't provided.

    trainable

    Whether the layer weights will be updated during training.

    weights

    Initial weights for layer.

    + +

    See also

    + +

    Other core layers: layer_activity_regularization, + layer_dense, layer_dropout, + layer_flatten, layer_input, + layer_lambda, layer_masking, + layer_permute, + layer_repeat_vector, + layer_reshape

    +

    Other activation layers: layer_activation_elu, + layer_activation_leaky_relu, + layer_activation_parametric_relu, + layer_activation_thresholded_relu

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/layer_activation_elu.html b/website/reference/layer_activation_elu.html new file mode 100644 index 000000000..9b604b260 --- /dev/null +++ b/website/reference/layer_activation_elu.html @@ -0,0 +1,222 @@ + + + + + + + + +Exponential Linear Unit. — layer_activation_elu • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

It follows: f(x) = alpha * (exp(x) - 1.0) for x < 0, f(x) = x for x >= 0.

    + + +
    layer_activation_elu(object, alpha = 1, input_shape = NULL,
    +  batch_input_shape = NULL, batch_size = NULL, dtype = NULL,
    +  name = NULL, trainable = NULL, weights = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    object

    Model or layer object

    alpha

    Scale for the negative factor.

    input_shape

    Input shape (list of integers, does not include the +samples axis) which is required when using this layer as the first layer in +a model.

    batch_input_shape

    Shapes, including the batch size. For instance, +batch_input_shape=c(10, 32) indicates that the expected input will be +batches of 10 32-dimensional vectors. batch_input_shape=list(NULL, 32) +indicates batches of an arbitrary number of 32-dimensional vectors.

    batch_size

    Fixed batch size for layer

    dtype

    The data type expected by the input, as a string (float32, +float64, int32...)

    name

    An optional name string for the layer. Should be unique in a +model (do not reuse the same name twice). It will be autogenerated if it +isn't provided.

    trainable

    Whether the layer weights will be updated during training.

    weights

    Initial weights for layer.

    + +

    See also

    + +

Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs).

    +

    Other activation layers: layer_activation_leaky_relu, + layer_activation_parametric_relu, + layer_activation_thresholded_relu, + layer_activation

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/layer_activation_leaky_relu.html b/website/reference/layer_activation_leaky_relu.html new file mode 100644 index 000000000..9e4ae3953 --- /dev/null +++ b/website/reference/layer_activation_leaky_relu.html @@ -0,0 +1,222 @@ + + + + + + + + +Leaky version of a Rectified Linear Unit. — layer_activation_leaky_relu • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Allows a small gradient when the unit is not active: f(x) = alpha * x for +x < 0, f(x) = x for x >= 0.

    + + +
    layer_activation_leaky_relu(object, alpha = 0.3, input_shape = NULL,
    +  batch_input_shape = NULL, batch_size = NULL, dtype = NULL,
    +  name = NULL, trainable = NULL, weights = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    object

    Model or layer object

    alpha

    float >= 0. Negative slope coefficient.

    input_shape

    Input shape (list of integers, does not include the +samples axis) which is required when using this layer as the first layer in +a model.

    batch_input_shape

    Shapes, including the batch size. For instance, +batch_input_shape=c(10, 32) indicates that the expected input will be +batches of 10 32-dimensional vectors. batch_input_shape=list(NULL, 32) +indicates batches of an arbitrary number of 32-dimensional vectors.

    batch_size

    Fixed batch size for layer

    dtype

    The data type expected by the input, as a string (float32, +float64, int32...)

    name

    An optional name string for the layer. Should be unique in a +model (do not reuse the same name twice). It will be autogenerated if it +isn't provided.

    trainable

    Whether the layer weights will be updated during training.

    weights

    Initial weights for layer.

    + +

    See also

    + +

Rectifier Nonlinearities Improve Neural Network Acoustic Models.

    +

    Other activation layers: layer_activation_elu, + layer_activation_parametric_relu, + layer_activation_thresholded_relu, + layer_activation

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/layer_activation_parametric_relu.html b/website/reference/layer_activation_parametric_relu.html new file mode 100644 index 000000000..5adc56d89 --- /dev/null +++ b/website/reference/layer_activation_parametric_relu.html @@ -0,0 +1,239 @@ + + + + + + + + +Parametric Rectified Linear Unit. — layer_activation_parametric_relu • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

It follows: f(x) = alpha * x for x < 0, f(x) = x for x >= 0, where alpha is a learned array with the same shape as x.

    + + +
    layer_activation_parametric_relu(object, alpha_initializer = "zeros",
    +  alpha_regularizer = NULL, alpha_constraint = NULL, shared_axes = NULL,
    +  input_shape = NULL, batch_input_shape = NULL, batch_size = NULL,
    +  dtype = NULL, name = NULL, trainable = NULL, weights = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    object

    Model or layer object

    alpha_initializer

    Initializer function for the weights.

    alpha_regularizer

    Regularizer for the weights.

    alpha_constraint

    Constraint for the weights.

    shared_axes

    The axes along which to share learnable parameters for the +activation function. For example, if the incoming feature maps are from a +2D convolution with output shape (batch, height, width, channels), and you +wish to share parameters across space so that each filter only has one set +of parameters, set shared_axes=c(1, 2).

    input_shape

    Input shape (list of integers, does not include the +samples axis) which is required when using this layer as the first layer in +a model.

    batch_input_shape

    Shapes, including the batch size. For instance, +batch_input_shape=c(10, 32) indicates that the expected input will be +batches of 10 32-dimensional vectors. batch_input_shape=list(NULL, 32) +indicates batches of an arbitrary number of 32-dimensional vectors.

    batch_size

    Fixed batch size for layer

    dtype

    The data type expected by the input, as a string (float32, +float64, int32...)

    name

    An optional name string for the layer. Should be unique in a +model (do not reuse the same name twice). It will be autogenerated if it +isn't provided.

    trainable

    Whether the layer weights will be updated during training.

    weights

    Initial weights for layer.

    + +

    See also

    + +

Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification.

    +

    Other activation layers: layer_activation_elu, + layer_activation_leaky_relu, + layer_activation_thresholded_relu, + layer_activation

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/layer_activation_thresholded_relu.html b/website/reference/layer_activation_thresholded_relu.html new file mode 100644 index 000000000..6de937d62 --- /dev/null +++ b/website/reference/layer_activation_thresholded_relu.html @@ -0,0 +1,221 @@ + + + + + + + + +Thresholded Rectified Linear Unit. — layer_activation_thresholded_relu • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    It follows: f(x) = x for x > theta, f(x) = 0 otherwise.

    + + +
    layer_activation_thresholded_relu(object, theta = 1, input_shape = NULL,
    +  batch_input_shape = NULL, batch_size = NULL, dtype = NULL,
    +  name = NULL, trainable = NULL, weights = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    object

    Model or layer object

    theta

    float >= 0. Threshold location of activation.

    input_shape

    Input shape (list of integers, does not include the +samples axis) which is required when using this layer as the first layer in +a model.

    batch_input_shape

    Shapes, including the batch size. For instance, +batch_input_shape=c(10, 32) indicates that the expected input will be +batches of 10 32-dimensional vectors. batch_input_shape=list(NULL, 32) +indicates batches of an arbitrary number of 32-dimensional vectors.

    batch_size

    Fixed batch size for layer

    dtype

    The data type expected by the input, as a string (float32, +float64, int32...)

    name

    An optional name string for the layer. Should be unique in a +model (do not reuse the same name twice). It will be autogenerated if it +isn't provided.

    trainable

    Whether the layer weights will be updated during training.

    weights

    Initial weights for layer.

    + +

    See also

    + +

    Zero-bias autoencoders and the benefits of co-adapting features.

    +

    Other activation layers: layer_activation_elu, + layer_activation_leaky_relu, + layer_activation_parametric_relu, + layer_activation

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/layer_activity_regularization.html b/website/reference/layer_activity_regularization.html new file mode 100644 index 000000000..eaa6606b2 --- /dev/null +++ b/website/reference/layer_activity_regularization.html @@ -0,0 +1,241 @@ + + + + + + + + +Layer that applies an update to the cost function based input activity. — layer_activity_regularization • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

Layer that applies an update to the cost function based on the input activity.

    + + +
    layer_activity_regularization(object, l1 = 0, l2 = 0, input_shape = NULL,
    +  batch_input_shape = NULL, batch_size = NULL, dtype = NULL,
    +  name = NULL, trainable = NULL, weights = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    object

    Model or layer object

    l1

    L1 regularization factor (positive float).

    l2

    L2 regularization factor (positive float).

    input_shape

    Dimensionality of the input (integer) not including the +samples axis. This argument is required when using this layer as the first +layer in a model.

    batch_input_shape

    Shapes, including the batch size. For instance, +batch_input_shape=c(10, 32) indicates that the expected input will be +batches of 10 32-dimensional vectors. batch_input_shape=list(NULL, 32) +indicates batches of an arbitrary number of 32-dimensional vectors.

    batch_size

    Fixed batch size for layer

    dtype

    The data type expected by the input, as a string (float32, +float64, int32...)

    name

    An optional name string for the layer. Should be unique in a +model (do not reuse the same name twice). It will be autogenerated if it +isn't provided.

    trainable

    Whether the layer weights will be updated during training.

    weights

    Initial weights for layer.

    + +

    Input shape

    + +

    Arbitrary. Use the keyword argument input_shape (list +of integers, does not include the samples axis) when using this layer as +the first layer in a model.

    + +

    Output shape

    + +

    Same shape as input.

    + +

    See also

    + +

    Other core layers: layer_activation, + layer_dense, layer_dropout, + layer_flatten, layer_input, + layer_lambda, layer_masking, + layer_permute, + layer_repeat_vector, + layer_reshape

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/layer_add.html b/website/reference/layer_add.html new file mode 100644 index 000000000..c0a702585 --- /dev/null +++ b/website/reference/layer_add.html @@ -0,0 +1,184 @@ + + + + + + + + +Layer that adds a list of inputs. — layer_add • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    It takes as input a list of tensors, all of the same shape, and returns a +single tensor (also of the same shape).

    + + +
    layer_add(inputs)
    + +

    Arguments

    + + + + + + +
    inputs

    A list of input tensors (at least 2).

    + +

    Value

    + +

    A tensor, the sum of the inputs.

    + +

    See also

    + +

    Other merge layers: layer_average, + layer_concatenate, layer_dot, + layer_maximum, layer_multiply

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/layer_alpha_dropout.html b/website/reference/layer_alpha_dropout.html new file mode 100644 index 000000000..3b15cdc84 --- /dev/null +++ b/website/reference/layer_alpha_dropout.html @@ -0,0 +1,221 @@ + + + + + + + + +Applies Alpha Dropout to the input. — layer_alpha_dropout • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Alpha Dropout is a dropout that keeps mean and variance of inputs to their +original values, in order to ensure the self-normalizing property even after +this dropout.

    + + +
    layer_alpha_dropout(object, rate, noise_shape = NULL, seed = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + +
    object

    Model or layer object

    rate

    float, drop probability (as with layer_dropout()). The +multiplicative noise will have standard deviation sqrt(rate / (1 - rate)).

    noise_shape

    Noise shape

    seed

    An integer to use as random seed.

    + +

    Details

    + +

Alpha Dropout fits well with Scaled Exponential Linear Units by randomly setting activations to the negative saturation value.

    + +

    Input shape

    + +

    Arbitrary. Use the keyword argument input_shape (list +of integers, does not include the samples axis) when using this layer as +the first layer in a model.

    + +

    Output shape

    + +

    Same shape as input.

    + +

    References

Klambauer et al., Self-Normalizing Neural Networks https://arxiv.org/abs/1706.02515

    See also

    + +

    Other noise layers: layer_gaussian_dropout, + layer_gaussian_noise

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/layer_average.html b/website/reference/layer_average.html new file mode 100644 index 000000000..29a12dd5c --- /dev/null +++ b/website/reference/layer_average.html @@ -0,0 +1,184 @@ + + + + + + + + +Layer that averages a list of inputs. — layer_average • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    It takes as input a list of tensors, all of the same shape, and returns a +single tensor (also of the same shape).

    + + +
    layer_average(inputs)
    + +

    Arguments

    + + + + + + +
    inputs

    A list of input tensors (at least 2).

    + +

    Value

    + +

    A tensor, the average of the inputs.

    + +

    See also

    + +

    Other merge layers: layer_add, + layer_concatenate, layer_dot, + layer_maximum, layer_multiply

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/layer_average_pooling_1d.html b/website/reference/layer_average_pooling_1d.html new file mode 100644 index 000000000..f0dbcc632 --- /dev/null +++ b/website/reference/layer_average_pooling_1d.html @@ -0,0 +1,230 @@ + + + + + + + + +Average pooling for temporal data. — layer_average_pooling_1d • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Average pooling for temporal data.

    + + +
    layer_average_pooling_1d(object, pool_size = 2L, strides = NULL,
    +  padding = "valid", batch_size = NULL, name = NULL, trainable = NULL,
    +  weights = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    object

    Model or layer object

    pool_size

Integer, size of the average pooling windows.

    strides

    Integer, or NULL. Factor by which to downscale. E.g. 2 will +halve the input. If NULL, it will default to pool_size.

    padding

    One of "valid" or "same" (case-insensitive).

    batch_size

    Fixed batch size for layer

    name

    An optional name string for the layer. Should be unique in a +model (do not reuse the same name twice). It will be autogenerated if it +isn't provided.

    trainable

    Whether the layer weights will be updated during training.

    weights

    Initial weights for layer.

    + +

    Input shape

    + +

    3D tensor with shape: (batch_size, steps, features).

    + +

    Output shape

    + +

    3D tensor with shape: (batch_size, downsampled_steps, features).

    + +

    See also

    + +

    Other pooling layers: layer_average_pooling_2d, + layer_average_pooling_3d, + layer_global_average_pooling_1d, + layer_global_average_pooling_2d, + layer_global_average_pooling_3d, + layer_global_max_pooling_1d, + layer_global_max_pooling_2d, + layer_global_max_pooling_3d, + layer_max_pooling_1d, + layer_max_pooling_2d, + layer_max_pooling_3d

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/layer_average_pooling_2d.html b/website/reference/layer_average_pooling_2d.html new file mode 100644 index 000000000..09572f05d --- /dev/null +++ b/website/reference/layer_average_pooling_2d.html @@ -0,0 +1,249 @@ + + + + + + + + +Average pooling operation for spatial data. — layer_average_pooling_2d • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Average pooling operation for spatial data.

    + + +
    layer_average_pooling_2d(object, pool_size = c(2L, 2L), strides = NULL,
    +  padding = "valid", data_format = NULL, batch_size = NULL, name = NULL,
    +  trainable = NULL, weights = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    object

    Model or layer object

    pool_size

integer or list of 2 integers, factors by which to downscale +(vertical, horizontal). (2, 2) will halve the input in both spatial +dimensions. If only one integer is specified, the same window length will be +used for both dimensions.

    strides

    Integer, list of 2 integers, or NULL. Strides values. If NULL, +it will default to pool_size.

    padding

    One of "valid" or "same" (case-insensitive).

    data_format

    A string, one of channels_last (default) or +channels_first. The ordering of the dimensions in the inputs. +channels_last corresponds to inputs with shape (batch, height, width, channels) while channels_first corresponds to inputs with shape (batch, channels, height, width). It defaults to the image_data_format value +found in your Keras config file at ~/.keras/keras.json. If you never set +it, then it will be "channels_last".

    batch_size

    Fixed batch size for layer

    name

    An optional name string for the layer. Should be unique in a +model (do not reuse the same name twice). It will be autogenerated if it +isn't provided.

    trainable

    Whether the layer weights will be updated during training.

    weights

    Initial weights for layer.

    + +

    Input shape

    + + +
      +
    • If data_format='channels_last': 4D tensor with shape: (batch_size, rows, cols, channels)

    • +
    • If data_format='channels_first': 4D tensor with shape: (batch_size, channels, rows, cols)

    • +
    + +

    Output shape

    + + +
      +
    • If data_format='channels_last': 4D tensor with shape: (batch_size, pooled_rows, pooled_cols, channels)

    • +
    • If data_format='channels_first': 4D tensor with shape: (batch_size, channels, pooled_rows, pooled_cols)

    • +
    + +

    See also

    + +

    Other pooling layers: layer_average_pooling_1d, + layer_average_pooling_3d, + layer_global_average_pooling_1d, + layer_global_average_pooling_2d, + layer_global_average_pooling_3d, + layer_global_max_pooling_1d, + layer_global_max_pooling_2d, + layer_global_max_pooling_3d, + layer_max_pooling_1d, + layer_max_pooling_2d, + layer_max_pooling_3d

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/layer_average_pooling_3d.html b/website/reference/layer_average_pooling_3d.html new file mode 100644 index 000000000..1250c63f8 --- /dev/null +++ b/website/reference/layer_average_pooling_3d.html @@ -0,0 +1,248 @@ + + + + + + + + +Average pooling operation for 3D data (spatial or spatio-temporal). — layer_average_pooling_3d • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Average pooling operation for 3D data (spatial or spatio-temporal).

    + + +
    layer_average_pooling_3d(object, pool_size = c(2L, 2L, 2L), strides = NULL,
    +  padding = "valid", data_format = NULL, batch_size = NULL, name = NULL,
    +  trainable = NULL, weights = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    object

    Model or layer object

    pool_size

    list of 3 integers, factors by which to downscale (dim1, +dim2, dim3). (2, 2, 2) will halve the size of the 3D input in each +dimension.

    strides

    list of 3 integers, or NULL. Strides values.

    padding

    One of "valid" or "same" (case-insensitive).

    data_format

    A string, one of channels_last (default) or +channels_first. The ordering of the dimensions in the inputs. +channels_last corresponds to inputs with shape (batch, spatial_dim1, spatial_dim2, spatial_dim3, channels) while channels_first corresponds +to inputs with shape (batch, channels, spatial_dim1, spatial_dim2, spatial_dim3). It defaults to the image_data_format value found in your +Keras config file at ~/.keras/keras.json. If you never set it, then it +will be "channels_last".

    batch_size

    Fixed batch size for layer

    name

    An optional name string for the layer. Should be unique in a +model (do not reuse the same name twice). It will be autogenerated if it +isn't provided.

    trainable

    Whether the layer weights will be updated during training.

    weights

    Initial weights for layer.

    + +

    Input shape

    + + +
      +
    • If data_format='channels_last': 5D tensor with shape: (batch_size, spatial_dim1, spatial_dim2, spatial_dim3, channels)

    • +
    • If data_format='channels_first': 5D tensor with shape: (batch_size, channels, spatial_dim1, spatial_dim2, spatial_dim3)

    • +
    + +

    Output shape

    + + +
      +
    • If data_format='channels_last': 5D tensor with shape: (batch_size, pooled_dim1, pooled_dim2, pooled_dim3, channels)

    • +
    • If data_format='channels_first': 5D tensor with shape: (batch_size, channels, pooled_dim1, pooled_dim2, pooled_dim3)

    • +
    + +

    See also

    + +

    Other pooling layers: layer_average_pooling_1d, + layer_average_pooling_2d, + layer_global_average_pooling_1d, + layer_global_average_pooling_2d, + layer_global_average_pooling_3d, + layer_global_max_pooling_1d, + layer_global_max_pooling_2d, + layer_global_max_pooling_3d, + layer_max_pooling_1d, + layer_max_pooling_2d, + layer_max_pooling_3d

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/layer_batch_normalization.html b/website/reference/layer_batch_normalization.html new file mode 100644 index 000000000..49df76aeb --- /dev/null +++ b/website/reference/layer_batch_normalization.html @@ -0,0 +1,294 @@ + + + + + + + + +Batch normalization layer (Ioffe and Szegedy, 2014). — layer_batch_normalization • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

Normalizes the activations of the previous layer at each batch, i.e. applies a transformation that maintains the mean activation close to 0 and the activation standard deviation close to 1.

    + + +
    layer_batch_normalization(object, axis = -1L, momentum = 0.99,
    +  epsilon = 0.001, center = TRUE, scale = TRUE,
    +  beta_initializer = "zeros", gamma_initializer = "ones",
    +  moving_mean_initializer = "zeros", moving_variance_initializer = "ones",
    +  beta_regularizer = NULL, gamma_regularizer = NULL,
    +  beta_constraint = NULL, gamma_constraint = NULL, input_shape = NULL,
    +  batch_input_shape = NULL, batch_size = NULL, dtype = NULL,
    +  name = NULL, trainable = NULL, weights = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    object

    Model or layer object

    axis

    Integer, the axis that should be normalized (typically the +features axis). For instance, after a Conv2D layer with +data_format="channels_first", set axis=1 in BatchNormalization.

    momentum

    Momentum for the moving average.

    epsilon

    Small float added to variance to avoid dividing by zero.

    center

    If TRUE, add offset of beta to normalized tensor. If FALSE, +beta is ignored.

    scale

    If TRUE, multiply by gamma. If FALSE, gamma is not used. +When the next layer is linear (also e.g. nn.relu), this can be disabled +since the scaling will be done by the next layer.

    beta_initializer

    Initializer for the beta weight.

    gamma_initializer

    Initializer for the gamma weight.

    moving_mean_initializer

    Initializer for the moving mean.

    moving_variance_initializer

    Initializer for the moving variance.

    beta_regularizer

    Optional regularizer for the beta weight.

    gamma_regularizer

    Optional regularizer for the gamma weight.

    beta_constraint

    Optional constraint for the beta weight.

    gamma_constraint

    Optional constraint for the gamma weight.

    input_shape

    Dimensionality of the input (integer) not including the +samples axis. This argument is required when using this layer as the first +layer in a model.

    batch_input_shape

    Shapes, including the batch size. For instance, +batch_input_shape=c(10, 32) indicates that the expected input will be +batches of 10 32-dimensional vectors. batch_input_shape=list(NULL, 32) +indicates batches of an arbitrary number of 32-dimensional vectors.

    batch_size

    Fixed batch size for layer

    dtype

    The data type expected by the input, as a string (float32, +float64, int32...)

    name

    An optional name string for the layer. Should be unique in a +model (do not reuse the same name twice). It will be autogenerated if it +isn't provided.

    trainable

    Whether the layer weights will be updated during training.

    weights

    Initial weights for layer.

    + +

    Input shape

    + +

    Arbitrary. Use the keyword argument input_shape (list +of integers, does not include the samples axis) when using this layer as +the first layer in a model.

    + +

    Output shape

    + +

    Same shape as input.
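For example, batch normalization is commonly placed between a convolution and its activation. A minimal sketch (layer sizes are illustrative; assumes the keras package is attached):

library(keras)

model <- keras_model_sequential()
model %>%
  layer_conv_2d(filters = 32, kernel_size = c(3, 3), input_shape = c(28, 28, 1)) %>%
  layer_batch_normalization() %>%   # normalize the conv activations per batch
  layer_activation("relu")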

    + +

    References

Ioffe and Szegedy (2015), Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. https://arxiv.org/abs/1502.03167
    + + + diff --git a/website/reference/layer_concatenate.html b/website/reference/layer_concatenate.html new file mode 100644 index 000000000..6ba660d0c --- /dev/null +++ b/website/reference/layer_concatenate.html @@ -0,0 +1,189 @@ + + + + + + + + +Layer that concatenates a list of inputs. — layer_concatenate • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

It takes as input a list of tensors, all of the same shape except for the concatenation axis, and returns a single tensor, the concatenation of all inputs.

    + + +
    layer_concatenate(inputs, axis = -1L)
    + +

    Arguments

    + + + + + + + + + + +
    inputs

    A list of input tensors (at least 2).

    axis

    Concatenation axis.

    + +

    Value

    + +

A tensor: the concatenation of the inputs along the given axis.
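For example, two branches of a functional-API model can be merged by concatenation. A minimal sketch (input sizes are illustrative):

library(keras)

input_a <- layer_input(shape = c(16))
input_b <- layer_input(shape = c(32))

# concatenate along the feature axis (the default, axis = -1)
merged <- layer_concatenate(list(input_a, input_b))
output <- merged %>% layer_dense(units = 1, activation = "sigmoid")

model <- keras_model(inputs = list(input_a, input_b), outputs = output)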

    + +

    See also

    + +

    Other merge layers: layer_add, + layer_average, layer_dot, + layer_maximum, layer_multiply

    + + + diff --git a/website/reference/layer_conv_1d.html b/website/reference/layer_conv_1d.html new file mode 100644 index 000000000..8579baaff --- /dev/null +++ b/website/reference/layer_conv_1d.html @@ -0,0 +1,323 @@ + + + + + + + + +1D convolution layer (e.g. temporal convolution). — layer_conv_1d • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

This layer creates a convolution kernel that is convolved with the layer input over a single spatial (or temporal) dimension to produce a tensor of outputs. If use_bias is TRUE, a bias vector is created and added to the outputs. Finally, if activation is not NULL, it is applied to the outputs as well. When using this layer as the first layer in a model, provide an input_shape argument (list of integers or NULL, e.g. (10, 128) for sequences of 10 vectors of 128 dimensions each, or (NULL, 128) for variable-length sequences of 128-dimensional vectors).

    + + +
    layer_conv_1d(object, filters, kernel_size, strides = 1L, padding = "valid",
    +  dilation_rate = 1L, activation = NULL, use_bias = TRUE,
    +  kernel_initializer = "glorot_uniform", bias_initializer = "zeros",
    +  kernel_regularizer = NULL, bias_regularizer = NULL,
    +  activity_regularizer = NULL, kernel_constraint = NULL,
    +  bias_constraint = NULL, input_shape = NULL, batch_input_shape = NULL,
    +  batch_size = NULL, dtype = NULL, name = NULL, trainable = NULL,
    +  weights = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    object

    Model or layer object

    filters

Integer, the dimensionality of the output space (i.e. the number of output filters in the convolution).

    kernel_size

    An integer or list of a single integer, specifying the +length of the 1D convolution window.

    strides

    An integer or list of a single integer, specifying the stride +length of the convolution. Specifying any stride value != 1 is incompatible +with specifying any dilation_rate value != 1.

    padding

    One of "valid", "causal" or "same" (case-insensitive). +"valid" means "no padding". +"same" results in padding the input such that the output has the same +length as the original input. +"causal" results in causal (dilated) convolutions, e.g. output[t] does +not depend on input[t+1:]. Useful when modeling temporal data where the +model should not violate the temporal order. See WaveNet: A GenerativeModel for Raw Audio, section 2.1.

    dilation_rate

    an integer or list of a single integer, specifying the +dilation rate to use for dilated convolution. Currently, specifying any +dilation_rate value != 1 is incompatible with specifying any strides +value != 1.

    activation

Activation function to use. If you don't specify anything, no activation is applied (i.e. "linear" activation: a(x) = x).

    use_bias

    Boolean, whether the layer uses a bias vector.

    kernel_initializer

    Initializer for the kernel weights matrix.

    bias_initializer

    Initializer for the bias vector.

    kernel_regularizer

    Regularizer function applied to the kernel +weights matrix.

    bias_regularizer

    Regularizer function applied to the bias vector.

    activity_regularizer

Regularizer function applied to the output of the layer (its "activation").

    kernel_constraint

    Constraint function applied to the kernel matrix.

    bias_constraint

    Constraint function applied to the bias vector.

    input_shape

    Dimensionality of the input (integer) not including the +samples axis. This argument is required when using this layer as the first +layer in a model.

    batch_input_shape

    Shapes, including the batch size. For instance, +batch_input_shape=c(10, 32) indicates that the expected input will be +batches of 10 32-dimensional vectors. batch_input_shape=list(NULL, 32) +indicates batches of an arbitrary number of 32-dimensional vectors.

    batch_size

    Fixed batch size for layer

    dtype

    The data type expected by the input, as a string (float32, +float64, int32...)

    name

    An optional name string for the layer. Should be unique in a +model (do not reuse the same name twice). It will be autogenerated if it +isn't provided.

    trainable

    Whether the layer weights will be updated during training.

    weights

    Initial weights for layer.

    + +

    Input shape

    + +

    3D tensor with shape: (batch_size, steps, input_dim)

    + +

    Output shape

    + +

3D tensor with shape: (batch_size, new_steps, filters). The steps value might have changed due to padding or strides.
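As a sketch, a temporal convolution over sequences of 100 steps with 64 features (sizes are illustrative):

library(keras)

model <- keras_model_sequential()
model %>%
  # 32 filters over a window of 5 time steps: (batch, 100, 64) -> (batch, 96, 32)
  layer_conv_1d(filters = 32, kernel_size = 5, activation = "relu",
                input_shape = c(100, 64)) %>%
  layer_global_max_pooling_1d() %>%
  layer_dense(units = 1)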

    + +

    See also

    + +

    Other convolutional layers: layer_conv_2d_transpose, + layer_conv_2d, + layer_conv_3d_transpose, + layer_conv_3d, + layer_conv_lstm_2d, + layer_cropping_1d, + layer_cropping_2d, + layer_cropping_3d, + layer_separable_conv_2d, + layer_upsampling_1d, + layer_upsampling_2d, + layer_upsampling_3d, + layer_zero_padding_1d, + layer_zero_padding_2d, + layer_zero_padding_3d

    + + + diff --git a/website/reference/layer_conv_2d.html b/website/reference/layer_conv_2d.html new file mode 100644 index 000000000..ba793b399 --- /dev/null +++ b/website/reference/layer_conv_2d.html @@ -0,0 +1,331 @@ + + + + + + + + +2D convolution layer (e.g. spatial convolution over images). — layer_conv_2d • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    This layer creates a convolution kernel that is convolved with the layer +input to produce a tensor of outputs. If use_bias is TRUE, a bias vector is +created and added to the outputs. Finally, if activation is not NULL, it +is applied to the outputs as well. When using this layer as the first layer +in a model, provide the keyword argument input_shape (list of integers, +does not include the sample axis), e.g. input_shape=c(128, 128, 3) for +128x128 RGB pictures in data_format="channels_last".

    + + +
    layer_conv_2d(object, filters, kernel_size, strides = c(1L, 1L),
    +  padding = "valid", data_format = NULL, dilation_rate = c(1L, 1L),
    +  activation = NULL, use_bias = TRUE,
    +  kernel_initializer = "glorot_uniform", bias_initializer = "zeros",
    +  kernel_regularizer = NULL, bias_regularizer = NULL,
    +  activity_regularizer = NULL, kernel_constraint = NULL,
    +  bias_constraint = NULL, input_shape = NULL, batch_input_shape = NULL,
    +  batch_size = NULL, dtype = NULL, name = NULL, trainable = NULL,
    +  weights = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    object

    Model or layer object

    filters

Integer, the dimensionality of the output space (i.e. the number of output filters in the convolution).

    kernel_size

    An integer or list of 2 integers, specifying the width and +height of the 2D convolution window. Can be a single integer to specify the +same value for all spatial dimensions.

    strides

    An integer or list of 2 integers, specifying the strides of +the convolution along the width and height. Can be a single integer to +specify the same value for all spatial dimensions. Specifying any stride +value != 1 is incompatible with specifying any dilation_rate value != 1.

    padding

    one of "valid" or "same" (case-insensitive).

    data_format

    A string, one of channels_last (default) or +channels_first. The ordering of the dimensions in the inputs. +channels_last corresponds to inputs with shape (batch, height, width, channels) while channels_first corresponds to inputs with shape (batch, channels, height, width). It defaults to the image_data_format value +found in your Keras config file at ~/.keras/keras.json. If you never set +it, then it will be "channels_last".

    dilation_rate

    an integer or list of 2 integers, specifying the +dilation rate to use for dilated convolution. Can be a single integer to +specify the same value for all spatial dimensions. Currently, specifying +any dilation_rate value != 1 is incompatible with specifying any stride +value != 1.

    activation

Activation function to use. If you don't specify anything, no activation is applied (i.e. "linear" activation: a(x) = x).

    use_bias

    Boolean, whether the layer uses a bias vector.

    kernel_initializer

    Initializer for the kernel weights matrix.

    bias_initializer

    Initializer for the bias vector.

    kernel_regularizer

    Regularizer function applied to the kernel +weights matrix.

    bias_regularizer

    Regularizer function applied to the bias vector.

    activity_regularizer

Regularizer function applied to the output of the layer (its "activation").

    kernel_constraint

    Constraint function applied to the kernel matrix.

    bias_constraint

    Constraint function applied to the bias vector.

    input_shape

    Dimensionality of the input (integer) not including the +samples axis. This argument is required when using this layer as the first +layer in a model.

    batch_input_shape

    Shapes, including the batch size. For instance, +batch_input_shape=c(10, 32) indicates that the expected input will be +batches of 10 32-dimensional vectors. batch_input_shape=list(NULL, 32) +indicates batches of an arbitrary number of 32-dimensional vectors.

    batch_size

    Fixed batch size for layer

    dtype

    The data type expected by the input, as a string (float32, +float64, int32...)

    name

    An optional name string for the layer. Should be unique in a +model (do not reuse the same name twice). It will be autogenerated if it +isn't provided.

    trainable

    Whether the layer weights will be updated during training.

    weights

    Initial weights for layer.

    + +

    Input shape

    + +

    4D tensor with shape: (samples, channels, rows, cols) +if data_format='channels_first' or 4D tensor with shape: (samples, rows, cols, channels) if data_format='channels_last'.

    + +

    Output shape

    + +

    4D tensor with shape: (samples, filters, new_rows, new_cols) if data_format='channels_first' or 4D tensor with shape: +(samples, new_rows, new_cols, filters) if data_format='channels_last'. +rows and cols values might have changed due to padding.
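A minimal sketch of a small image model built on this layer (28x28 grayscale inputs are assumed purely for illustration):

library(keras)

model <- keras_model_sequential()
model %>%
  # channels_last input: (batch, 28, 28, 1) -> (batch, 26, 26, 32)
  layer_conv_2d(filters = 32, kernel_size = c(3, 3), activation = "relu",
                input_shape = c(28, 28, 1)) %>%
  layer_max_pooling_2d(pool_size = c(2, 2)) %>%
  layer_flatten() %>%
  layer_dense(units = 10, activation = "softmax")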

    + +

    See also

    + +

    Other convolutional layers: layer_conv_1d, + layer_conv_2d_transpose, + layer_conv_3d_transpose, + layer_conv_3d, + layer_conv_lstm_2d, + layer_cropping_1d, + layer_cropping_2d, + layer_cropping_3d, + layer_separable_conv_2d, + layer_upsampling_1d, + layer_upsampling_2d, + layer_upsampling_3d, + layer_zero_padding_1d, + layer_zero_padding_2d, + layer_zero_padding_3d

    + + + diff --git a/website/reference/layer_conv_2d_transpose.html b/website/reference/layer_conv_2d_transpose.html new file mode 100644 index 000000000..fba69a5b0 --- /dev/null +++ b/website/reference/layer_conv_2d_transpose.html @@ -0,0 +1,332 @@ + + + + + + + + +Transposed 2D convolution layer (sometimes called Deconvolution). — layer_conv_2d_transpose • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    The need for transposed convolutions generally arises from the desire to use +a transformation going in the opposite direction of a normal convolution, +i.e., from something that has the shape of the output of some convolution to +something that has the shape of its input while maintaining a connectivity +pattern that is compatible with said convolution. When using this layer as +the first layer in a model, provide the keyword argument input_shape (list +of integers, does not include the sample axis), e.g. input_shape=c(128L, 128L, 3L) for 128x128 RGB pictures in data_format="channels_last".

    + + +
    layer_conv_2d_transpose(object, filters, kernel_size, strides = c(1L, 1L),
    +  padding = "valid", data_format = NULL, activation = NULL,
    +  use_bias = TRUE, kernel_initializer = "glorot_uniform",
    +  bias_initializer = "zeros", kernel_regularizer = NULL,
    +  bias_regularizer = NULL, activity_regularizer = NULL,
    +  kernel_constraint = NULL, bias_constraint = NULL, input_shape = NULL,
    +  batch_input_shape = NULL, batch_size = NULL, dtype = NULL,
    +  name = NULL, trainable = NULL, weights = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    object

    Model or layer object

    filters

    Integer, the dimensionality of the output space (i.e. the +number of output filters in the convolution).

    kernel_size

    An integer or list of 2 integers, specifying the width and +height of the 2D convolution window. Can be a single integer to specify the +same value for all spatial dimensions.

    strides

    An integer or list of 2 integers, specifying the strides of +the convolution along the width and height. Can be a single integer to +specify the same value for all spatial dimensions. Specifying any stride +value != 1 is incompatible with specifying any dilation_rate value != 1.

    padding

    one of "valid" or "same" (case-insensitive).

    data_format

    A string, one of channels_last (default) or +channels_first. The ordering of the dimensions in the inputs. +channels_last corresponds to inputs with shape (batch, height, width, channels) while channels_first corresponds to inputs with shape (batch, channels, height, width). It defaults to the image_data_format value +found in your Keras config file at ~/.keras/keras.json. If you never set +it, then it will be "channels_last".

    activation

Activation function to use. If you don't specify anything, no activation is applied (i.e. "linear" activation: a(x) = x).

    use_bias

    Boolean, whether the layer uses a bias vector.

    kernel_initializer

    Initializer for the kernel weights matrix.

    bias_initializer

    Initializer for the bias vector.

    kernel_regularizer

    Regularizer function applied to the kernel +weights matrix.

    bias_regularizer

    Regularizer function applied to the bias vector.

    activity_regularizer

Regularizer function applied to the output of the layer (its "activation").

    kernel_constraint

    Constraint function applied to the kernel matrix.

    bias_constraint

    Constraint function applied to the bias vector.

    input_shape

    Dimensionality of the input (integer) not including the +samples axis. This argument is required when using this layer as the first +layer in a model.

    batch_input_shape

    Shapes, including the batch size. For instance, +batch_input_shape=c(10, 32) indicates that the expected input will be +batches of 10 32-dimensional vectors. batch_input_shape=list(NULL, 32) +indicates batches of an arbitrary number of 32-dimensional vectors.

    batch_size

    Fixed batch size for layer

    dtype

    The data type expected by the input, as a string (float32, +float64, int32...)

    name

    An optional name string for the layer. Should be unique in a +model (do not reuse the same name twice). It will be autogenerated if it +isn't provided.

    trainable

    Whether the layer weights will be updated during training.

    weights

    Initial weights for layer.

    + +

    Input shape

    + +

    4D tensor with shape: (batch, channels, rows, cols) +if data_format='channels_first' or 4D tensor with shape: (batch, rows, cols, channels) if data_format='channels_last'.

    + +

    Output shape

    + +

    4D tensor with shape: (batch, filters, new_rows, new_cols) if data_format='channels_first' or 4D tensor with shape: +(batch, new_rows, new_cols, filters) if data_format='channels_last'. +rows and cols values might have changed due to padding.
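For example, with strides = c(2, 2) and padding = "same" the layer roughly doubles the spatial dimensions, which is why it is common in decoders and segmentation heads. A sketch with illustrative sizes:

library(keras)

model <- keras_model_sequential()
model %>%
  # (batch, 7, 7, 64) -> (batch, 14, 14, 32)
  layer_conv_2d_transpose(filters = 32, kernel_size = c(3, 3),
                          strides = c(2, 2), padding = "same",
                          activation = "relu", input_shape = c(7, 7, 64))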

    + +

    References

A guide to convolution arithmetic for deep learning: https://arxiv.org/abs/1603.07285

Deconvolutional Networks (Zeiler et al., 2010)

    See also

    + +

    Other convolutional layers: layer_conv_1d, + layer_conv_2d, + layer_conv_3d_transpose, + layer_conv_3d, + layer_conv_lstm_2d, + layer_cropping_1d, + layer_cropping_2d, + layer_cropping_3d, + layer_separable_conv_2d, + layer_upsampling_1d, + layer_upsampling_2d, + layer_upsampling_3d, + layer_zero_padding_1d, + layer_zero_padding_2d, + layer_zero_padding_3d

    + + + diff --git a/website/reference/layer_conv_3d.html b/website/reference/layer_conv_3d.html new file mode 100644 index 000000000..58e0e405d --- /dev/null +++ b/website/reference/layer_conv_3d.html @@ -0,0 +1,335 @@ + + + + + + + + +3D convolution layer (e.g. spatial convolution over volumes). — layer_conv_3d • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

This layer creates a convolution kernel that is convolved with the layer input to produce a tensor of outputs. If use_bias is TRUE, a bias vector is created and added to the outputs. Finally, if activation is not NULL, it is applied to the outputs as well. When using this layer as the first layer in a model, provide the keyword argument input_shape (list of integers, does not include the sample axis), e.g. input_shape=c(128L, 128L, 128L, 1L) for 128x128x128 volumes with a single channel, in data_format="channels_last".

    + + +
    layer_conv_3d(object, filters, kernel_size, strides = c(1L, 1L, 1L),
    +  padding = "valid", data_format = NULL, dilation_rate = c(1L, 1L, 1L),
    +  activation = NULL, use_bias = TRUE,
    +  kernel_initializer = "glorot_uniform", bias_initializer = "zeros",
    +  kernel_regularizer = NULL, bias_regularizer = NULL,
    +  activity_regularizer = NULL, kernel_constraint = NULL,
    +  bias_constraint = NULL, input_shape = NULL, batch_input_shape = NULL,
    +  batch_size = NULL, dtype = NULL, name = NULL, trainable = NULL,
    +  weights = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    object

    Model or layer object

    filters

Integer, the dimensionality of the output space (i.e. the number of output filters in the convolution).

    kernel_size

    An integer or list of 3 integers, specifying the depth, +height, and width of the 3D convolution window. Can be a single integer +to specify the same value for all spatial dimensions.

    strides

    An integer or list of 3 integers, specifying the strides of +the convolution along each spatial dimension. Can be a single integer to +specify the same value for all spatial dimensions. Specifying any stride +value != 1 is incompatible with specifying any dilation_rate value != 1.

    padding

    one of "valid" or "same" (case-insensitive).

    data_format

    A string, one of channels_last (default) or +channels_first. The ordering of the dimensions in the inputs. +channels_last corresponds to inputs with shape (batch, spatial_dim1, spatial_dim2, spatial_dim3, channels) while channels_first corresponds +to inputs with shape (batch, channels, spatial_dim1, spatial_dim2, spatial_dim3). It defaults to the image_data_format value found in your +Keras config file at ~/.keras/keras.json. If you never set it, then it +will be "channels_last".

    dilation_rate

    an integer or list of 3 integers, specifying the +dilation rate to use for dilated convolution. Can be a single integer to +specify the same value for all spatial dimensions. Currently, specifying +any dilation_rate value != 1 is incompatible with specifying any stride +value != 1.

    activation

Activation function to use. If you don't specify anything, no activation is applied (i.e. "linear" activation: a(x) = x).

    use_bias

    Boolean, whether the layer uses a bias vector.

    kernel_initializer

    Initializer for the kernel weights matrix.

    bias_initializer

    Initializer for the bias vector.

    kernel_regularizer

    Regularizer function applied to the kernel +weights matrix.

    bias_regularizer

    Regularizer function applied to the bias vector.

    activity_regularizer

Regularizer function applied to the output of the layer (its "activation").

    kernel_constraint

    Constraint function applied to the kernel matrix.

    bias_constraint

    Constraint function applied to the bias vector.

    input_shape

    Dimensionality of the input (integer) not including the +samples axis. This argument is required when using this layer as the first +layer in a model.

    batch_input_shape

    Shapes, including the batch size. For instance, +batch_input_shape=c(10, 32) indicates that the expected input will be +batches of 10 32-dimensional vectors. batch_input_shape=list(NULL, 32) +indicates batches of an arbitrary number of 32-dimensional vectors.

    batch_size

    Fixed batch size for layer

    dtype

    The data type expected by the input, as a string (float32, +float64, int32...)

    name

    An optional name string for the layer. Should be unique in a +model (do not reuse the same name twice). It will be autogenerated if it +isn't provided.

    trainable

    Whether the layer weights will be updated during training.

    weights

    Initial weights for layer.

    + +

    Input shape

    + +

    5D tensor with shape: (samples, channels, conv_dim1, conv_dim2, conv_dim3) if data_format='channels_first' or 5D tensor with +shape: (samples, conv_dim1, conv_dim2, conv_dim3, channels) if +data_format='channels_last'.

    + +

    Output shape

    + +

    5D tensor with shape: (samples, filters, new_conv_dim1, new_conv_dim2, new_conv_dim3) if +data_format='channels_first' or 5D tensor with shape: (samples, new_conv_dim1, new_conv_dim2, new_conv_dim3, filters) if +data_format='channels_last'. new_conv_dim1, new_conv_dim2 and +new_conv_dim3 values might have changed due to padding.
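A sketch for volumetric data, e.g. single-channel 32x32x32 volumes (sizes are illustrative):

library(keras)

model <- keras_model_sequential()
model %>%
  # (batch, 32, 32, 32, 1) -> (batch, 30, 30, 30, 16)
  layer_conv_3d(filters = 16, kernel_size = c(3, 3, 3), activation = "relu",
                input_shape = c(32, 32, 32, 1)) %>%
  layer_max_pooling_3d(pool_size = c(2, 2, 2))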

    + +

    See also

    + +

    Other convolutional layers: layer_conv_1d, + layer_conv_2d_transpose, + layer_conv_2d, + layer_conv_3d_transpose, + layer_conv_lstm_2d, + layer_cropping_1d, + layer_cropping_2d, + layer_cropping_3d, + layer_separable_conv_2d, + layer_upsampling_1d, + layer_upsampling_2d, + layer_upsampling_3d, + layer_zero_padding_1d, + layer_zero_padding_2d, + layer_zero_padding_3d

    + + + diff --git a/website/reference/layer_conv_3d_transpose.html b/website/reference/layer_conv_3d_transpose.html new file mode 100644 index 000000000..e79647d1a --- /dev/null +++ b/website/reference/layer_conv_3d_transpose.html @@ -0,0 +1,325 @@ + + + + + + + + +Transposed 3D convolution layer (sometimes called Deconvolution). — layer_conv_3d_transpose • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    The need for transposed convolutions generally arises from the desire to use +a transformation going in the opposite direction of a normal convolution, +i.e., from something that has the shape of the output of some convolution to +something that has the shape of its input while maintaining a connectivity +pattern that is compatible with said convolution.

    + + +
    layer_conv_3d_transpose(object, filters, kernel_size, strides = c(1, 1, 1),
    +  padding = "valid", data_format = NULL, activation = NULL,
    +  use_bias = TRUE, kernel_initializer = "glorot_uniform",
    +  bias_initializer = "zeros", kernel_regularizer = NULL,
    +  bias_regularizer = NULL, activity_regularizer = NULL,
    +  kernel_constraint = NULL, bias_constraint = NULL, input_shape = NULL,
    +  batch_input_shape = NULL, batch_size = NULL, dtype = NULL,
    +  name = NULL, trainable = NULL, weights = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    object

    Model or layer object

    filters

    Integer, the dimensionality of the output space (i.e. the +number of output filters in the convolution).

    kernel_size

    An integer or list of 3 integers, specifying the depth, +height, and width of the 3D convolution window. Can be a single integer +to specify the same value for all spatial dimensions.

    strides

An integer or list of 3 integers, specifying the strides of the convolution along the depth, height and width. Can be a single integer to specify the same value for all spatial dimensions. Specifying any stride value != 1 is incompatible with specifying any dilation_rate value != 1.

    padding

    one of "valid" or "same" (case-insensitive).

    data_format

    A string, one of channels_last (default) or +channels_first. The ordering of the dimensions in the inputs. +channels_last corresponds to inputs with shape (batch, depth, height, width, channels) while channels_first corresponds to inputs with shape +(batch, channels, depth, height, width). It defaults to the +image_data_format value found in your Keras config file at +~/.keras/keras.json. If you never set it, then it will be +"channels_last".

    activation

Activation function to use. If you don't specify anything, no activation is applied (i.e. "linear" activation: a(x) = x).

    use_bias

    Boolean, whether the layer uses a bias vector.

    kernel_initializer

    Initializer for the kernel weights matrix.

    bias_initializer

    Initializer for the bias vector.

    kernel_regularizer

Regularizer function applied to the kernel weights matrix.

    bias_regularizer

    Regularizer function applied to the bias vector.

    activity_regularizer

    Regularizer function applied to the output of the +layer (its "activation").

    kernel_constraint

    Constraint function applied to the kernel matrix.

    bias_constraint

    Constraint function applied to the bias vector.

    input_shape

    Dimensionality of the input (integer) not including the +samples axis. This argument is required when using this layer as the first +layer in a model.

    batch_input_shape

    Shapes, including the batch size. For instance, +batch_input_shape=c(10, 32) indicates that the expected input will be +batches of 10 32-dimensional vectors. batch_input_shape=list(NULL, 32) +indicates batches of an arbitrary number of 32-dimensional vectors.

    batch_size

    Fixed batch size for layer

    dtype

    The data type expected by the input, as a string (float32, +float64, int32...)

    name

    An optional name string for the layer. Should be unique in a +model (do not reuse the same name twice). It will be autogenerated if it +isn't provided.

    trainable

    Whether the layer weights will be updated during training.

    weights

    Initial weights for layer.

    + +

    Details

    + +

    When using this layer as the first layer in a model, provide the keyword argument +input_shape (list of integers, does not include the sample axis), e.g. +input_shape = list(128, 128, 128, 3) for a 128x128x128 volume with 3 channels if +data_format="channels_last".
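A minimal sketch mirroring the 2D case, upsampling an 8x8x8 feature volume (sizes are illustrative):

library(keras)

model <- keras_model_sequential()
model %>%
  # (batch, 8, 8, 8, 32) -> (batch, 16, 16, 16, 16)
  layer_conv_3d_transpose(filters = 16, kernel_size = c(3, 3, 3),
                          strides = c(2, 2, 2), padding = "same",
                          input_shape = c(8, 8, 8, 32))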

    + +

    References

A guide to convolution arithmetic for deep learning: https://arxiv.org/abs/1603.07285

Deconvolutional Networks (Zeiler et al., 2010)

    See also

    + +

    Other convolutional layers: layer_conv_1d, + layer_conv_2d_transpose, + layer_conv_2d, layer_conv_3d, + layer_conv_lstm_2d, + layer_cropping_1d, + layer_cropping_2d, + layer_cropping_3d, + layer_separable_conv_2d, + layer_upsampling_1d, + layer_upsampling_2d, + layer_upsampling_3d, + layer_zero_padding_1d, + layer_zero_padding_2d, + layer_zero_padding_3d

    + + + diff --git a/website/reference/layer_conv_lstm_2d.html b/website/reference/layer_conv_lstm_2d.html new file mode 100644 index 000000000..69e6aed8c --- /dev/null +++ b/website/reference/layer_conv_lstm_2d.html @@ -0,0 +1,377 @@ + + + + + + + + +Convolutional LSTM. — layer_conv_lstm_2d • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    It is similar to an LSTM layer, but the input transformations and recurrent +transformations are both convolutional.

    + + +
    layer_conv_lstm_2d(object, filters, kernel_size, strides = c(1L, 1L),
    +  padding = "valid", data_format = NULL, dilation_rate = c(1L, 1L),
    +  activation = "tanh", recurrent_activation = "hard_sigmoid",
    +  use_bias = TRUE, kernel_initializer = "glorot_uniform",
    +  recurrent_initializer = "orthogonal", bias_initializer = "zeros",
    +  unit_forget_bias = TRUE, kernel_regularizer = NULL,
    +  recurrent_regularizer = NULL, bias_regularizer = NULL,
    +  activity_regularizer = NULL, kernel_constraint = NULL,
    +  recurrent_constraint = NULL, bias_constraint = NULL,
    +  return_sequences = FALSE, go_backwards = FALSE, stateful = FALSE,
    +  dropout = 0, recurrent_dropout = 0, batch_size = NULL, name = NULL,
    +  trainable = NULL, weights = NULL, input_shape = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    object

    Model or layer object

    filters

Integer, the dimensionality of the output space (i.e. the number of output filters in the convolution).

    kernel_size

    An integer or list of n integers, specifying the +dimensions of the convolution window.

    strides

    An integer or list of n integers, specifying the strides of +the convolution. Specifying any stride value != 1 is incompatible with +specifying any dilation_rate value != 1.

    padding

    One of "valid" or "same" (case-insensitive).

    data_format

    A string, one of channels_last (default) or +channels_first. The ordering of the dimensions in the inputs. +channels_last corresponds to inputs with shape (batch, time, ..., channels) while channels_first corresponds to inputs with shape (batch, time, channels, ...). It defaults to the image_data_format value found +in your Keras config file at ~/.keras/keras.json. If you never set it, +then it will be "channels_last".

    dilation_rate

    An integer or list of n integers, specifying the +dilation rate to use for dilated convolution. Currently, specifying any +dilation_rate value != 1 is incompatible with specifying any strides +value != 1.

    activation

Activation function to use. If you don't specify anything, no activation is applied (i.e. "linear" activation: a(x) = x).

    recurrent_activation

    Activation function to use for the recurrent +step.

    use_bias

    Boolean, whether the layer uses a bias vector.

    kernel_initializer

Initializer for the kernel weights matrix, used for the linear transformation of the inputs.

    recurrent_initializer

Initializer for the recurrent_kernel weights matrix, used for the linear transformation of the recurrent state.

    bias_initializer

    Initializer for the bias vector.

    unit_forget_bias

Boolean. If TRUE, add 1 to the bias of the forget gate at initialization. Use in combination with bias_initializer="zeros". This is recommended in Jozefowicz et al.

    kernel_regularizer

    Regularizer function applied to the kernel +weights matrix.

    recurrent_regularizer

    Regularizer function applied to the +recurrent_kernel weights matrix.

    bias_regularizer

    Regularizer function applied to the bias vector.

    activity_regularizer

Regularizer function applied to the output of the layer (its "activation").

    kernel_constraint

    Constraint function applied to the kernel weights +matrix.

    recurrent_constraint

    Constraint function applied to the +recurrent_kernel weights matrix.

    bias_constraint

    Constraint function applied to the bias vector.

    return_sequences

    Boolean. Whether to return the last output in the +output sequence, or the full sequence.

    go_backwards

Boolean (default FALSE). If TRUE, process the input sequence backwards.

    stateful

    Boolean (default FALSE). If TRUE, the last state for each +sample at index i in a batch will be used as initial state for the sample +of index i in the following batch.

    dropout

    Float between 0 and 1. Fraction of the units to drop for the +linear transformation of the inputs.

    recurrent_dropout

    Float between 0 and 1. Fraction of the units to drop +for the linear transformation of the recurrent state.

    batch_size

    Fixed batch size for layer

    name

    An optional name string for the layer. Should be unique in a +model (do not reuse the same name twice). It will be autogenerated if it +isn't provided.

    trainable

    Whether the layer weights will be updated during training.

    weights

    Initial weights for layer.

    input_shape

    Dimensionality of the input (integer) not including the +samples axis. This argument is required when using this layer as the first +layer in a model.

    + +

    Input shape

• If data_format='channels_first': 5D tensor with shape: (samples, time, channels, rows, cols)

• If data_format='channels_last': 5D tensor with shape: (samples, time, rows, cols, channels)
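As a sketch, a ConvLSTM over variable-length sequences of 40x40 single-channel frames (channels_last; sizes are illustrative):

library(keras)

model <- keras_model_sequential()
model %>%
  # input: (batch, time, 40, 40, 1); returns the full sequence of feature maps
  layer_conv_lstm_2d(filters = 40, kernel_size = c(3, 3), padding = "same",
                     return_sequences = TRUE,
                     input_shape = list(NULL, 40, 40, 1)) %>%
  layer_batch_normalization()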

    References

Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting (Shi et al., 2015). https://arxiv.org/abs/1506.04214

    See also

    + +

    Other convolutional layers: layer_conv_1d, + layer_conv_2d_transpose, + layer_conv_2d, + layer_conv_3d_transpose, + layer_conv_3d, + layer_cropping_1d, + layer_cropping_2d, + layer_cropping_3d, + layer_separable_conv_2d, + layer_upsampling_1d, + layer_upsampling_2d, + layer_upsampling_3d, + layer_zero_padding_1d, + layer_zero_padding_2d, + layer_zero_padding_3d

    + + + diff --git a/website/reference/layer_cropping_1d.html b/website/reference/layer_cropping_1d.html new file mode 100644 index 000000000..61d2d2a50 --- /dev/null +++ b/website/reference/layer_cropping_1d.html @@ -0,0 +1,226 @@ + + + + + + + + +Cropping layer for 1D input (e.g. temporal sequence). — layer_cropping_1d • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    It crops along the time dimension (axis 1).

    + + +
    layer_cropping_1d(object, cropping = c(1L, 1L), batch_size = NULL,
    +  name = NULL, trainable = NULL, weights = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + +
    object

    Model or layer object

    cropping

int or list of int (length 2). How many units should be trimmed off at the beginning and end of the cropping dimension (axis 1). If a single int is provided, the same value will be used for both.

    batch_size

    Fixed batch size for layer

    name

    An optional name string for the layer. Should be unique in a +model (do not reuse the same name twice). It will be autogenerated if it +isn't provided.

    trainable

    Whether the layer weights will be updated during training.

    weights

    Initial weights for layer.

    + +

    Input shape

    + +

    3D tensor with shape (batch, axis_to_crop, features)

    + +

    Output shape

    + +

    3D tensor with shape (batch, cropped_axis, features)
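For instance, cropping = c(2L, 1L) trims 2 steps from the start and 1 from the end of the time axis. A minimal sketch using the functional API (shapes are illustrative):

library(keras)

input <- layer_input(shape = c(10, 8))
# (batch, 10, 8) -> (batch, 7, 8)
output <- input %>% layer_cropping_1d(cropping = c(2L, 1L))
model <- keras_model(input, output)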

    + +

    See also

    + +

    Other convolutional layers: layer_conv_1d, + layer_conv_2d_transpose, + layer_conv_2d, + layer_conv_3d_transpose, + layer_conv_3d, + layer_conv_lstm_2d, + layer_cropping_2d, + layer_cropping_3d, + layer_separable_conv_2d, + layer_upsampling_1d, + layer_upsampling_2d, + layer_upsampling_3d, + layer_zero_padding_1d, + layer_zero_padding_2d, + layer_zero_padding_3d

    + + + diff --git a/website/reference/layer_cropping_2d.html b/website/reference/layer_cropping_2d.html new file mode 100644 index 000000000..2c786688d --- /dev/null +++ b/website/reference/layer_cropping_2d.html @@ -0,0 +1,244 @@ + + + + + + + + +Cropping layer for 2D input (e.g. picture). — layer_cropping_2d • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    It crops along spatial dimensions, i.e. width and height.

    + + +
    layer_cropping_2d(object, cropping = list(c(0L, 0L), c(0L, 0L)),
    +  data_format = NULL, batch_size = NULL, name = NULL, trainable = NULL,
    +  weights = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    object

    Model or layer object

    cropping

    int, or list of 2 ints, or list of 2 lists of 2 ints.

• If int: the same symmetric cropping is applied to width and height.

• If list of 2 ints: interpreted as two different symmetric cropping values for height and width: (symmetric_height_crop, symmetric_width_crop).

• If list of 2 lists of 2 ints: interpreted as ((top_crop, bottom_crop), (left_crop, right_crop))
    data_format

    A string, one of channels_last (default) or +channels_first. The ordering of the dimensions in the inputs. +channels_last corresponds to inputs with shape (batch, height, width, channels) while channels_first corresponds to inputs with shape (batch, channels, height, width). It defaults to the image_data_format value +found in your Keras config file at ~/.keras/keras.json. If you never set +it, then it will be "channels_last".

    batch_size

    Fixed batch size for layer

    name

    An optional name string for the layer. Should be unique in a +model (do not reuse the same name twice). It will be autogenerated if it +isn't provided.

    trainable

    Whether the layer weights will be updated during training.

    weights

    Initial weights for layer.

    + +

    Input shape

    + +

    4D tensor with shape:

• If data_format is "channels_last": (batch, rows, cols, channels)

• If data_format is "channels_first": (batch, channels, rows, cols)

    Output shape

    + +

    4D tensor with shape:

• If data_format is "channels_last": (batch, cropped_rows, cropped_cols, channels)

• If data_format is "channels_first": (batch, channels, cropped_rows, cropped_cols)

    See also

    + +

    Other convolutional layers: layer_conv_1d, + layer_conv_2d_transpose, + layer_conv_2d, + layer_conv_3d_transpose, + layer_conv_3d, + layer_conv_lstm_2d, + layer_cropping_1d, + layer_cropping_3d, + layer_separable_conv_2d, + layer_upsampling_1d, + layer_upsampling_2d, + layer_upsampling_3d, + layer_zero_padding_1d, + layer_zero_padding_2d, + layer_zero_padding_3d

    + + + diff --git a/website/reference/layer_cropping_3d.html b/website/reference/layer_cropping_3d.html new file mode 100644 index 000000000..a30577de7 --- /dev/null +++ b/website/reference/layer_cropping_3d.html @@ -0,0 +1,247 @@ + + + + + + + + +Cropping layer for 3D data (e.g. spatial or spatio-temporal). — layer_cropping_3d • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Cropping layer for 3D data (e.g. spatial or spatio-temporal).

    + + +
    layer_cropping_3d(object, cropping = list(c(1L, 1L), c(1L, 1L), c(1L, 1L)),
    +  data_format = NULL, batch_size = NULL, name = NULL, trainable = NULL,
    +  weights = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    object

    Model or layer object

    cropping

    int, or list of 3 ints, or list of 3 lists of 2 ints.

• If int: the same symmetric cropping is applied to all three dimensions.

• If list of 3 ints: interpreted as three different symmetric cropping values for the three dimensions: (symmetric_dim1_crop, symmetric_dim2_crop, symmetric_dim3_crop).

• If list of 3 lists of 2 ints: interpreted as ((left_dim1_crop, right_dim1_crop), (left_dim2_crop, right_dim2_crop), (left_dim3_crop, right_dim3_crop))
    data_format

    A string, one of channels_last (default) or +channels_first. The ordering of the dimensions in the inputs. +channels_last corresponds to inputs with shape (batch, spatial_dim1, spatial_dim2, spatial_dim3, channels) while channels_first corresponds +to inputs with shape (batch, channels, spatial_dim1, spatial_dim2, spatial_dim3). It defaults to the image_data_format value found in your +Keras config file at ~/.keras/keras.json. If you never set it, then it +will be "channels_last".

    batch_size

    Fixed batch size for layer

    name

    An optional name string for the layer. Should be unique in a +model (do not reuse the same name twice). It will be autogenerated if it +isn't provided.

    trainable

    Whether the layer weights will be updated during training.

    weights

    Initial weights for layer.

    + +

    Input shape

    + +

    5D tensor with shape:

• If data_format is "channels_last": (batch, first_axis_to_crop, second_axis_to_crop, third_axis_to_crop, depth)

• If data_format is "channels_first": (batch, depth, first_axis_to_crop, second_axis_to_crop, third_axis_to_crop)

    Output shape

    + +

    5D tensor with shape:

• If data_format is "channels_last": (batch, first_cropped_axis, second_cropped_axis, third_cropped_axis, depth)

• If data_format is "channels_first": (batch, depth, first_cropped_axis, second_cropped_axis, third_cropped_axis)

    See also

    + +

    Other convolutional layers: layer_conv_1d, + layer_conv_2d_transpose, + layer_conv_2d, + layer_conv_3d_transpose, + layer_conv_3d, + layer_conv_lstm_2d, + layer_cropping_1d, + layer_cropping_2d, + layer_separable_conv_2d, + layer_upsampling_1d, + layer_upsampling_2d, + layer_upsampling_3d, + layer_zero_padding_1d, + layer_zero_padding_2d, + layer_zero_padding_3d

    + + + diff --git a/website/reference/layer_dense.html b/website/reference/layer_dense.html new file mode 100644 index 000000000..94df0df57 --- /dev/null +++ b/website/reference/layer_dense.html @@ -0,0 +1,283 @@ + + + + + + + + +Add a densely-connected NN layer to an output — layer_dense • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Implements the operation: output = activation(dot(input, kernel) + bias) +where activation is the element-wise activation function passed as the +activation argument, kernel is a weights matrix created by the layer, and +bias is a bias vector created by the layer (only applicable if use_bias +is TRUE). Note: if the input to the layer has a rank greater than 2, then +it is flattened prior to the initial dot product with kernel.

    + + +
    layer_dense(object, units, activation = NULL, use_bias = TRUE,
    +  kernel_initializer = "glorot_uniform", bias_initializer = "zeros",
    +  kernel_regularizer = NULL, bias_regularizer = NULL,
    +  activity_regularizer = NULL, kernel_constraint = NULL,
    +  bias_constraint = NULL, input_shape = NULL, batch_input_shape = NULL,
    +  batch_size = NULL, dtype = NULL, name = NULL, trainable = NULL,
    +  weights = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    object

    Model or layer object

    units

    Positive integer, dimensionality of the output space.

    activation

Name of activation function to use. If you don't specify anything, no activation is applied (i.e. "linear" activation: a(x) = x).

    use_bias

    Whether the layer uses a bias vector.

    kernel_initializer

    Initializer for the kernel weights matrix.

    bias_initializer

    Initializer for the bias vector.

    kernel_regularizer

    Regularizer function applied to the kernel +weights matrix.

    bias_regularizer

    Regularizer function applied to the bias vector.

    activity_regularizer

Regularizer function applied to the output of the layer (its "activation").

    kernel_constraint

    Constraint function applied to the kernel weights +matrix.

    bias_constraint

    Constraint function applied to the bias vector.

    input_shape

    Dimensionality of the input (integer) not including the +samples axis. This argument is required when using this layer as the first +layer in a model.

    batch_input_shape

    Shapes, including the batch size. For instance, +batch_input_shape=c(10, 32) indicates that the expected input will be +batches of 10 32-dimensional vectors. batch_input_shape=list(NULL, 32) +indicates batches of an arbitrary number of 32-dimensional vectors.

    batch_size

    Fixed batch size for layer

    dtype

    The data type expected by the input, as a string (float32, +float64, int32...)

    name

    An optional name string for the layer. Should be unique in a +model (do not reuse the same name twice). It will be autogenerated if it +isn't provided.

    trainable

    Whether the layer weights will be updated during training.

    weights

    Initial weights for layer.

    + +

    Input and Output Shapes

    + + +

    Input shape: nD tensor with shape: (batch_size, ..., input_dim). The most +common situation would be a 2D input with shape (batch_size, input_dim).

    +

Output shape: nD tensor with shape: (batch_size, ..., units). For instance, for a 2D input with shape (batch_size, input_dim), the output would have shape (batch_size, units).
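For example (a minimal sketch following the shapes above):

library(keras)

model <- keras_model_sequential()
model %>%
  # (batch_size, 100) -> (batch_size, 32)
  layer_dense(units = 32, activation = "relu", input_shape = c(100)) %>%
  # (batch_size, 32) -> (batch_size, 10)
  layer_dense(units = 10, activation = "softmax")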

    + +

    See also

    + +

    Other core layers: layer_activation, + layer_activity_regularization, + layer_dropout, layer_flatten, + layer_input, layer_lambda, + layer_masking, layer_permute, + layer_repeat_vector, + layer_reshape

    + + + diff --git a/website/reference/layer_dot.html b/website/reference/layer_dot.html new file mode 100644 index 000000000..80c3dff10 --- /dev/null +++ b/website/reference/layer_dot.html @@ -0,0 +1,192 @@ + + + + + + + + +Layer that computes a dot product between samples in two tensors. — layer_dot • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Layer that computes a dot product between samples in two tensors.

    + + +
    layer_dot(inputs, axes, normalize = FALSE)
    + +

    Arguments

    + + + + + + + + + + + + + + +
    inputs

    A list of input tensors (at least 2).

    axes

    Integer or list of integers, axis or axes along which to take the dot product.

    normalize

Whether to L2-normalize samples along the dot product axis before taking the dot product. If set to TRUE, then the output of the dot product is the cosine proximity between the two samples.

    + +

    Value

    + +

    A tensor, the dot product of the samples from the inputs.
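For example, with normalize = TRUE the layer returns the cosine similarity between two embeddings. A sketch (shapes are illustrative):

library(keras)

input_a <- layer_input(shape = c(64))
input_b <- layer_input(shape = c(64))

# cosine similarity between the two 64-dimensional samples
similarity <- layer_dot(list(input_a, input_b), axes = 1, normalize = TRUE)

model <- keras_model(inputs = list(input_a, input_b), outputs = similarity)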

    + +

    See also

    + +

    Other merge layers: layer_add, + layer_average, + layer_concatenate, + layer_maximum, layer_multiply

    + + + diff --git a/website/reference/layer_dropout.html b/website/reference/layer_dropout.html new file mode 100644 index 000000000..e8825a276 --- /dev/null +++ b/website/reference/layer_dropout.html @@ -0,0 +1,220 @@ + + + + + + + + +Applies Dropout to the input. — layer_dropout • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

Dropout consists of randomly setting a fraction rate of the input units to 0 at each update during training time, which helps prevent overfitting.

    + + +
    layer_dropout(object, rate, noise_shape = NULL, seed = NULL,
    +  batch_size = NULL, name = NULL, trainable = NULL, weights = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    object

    Model or layer object

    rate

    float between 0 and 1. Fraction of the input units to drop.

    noise_shape

    1D integer tensor representing the shape of the binary +dropout mask that will be multiplied with the input. For instance, if your +inputs have shape (batch_size, timesteps, features) and you want the +dropout mask to be the same for all timesteps, you can use +noise_shape=c(batch_size, 1, features).

    seed

    A Python integer to use as random seed.

    batch_size

    Fixed batch size for layer

    name

    An optional name string for the layer. Should be unique in a +model (do not reuse the same name twice). It will be autogenerated if it +isn't provided.

    trainable

    Whether the layer weights will be updated during training.

    weights

    Initial weights for layer.

    + +

    See also

    + +

    Other core layers: layer_activation, + layer_activity_regularization, + layer_dense, layer_flatten, + layer_input, layer_lambda, + layer_masking, layer_permute, + layer_repeat_vector, + layer_reshape

    +

    Other dropout layers: layer_spatial_dropout_1d, + layer_spatial_dropout_2d, + layer_spatial_dropout_3d
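For instance, to drop half of the units between two dense layers (a minimal sketch):

library(keras)

model <- keras_model_sequential()
model %>%
  layer_dense(units = 128, activation = "relu", input_shape = c(20)) %>%
  layer_dropout(rate = 0.5) %>%   # active only during training
  layer_dense(units = 1)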

    + + + diff --git a/website/reference/layer_embedding.html b/website/reference/layer_embedding.html new file mode 100644 index 000000000..a270a3e4f --- /dev/null +++ b/website/reference/layer_embedding.html @@ -0,0 +1,256 @@ + + + + + + + + +Turns positive integers (indexes) into dense vectors of fixed size. — layer_embedding • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

For example, list(4L, 20L) -> list(c(0.25, 0.1), c(0.6, -0.2)). This layer can only be used as the first layer in a model.

    + + +
    layer_embedding(object, input_dim, output_dim,
    +  embeddings_initializer = "uniform", embeddings_regularizer = NULL,
    +  activity_regularizer = NULL, embeddings_constraint = NULL,
    +  mask_zero = FALSE, input_length = NULL, batch_size = NULL,
    +  name = NULL, trainable = NULL, weights = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    object

    Model or layer object

    input_dim

    int > 0. Size of the vocabulary, i.e. maximum integer +index + 1.

    output_dim

    int >= 0. Dimension of the dense embedding.

    embeddings_initializer

    Initializer for the embeddings matrix.

    embeddings_regularizer

    Regularizer function applied to the +embeddings matrix.

    activity_regularizer

Regularizer function applied to the output of the layer (its "activation").

    embeddings_constraint

    Constraint function applied to the embeddings +matrix.

    mask_zero

    Whether or not the input value 0 is a special "padding" +value that should be masked out. This is useful when using recurrent +layers, which may take variable length inputs. If this is TRUE then all +subsequent layers in the model need to support masking or an exception will +be raised. If mask_zero is set to TRUE, as a consequence, index 0 cannot be +used in the vocabulary (input_dim should equal size of vocabulary + 1).

    input_length

    Length of input sequences, when it is constant. This +argument is required if you are going to connect Flatten then Dense +layers upstream (without it, the shape of the dense outputs cannot be +computed).

    batch_size

    Fixed batch size for layer

    name

    An optional name string for the layer. Should be unique in a +model (do not reuse the same name twice). It will be autogenerated if it +isn't provided.

    trainable

    Whether the layer weights will be updated during training.

    weights

    Initial weights for layer.

    + +

    Input shape

    + +

    2D tensor with shape: (batch_size, sequence_length).

    + +

    Output shape

    + +

    3D tensor with shape: (batch_size, sequence_length, output_dim).
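A sketch matching the shapes above, embedding a vocabulary of 1000 tokens into 64 dimensions (sizes are illustrative):

library(keras)

model <- keras_model_sequential()
model %>%
  # (batch_size, 10) integer indices -> (batch_size, 10, 64)
  layer_embedding(input_dim = 1000, output_dim = 64, input_length = 10) %>%
  layer_flatten() %>%   # possible because input_length fixes the sequence length
  layer_dense(units = 1, activation = "sigmoid")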

    + +

    References

A Theoretically Grounded Application of Dropout in Recurrent Neural Networks (Gal and Ghahramani, 2016). https://arxiv.org/abs/1512.05287
    + + + diff --git a/website/reference/layer_flatten.html b/website/reference/layer_flatten.html new file mode 100644 index 000000000..787fe512f --- /dev/null +++ b/website/reference/layer_flatten.html @@ -0,0 +1,200 @@ + + + + + + + + +Flattens an input — layer_flatten • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

Flattens a given input; does not affect the batch size.

layer_flatten(object, batch_size = NULL, name = NULL, trainable = NULL,
  weights = NULL)

    Arguments

object
Model or layer object

batch_size
Fixed batch size for layer

name
An optional name string for the layer. Should be unique in a model (do not reuse the same name twice). It will be autogenerated if it isn't provided.

trainable
Whether the layer weights will be updated during training.

weights
Initial weights for layer.

See also

Other core layers: layer_activation, layer_activity_regularization, layer_dense, layer_dropout, layer_input, layer_lambda, layer_masking, layer_permute, layer_repeat_vector, layer_reshape
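A minimal sketch (the image size and filter count are illustrative assumptions):

library(keras)

model <- keras_model_sequential()
model %>%
  layer_conv_2d(filters = 32, kernel_size = c(3, 3), activation = "relu",
                input_shape = c(28, 28, 1)) %>%
  layer_flatten() %>%   # (batch_size, 26, 26, 32) -> (batch_size, 21632)
  layer_dense(units = 10, activation = "softmax")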

diff --git a/website/reference/layer_gaussian_dropout.html b/website/reference/layer_gaussian_dropout.html
new file mode 100644
index 000000000..3fb3b2c42
--- /dev/null
+++ b/website/reference/layer_gaussian_dropout.html
@@ -0,0 +1,242 @@

Apply multiplicative 1-centered Gaussian noise. — layer_gaussian_dropout • keras

    As it is a regularization layer, it is only active at training time.

layer_gaussian_dropout(object, rate, input_shape = NULL,
  batch_input_shape = NULL, batch_size = NULL, dtype = NULL,
  name = NULL, trainable = NULL, weights = NULL)

    Arguments

object
Model or layer object

rate
float, drop probability (as with Dropout). The multiplicative noise will have standard deviation sqrt(rate / (1 - rate)).

input_shape
Dimensionality of the input (integer) not including the samples axis. This argument is required when using this layer as the first layer in a model.

batch_input_shape
Shapes, including the batch size. For instance, batch_input_shape = c(10, 32) indicates that the expected input will be batches of 10 32-dimensional vectors. batch_input_shape = list(NULL, 32) indicates batches of an arbitrary number of 32-dimensional vectors.

batch_size
Fixed batch size for layer

dtype
The data type expected by the input, as a string (float32, float64, int32...)

name
An optional name string for the layer. Should be unique in a model (do not reuse the same name twice). It will be autogenerated if it isn't provided.

trainable
Whether the layer weights will be updated during training.

weights
Initial weights for layer.

Input shape

Arbitrary. Use the keyword argument input_shape (list of integers, does not include the samples axis) when using this layer as the first layer in a model.

Output shape

Same shape as input.
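A minimal sketch (sizes are illustrative): with rate = 0.3 the multiplicative noise has standard deviation sqrt(0.3 / 0.7), roughly 0.65:

library(keras)

model <- keras_model_sequential()
model %>%
  layer_dense(units = 64, activation = "relu", input_shape = c(20)) %>%
  layer_gaussian_dropout(rate = 0.3) %>%   # active only during training
  layer_dense(units = 1)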


See also

Other noise layers: layer_alpha_dropout, layer_gaussian_noise

diff --git a/website/reference/layer_gaussian_noise.html b/website/reference/layer_gaussian_noise.html
new file mode 100644
index 000000000..b3b39a32c
--- /dev/null
+++ b/website/reference/layer_gaussian_noise.html
@@ -0,0 +1,235 @@

Apply additive zero-centered Gaussian noise. — layer_gaussian_noise • keras

This is useful to mitigate overfitting (you could see it as a form of random data augmentation). Gaussian noise is a natural choice as a corruption process for real-valued inputs. As it is a regularization layer, it is only active at training time.

layer_gaussian_noise(object, stddev, input_shape = NULL,
  batch_input_shape = NULL, batch_size = NULL, dtype = NULL,
  name = NULL, trainable = NULL, weights = NULL)

    Arguments

object
Model or layer object

stddev
float, standard deviation of the noise distribution.

input_shape
Dimensionality of the input (integer) not including the samples axis. This argument is required when using this layer as the first layer in a model.

batch_input_shape
Shapes, including the batch size. For instance, batch_input_shape = c(10, 32) indicates that the expected input will be batches of 10 32-dimensional vectors. batch_input_shape = list(NULL, 32) indicates batches of an arbitrary number of 32-dimensional vectors.

batch_size
Fixed batch size for layer

dtype
The data type expected by the input, as a string (float32, float64, int32...)

name
An optional name string for the layer. Should be unique in a model (do not reuse the same name twice). It will be autogenerated if it isn't provided.

trainable
Whether the layer weights will be updated during training.

weights
Initial weights for layer.

Input shape

Arbitrary. Use the keyword argument input_shape (list of integers, does not include the samples axis) when using this layer as the first layer in a model.

Output shape

Same shape as input.
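A minimal sketch (sizes are illustrative); the noise is injected only while training:

library(keras)

model <- keras_model_sequential()
model %>%
  layer_gaussian_noise(stddev = 0.1, input_shape = c(32)) %>%
  layer_dense(units = 16, activation = "relu") %>%
  layer_dense(units = 1)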

See also

Other noise layers: layer_alpha_dropout, layer_gaussian_dropout

diff --git a/website/reference/layer_global_average_pooling_1d.html b/website/reference/layer_global_average_pooling_1d.html
new file mode 100644
index 000000000..0f7fb0840
--- /dev/null
+++ b/website/reference/layer_global_average_pooling_1d.html
@@ -0,0 +1,218 @@

Global average pooling operation for temporal data. — layer_global_average_pooling_1d • keras

    Global average pooling operation for temporal data.

layer_global_average_pooling_1d(object, batch_size = NULL, name = NULL,
  trainable = NULL, weights = NULL)

    Arguments

object
Model or layer object

batch_size
Fixed batch size for layer

name
An optional name string for the layer. Should be unique in a model (do not reuse the same name twice). It will be autogenerated if it isn't provided.

trainable
Whether the layer weights will be updated during training.

weights
Initial weights for layer.

Input shape

3D tensor with shape: (batch_size, steps, features).

Output shape

2D tensor with shape: (batch_size, channels)
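A minimal sketch (vocabulary and embedding sizes are illustrative assumptions):

library(keras)

model <- keras_model_sequential()
model %>%
  layer_embedding(input_dim = 1000, output_dim = 64) %>%
  # (batch_size, steps, 64) -> (batch_size, 64), averaged over timesteps
  layer_global_average_pooling_1d() %>%
  layer_dense(units = 1, activation = "sigmoid")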

See also

Other pooling layers: layer_average_pooling_1d, layer_average_pooling_2d, layer_average_pooling_3d, layer_global_average_pooling_2d, layer_global_average_pooling_3d, layer_global_max_pooling_1d, layer_global_max_pooling_2d, layer_global_max_pooling_3d, layer_max_pooling_1d, layer_max_pooling_2d, layer_max_pooling_3d

diff --git a/website/reference/layer_global_average_pooling_2d.html b/website/reference/layer_global_average_pooling_2d.html
new file mode 100644
index 000000000..22962f30e
--- /dev/null
+++ b/website/reference/layer_global_average_pooling_2d.html
@@ -0,0 +1,228 @@

Global average pooling operation for spatial data. — layer_global_average_pooling_2d • keras

    Global average pooling operation for spatial data.

layer_global_average_pooling_2d(object, data_format = NULL,
  batch_size = NULL, name = NULL, trainable = NULL, weights = NULL)

    Arguments

object
Model or layer object

data_format
A string, one of channels_last (default) or channels_first. The ordering of the dimensions in the inputs. channels_last corresponds to inputs with shape (batch, height, width, channels) while channels_first corresponds to inputs with shape (batch, channels, height, width). It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json. If you never set it, then it will be "channels_last".

batch_size
Fixed batch size for layer

name
An optional name string for the layer. Should be unique in a model (do not reuse the same name twice). It will be autogenerated if it isn't provided.

trainable
Whether the layer weights will be updated during training.

weights
Initial weights for layer.

Input shape

• If data_format='channels_last': 4D tensor with shape: (batch_size, rows, cols, channels)
• If data_format='channels_first': 4D tensor with shape: (batch_size, channels, rows, cols)

Output shape

2D tensor with shape: (batch_size, channels)
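A minimal sketch (input size and filter count are illustrative assumptions); global pooling removes the need for layer_flatten() before the classifier:

library(keras)

model <- keras_model_sequential()
model %>%
  layer_conv_2d(filters = 64, kernel_size = c(3, 3), activation = "relu",
                input_shape = c(32, 32, 3)) %>%
  # (batch_size, 30, 30, 64) -> (batch_size, 64)
  layer_global_average_pooling_2d() %>%
  layer_dense(units = 10, activation = "softmax")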

See also

Other pooling layers: layer_average_pooling_1d, layer_average_pooling_2d, layer_average_pooling_3d, layer_global_average_pooling_1d, layer_global_average_pooling_3d, layer_global_max_pooling_1d, layer_global_max_pooling_2d, layer_global_max_pooling_3d, layer_max_pooling_1d, layer_max_pooling_2d, layer_max_pooling_3d

diff --git a/website/reference/layer_global_average_pooling_3d.html b/website/reference/layer_global_average_pooling_3d.html
new file mode 100644
index 000000000..7c8479ca5
--- /dev/null
+++ b/website/reference/layer_global_average_pooling_3d.html
@@ -0,0 +1,229 @@

Global Average pooling operation for 3D data. — layer_global_average_pooling_3d • keras

    Global Average pooling operation for 3D data.

layer_global_average_pooling_3d(object, data_format = NULL,
  batch_size = NULL, name = NULL, trainable = NULL, weights = NULL)

    Arguments

object
Model or layer object

data_format
A string, one of channels_last (default) or channels_first. The ordering of the dimensions in the inputs. channels_last corresponds to inputs with shape (batch, spatial_dim1, spatial_dim2, spatial_dim3, channels) while channels_first corresponds to inputs with shape (batch, channels, spatial_dim1, spatial_dim2, spatial_dim3). It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json. If you never set it, then it will be "channels_last".

batch_size
Fixed batch size for layer

name
An optional name string for the layer. Should be unique in a model (do not reuse the same name twice). It will be autogenerated if it isn't provided.

trainable
Whether the layer weights will be updated during training.

weights
Initial weights for layer.

Input shape

• If data_format='channels_last': 5D tensor with shape: (batch_size, spatial_dim1, spatial_dim2, spatial_dim3, channels)
• If data_format='channels_first': 5D tensor with shape: (batch_size, channels, spatial_dim1, spatial_dim2, spatial_dim3)

Output shape

2D tensor with shape: (batch_size, channels)

See also

Other pooling layers: layer_average_pooling_1d, layer_average_pooling_2d, layer_average_pooling_3d, layer_global_average_pooling_1d, layer_global_average_pooling_2d, layer_global_max_pooling_1d, layer_global_max_pooling_2d, layer_global_max_pooling_3d, layer_max_pooling_1d, layer_max_pooling_2d, layer_max_pooling_3d

diff --git a/website/reference/layer_global_max_pooling_1d.html b/website/reference/layer_global_max_pooling_1d.html
new file mode 100644
index 000000000..e0c4b616e
--- /dev/null
+++ b/website/reference/layer_global_max_pooling_1d.html
@@ -0,0 +1,218 @@

Global max pooling operation for temporal data. — layer_global_max_pooling_1d • keras

    Global max pooling operation for temporal data.

layer_global_max_pooling_1d(object, batch_size = NULL, name = NULL,
  trainable = NULL, weights = NULL)

    Arguments

object
Model or layer object

batch_size
Fixed batch size for layer

name
An optional name string for the layer. Should be unique in a model (do not reuse the same name twice). It will be autogenerated if it isn't provided.

trainable
Whether the layer weights will be updated during training.

weights
Initial weights for layer.

Input shape

3D tensor with shape: (batch_size, steps, features).

Output shape

2D tensor with shape: (batch_size, channels)

See also

Other pooling layers: layer_average_pooling_1d, layer_average_pooling_2d, layer_average_pooling_3d, layer_global_average_pooling_1d, layer_global_average_pooling_2d, layer_global_average_pooling_3d, layer_global_max_pooling_2d, layer_global_max_pooling_3d, layer_max_pooling_1d, layer_max_pooling_2d, layer_max_pooling_3d

diff --git a/website/reference/layer_global_max_pooling_2d.html b/website/reference/layer_global_max_pooling_2d.html
new file mode 100644
index 000000000..8de53c079
--- /dev/null
+++ b/website/reference/layer_global_max_pooling_2d.html
@@ -0,0 +1,228 @@

Global max pooling operation for spatial data. — layer_global_max_pooling_2d • keras

    Global max pooling operation for spatial data.

layer_global_max_pooling_2d(object, data_format = NULL, batch_size = NULL,
  name = NULL, trainable = NULL, weights = NULL)

    Arguments

object
Model or layer object

data_format
A string, one of channels_last (default) or channels_first. The ordering of the dimensions in the inputs. channels_last corresponds to inputs with shape (batch, height, width, channels) while channels_first corresponds to inputs with shape (batch, channels, height, width). It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json. If you never set it, then it will be "channels_last".

batch_size
Fixed batch size for layer

name
An optional name string for the layer. Should be unique in a model (do not reuse the same name twice). It will be autogenerated if it isn't provided.

trainable
Whether the layer weights will be updated during training.

weights
Initial weights for layer.

Input shape

• If data_format='channels_last': 4D tensor with shape: (batch_size, rows, cols, channels)
• If data_format='channels_first': 4D tensor with shape: (batch_size, channels, rows, cols)

Output shape

2D tensor with shape: (batch_size, channels)

See also

Other pooling layers: layer_average_pooling_1d, layer_average_pooling_2d, layer_average_pooling_3d, layer_global_average_pooling_1d, layer_global_average_pooling_2d, layer_global_average_pooling_3d, layer_global_max_pooling_1d, layer_global_max_pooling_3d, layer_max_pooling_1d, layer_max_pooling_2d, layer_max_pooling_3d

diff --git a/website/reference/layer_global_max_pooling_3d.html b/website/reference/layer_global_max_pooling_3d.html
new file mode 100644
index 000000000..15328f068
--- /dev/null
+++ b/website/reference/layer_global_max_pooling_3d.html
@@ -0,0 +1,229 @@

Global Max pooling operation for 3D data. — layer_global_max_pooling_3d • keras

    Global Max pooling operation for 3D data.

layer_global_max_pooling_3d(object, data_format = NULL, batch_size = NULL,
  name = NULL, trainable = NULL, weights = NULL)

    Arguments

object
Model or layer object

data_format
A string, one of channels_last (default) or channels_first. The ordering of the dimensions in the inputs. channels_last corresponds to inputs with shape (batch, spatial_dim1, spatial_dim2, spatial_dim3, channels) while channels_first corresponds to inputs with shape (batch, channels, spatial_dim1, spatial_dim2, spatial_dim3). It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json. If you never set it, then it will be "channels_last".

batch_size
Fixed batch size for layer

name
An optional name string for the layer. Should be unique in a model (do not reuse the same name twice). It will be autogenerated if it isn't provided.

trainable
Whether the layer weights will be updated during training.

weights
Initial weights for layer.

Input shape

• If data_format='channels_last': 5D tensor with shape: (batch_size, spatial_dim1, spatial_dim2, spatial_dim3, channels)
• If data_format='channels_first': 5D tensor with shape: (batch_size, channels, spatial_dim1, spatial_dim2, spatial_dim3)

Output shape

2D tensor with shape: (batch_size, channels)

See also

Other pooling layers: layer_average_pooling_1d, layer_average_pooling_2d, layer_average_pooling_3d, layer_global_average_pooling_1d, layer_global_average_pooling_2d, layer_global_average_pooling_3d, layer_global_max_pooling_1d, layer_global_max_pooling_2d, layer_max_pooling_1d, layer_max_pooling_2d, layer_max_pooling_3d

diff --git a/website/reference/layer_gru.html b/website/reference/layer_gru.html
new file mode 100644
index 000000000..78508bdb6
--- /dev/null
+++ b/website/reference/layer_gru.html
@@ -0,0 +1,415 @@

Gated Recurrent Unit - Cho et al. — layer_gru • keras

    Gated Recurrent Unit - Cho et al.

layer_gru(object, units, activation = "tanh",
  recurrent_activation = "hard_sigmoid", use_bias = TRUE,
  return_sequences = FALSE, return_state = FALSE, go_backwards = FALSE,
  stateful = FALSE, unroll = FALSE, implementation = 0L,
  kernel_initializer = "glorot_uniform",
  recurrent_initializer = "orthogonal", bias_initializer = "zeros",
  kernel_regularizer = NULL, recurrent_regularizer = NULL,
  bias_regularizer = NULL, activity_regularizer = NULL,
  kernel_constraint = NULL, recurrent_constraint = NULL,
  bias_constraint = NULL, dropout = 0, recurrent_dropout = 0,
  input_shape = NULL, batch_input_shape = NULL, batch_size = NULL,
  dtype = NULL, name = NULL, trainable = NULL, weights = NULL)

    Arguments

object
Model or layer object

units
Positive integer, dimensionality of the output space.

activation
Activation function to use. If you pass NULL, no activation is applied (ie. "linear" activation: a(x) = x).

recurrent_activation
Activation function to use for the recurrent step.

use_bias
Boolean, whether the layer uses a bias vector.

return_sequences
Boolean. Whether to return the last output in the output sequence, or the full sequence.

return_state
Boolean (default FALSE). Whether to return the last state in addition to the output.

go_backwards
Boolean (default FALSE). If TRUE, process the input sequence backwards and return the reversed sequence.

stateful
Boolean (default FALSE). If TRUE, the last state for each sample at index i in a batch will be used as initial state for the sample of index i in the following batch.

unroll
Boolean (default FALSE). If TRUE, the network will be unrolled, else a symbolic loop will be used. Unrolling can speed up an RNN, although it tends to be more memory-intensive. Unrolling is only suitable for short sequences.

implementation
one of 0, 1, or 2. If set to 0, the RNN will use an implementation that uses fewer, larger matrix products, thus running faster on CPU but consuming more memory. If set to 1, the RNN will use more matrix products, but smaller ones, thus running slower (may actually be faster on GPU) while consuming less memory. If set to 2 (LSTM/GRU only), the RNN will combine the input gate, the forget gate and the output gate into a single matrix, enabling more time-efficient parallelization on the GPU.

kernel_initializer
Initializer for the kernel weights matrix, used for the linear transformation of the inputs.

recurrent_initializer
Initializer for the recurrent_kernel weights matrix, used for the linear transformation of the recurrent state.

bias_initializer
Initializer for the bias vector.

kernel_regularizer
Regularizer function applied to the kernel weights matrix.

recurrent_regularizer
Regularizer function applied to the recurrent_kernel weights matrix.

bias_regularizer
Regularizer function applied to the bias vector.

activity_regularizer
Regularizer function applied to the output of the layer (its "activation").

kernel_constraint
Constraint function applied to the kernel weights matrix.

recurrent_constraint
Constraint function applied to the recurrent_kernel weights matrix.

bias_constraint
Constraint function applied to the bias vector.

dropout
Float between 0 and 1. Fraction of the units to drop for the linear transformation of the inputs.

recurrent_dropout
Float between 0 and 1. Fraction of the units to drop for the linear transformation of the recurrent state.

input_shape
Dimensionality of the input (integer) not including the samples axis. This argument is required when using this layer as the first layer in a model.

batch_input_shape
Shapes, including the batch size. For instance, batch_input_shape = c(10, 32) indicates that the expected input will be batches of 10 32-dimensional vectors. batch_input_shape = list(NULL, 32) indicates batches of an arbitrary number of 32-dimensional vectors.

batch_size
Fixed batch size for layer

dtype
The data type expected by the input, as a string (float32, float64, int32...)

name
An optional name string for the layer. Should be unique in a model (do not reuse the same name twice). It will be autogenerated if it isn't provided.

trainable
Whether the layer weights will be updated during training.

weights
Initial weights for layer.

Input shapes

3D tensor with shape (batch_size, timesteps, input_dim), (Optional) 2D tensors with shape (batch_size, output_dim).

Output shape

• if return_state: a list of tensors. The first tensor is the output. The remaining tensors are the last states, each with shape (batch_size, units).
• if return_sequences: 3D tensor with shape (batch_size, timesteps, units).
• else, 2D tensor with shape (batch_size, units).

    Masking

This layer supports masking for input data with a variable number of timesteps. To introduce masks to your data, use an embedding layer with the mask_zero parameter set to TRUE.

Statefulness in RNNs

You can set RNN layers to be 'stateful', which means that the states computed for the samples in one batch will be reused as initial states for the samples in the next batch. This assumes a one-to-one mapping between samples in different successive batches.

To enable statefulness (see the sketch after this list):

• Specify stateful = TRUE in the layer constructor.
• Specify a fixed batch size for your model. For sequential models, pass batch_input_shape = c(...) to the first layer in your model. For functional models with 1 or more Input layers, pass batch_shape = c(...) to all the first layers in your model. This is the expected shape of your inputs including the batch size. It should be a vector of integers, e.g. c(32, 10, 100).
• Specify shuffle = FALSE when calling fit().

To reset the states of your model, call reset_states() on either a specific layer, or on your entire model.
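A minimal sketch of the three steps above (all sizes are illustrative; the fit() call is commented out because x_train and y_train are hypothetical):

library(keras)

model <- keras_model_sequential()
model %>%
  # fixed batch size of 32, 10 timesteps, 16 features per step
  layer_gru(units = 8, stateful = TRUE, batch_input_shape = c(32, 10, 16)) %>%
  layer_dense(units = 1)

# model %>% fit(x_train, y_train, batch_size = 32, shuffle = FALSE)
# reset_states(model)   # clear states, e.g. between independent sequences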

Initial State of RNNs

You can specify the initial state of RNN layers symbolically by calling them with the keyword argument initial_state. The value of initial_state should be a tensor or list of tensors representing the initial state of the RNN layer.

You can specify the initial state of RNN layers numerically by calling reset_states with the keyword argument states. The value of states should be an array or list of arrays representing the initial state of the RNN layer.


See also

Other recurrent layers: layer_lstm, layer_simple_rnn

diff --git a/website/reference/layer_input.html b/website/reference/layer_input.html
new file mode 100644
index 000000000..81847044c
--- /dev/null
+++ b/website/reference/layer_input.html
@@ -0,0 +1,225 @@

Input layer — layer_input • keras

    Layer to be used as an entry point into a graph.

layer_input(shape = NULL, batch_shape = NULL, name = NULL, dtype = NULL,
  sparse = FALSE, tensor = NULL)

    Arguments

shape
Shape, not including the batch size. For instance, shape = c(32) indicates that the expected input will be batches of 32-dimensional vectors.

batch_shape
Shapes, including the batch size. For instance, batch_shape = c(10, 32) indicates that the expected input will be batches of 10 32-dimensional vectors. batch_shape = list(NULL, 32) indicates batches of an arbitrary number of 32-dimensional vectors.

name
An optional name string for the layer. Should be unique in a model (do not reuse the same name twice). It will be autogenerated if it isn't provided.

dtype
The data type expected by the input, as a string (float32, float64, int32...)

sparse
Boolean, whether the placeholder created is meant to be sparse.

tensor
Existing tensor to wrap into the Input layer. If set, the layer will not create a placeholder tensor.

Value

A tensor

Details

It can either wrap an existing tensor (pass a tensor argument) or create a placeholder tensor (pass a shape or batch_shape argument, optionally together with dtype).
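A minimal functional-API sketch (the 784-unit input size and layer widths are illustrative assumptions):

library(keras)

inputs <- layer_input(shape = c(784))
outputs <- inputs %>%
  layer_dense(units = 64, activation = "relu") %>%
  layer_dense(units = 10, activation = "softmax")
model <- keras_model(inputs = inputs, outputs = outputs)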

See also

Other core layers: layer_activation, layer_activity_regularization, layer_dense, layer_dropout, layer_flatten, layer_lambda, layer_masking, layer_permute, layer_repeat_vector, layer_reshape

diff --git a/website/reference/layer_lambda.html b/website/reference/layer_lambda.html
new file mode 100644
index 000000000..a6560bcff
--- /dev/null
+++ b/website/reference/layer_lambda.html
@@ -0,0 +1,253 @@

Wraps arbitrary expression as a layer — layer_lambda • keras

    Wraps arbitrary expression as a layer

layer_lambda(object, f, output_shape = NULL, mask = NULL,
  arguments = NULL, input_shape = NULL, batch_input_shape = NULL,
  batch_size = NULL, dtype = NULL, name = NULL, trainable = NULL,
  weights = NULL)

    Arguments

object
Model or layer object

f
The function to be evaluated. Takes input tensor as first argument.

output_shape
Expected output shape from the function (not required when using the TensorFlow back-end).

mask
Optional mask to associate with the layer output.

arguments
optional named list of keyword arguments to be passed to the function.

input_shape
Dimensionality of the input (integer) not including the samples axis. This argument is required when using this layer as the first layer in a model.

batch_input_shape
Shapes, including the batch size. For instance, batch_input_shape = c(10, 32) indicates that the expected input will be batches of 10 32-dimensional vectors. batch_input_shape = list(NULL, 32) indicates batches of an arbitrary number of 32-dimensional vectors.

batch_size
Fixed batch size for layer

dtype
The data type expected by the input, as a string (float32, float64, int32...)

name
An optional name string for the layer. Should be unique in a model (do not reuse the same name twice). It will be autogenerated if it isn't provided.

trainable
Whether the layer weights will be updated during training.

weights
Initial weights for layer.

Input shape

Arbitrary. Use the keyword argument input_shape (list of integers, does not include the samples axis) when using this layer as the first layer in a model.

Output shape

Arbitrary (based on tensor returned from the function)
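A minimal sketch (the sizes are illustrative; x^2 relies on the arithmetic operators that the tensorflow package defines for tensors):

library(keras)

model <- keras_model_sequential()
model %>%
  layer_dense(units = 32, input_shape = c(16)) %>%
  layer_lambda(function(x) x^2)   # square the activations element-wise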

See also

Other core layers: layer_activation, layer_activity_regularization, layer_dense, layer_dropout, layer_flatten, layer_input, layer_masking, layer_permute, layer_repeat_vector, layer_reshape

diff --git a/website/reference/layer_locally_connected_1d.html b/website/reference/layer_locally_connected_1d.html
new file mode 100644
index 000000000..f02cca59b
--- /dev/null
+++ b/website/reference/layer_locally_connected_1d.html
@@ -0,0 +1,281 @@

Locally-connected layer for 1D inputs. — layer_locally_connected_1d • keras

layer_locally_connected_1d() works similarly to layer_conv_1d(), except that weights are unshared, that is, a different set of filters is applied at each different patch of the input.

layer_locally_connected_1d(object, filters, kernel_size, strides = 1L,
  padding = "valid", data_format = NULL, activation = NULL,
  use_bias = TRUE, kernel_initializer = "glorot_uniform",
  bias_initializer = "zeros", kernel_regularizer = NULL,
  bias_regularizer = NULL, activity_regularizer = NULL,
  kernel_constraint = NULL, bias_constraint = NULL, batch_size = NULL,
  name = NULL, trainable = NULL, weights = NULL)

    Arguments

object
Model or layer object

filters
Integer, the dimensionality of the output space (i.e. the number of output filters in the convolution).

kernel_size
An integer or list of a single integer, specifying the length of the 1D convolution window.

strides
An integer or list of a single integer, specifying the stride length of the convolution. Specifying any stride value != 1 is incompatible with specifying any dilation_rate value != 1.

padding
Currently only supports "valid" (case-insensitive). "same" may be supported in the future.

data_format
A string, one of channels_last (default) or channels_first. The ordering of the dimensions in the inputs. channels_last corresponds to inputs with shape (batch, height, width, channels) while channels_first corresponds to inputs with shape (batch, channels, height, width). It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json. If you never set it, then it will be "channels_last".

activation
Activation function to use. If you don't specify anything, no activation is applied (ie. "linear" activation: a(x) = x).

use_bias
Boolean, whether the layer uses a bias vector.

kernel_initializer
Initializer for the kernel weights matrix.

bias_initializer
Initializer for the bias vector.

kernel_regularizer
Regularizer function applied to the kernel weights matrix.

bias_regularizer
Regularizer function applied to the bias vector.

activity_regularizer
Regularizer function applied to the output of the layer (its "activation").

kernel_constraint
Constraint function applied to the kernel matrix.

bias_constraint
Constraint function applied to the bias vector.

batch_size
Fixed batch size for layer

name
An optional name string for the layer. Should be unique in a model (do not reuse the same name twice). It will be autogenerated if it isn't provided.

trainable
Whether the layer weights will be updated during training.

weights
Initial weights for layer.

Input shape

3D tensor with shape: (batch_size, steps, input_dim)

Output shape

3D tensor with shape: (batch_size, new_steps, filters). The steps value might have changed due to padding or strides.
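A minimal sketch using the functional API, since the signature above takes its input shape from the incoming tensor (all sizes are illustrative assumptions):

library(keras)

inputs <- layer_input(shape = c(10, 32))   # 10 steps, 32 features
outputs <- inputs %>%
  layer_locally_connected_1d(filters = 64, kernel_size = 3) %>%  # (batch, 8, 64)
  layer_global_max_pooling_1d()
model <- keras_model(inputs = inputs, outputs = outputs)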

See also

Other locally connected layers: layer_locally_connected_2d

diff --git a/website/reference/layer_locally_connected_2d.html b/website/reference/layer_locally_connected_2d.html
new file mode 100644
index 000000000..f9fd13d3d
--- /dev/null
+++ b/website/reference/layer_locally_connected_2d.html
@@ -0,0 +1,286 @@

Locally-connected layer for 2D inputs. — layer_locally_connected_2d • keras

layer_locally_connected_2d() works similarly to layer_conv_2d(), except that weights are unshared, that is, a different set of filters is applied at each different patch of the input.

layer_locally_connected_2d(object, filters, kernel_size, strides = c(1L, 1L),
  padding = "valid", data_format = NULL, activation = NULL,
  use_bias = TRUE, kernel_initializer = "glorot_uniform",
  bias_initializer = "zeros", kernel_regularizer = NULL,
  bias_regularizer = NULL, activity_regularizer = NULL,
  kernel_constraint = NULL, bias_constraint = NULL, batch_size = NULL,
  name = NULL, trainable = NULL, weights = NULL)

    Arguments

object
Model or layer object

filters
Integer, the dimensionality of the output space (i.e. the number of output filters in the convolution).

kernel_size
An integer or list of 2 integers, specifying the width and height of the 2D convolution window. Can be a single integer to specify the same value for all spatial dimensions.

strides
An integer or list of 2 integers, specifying the strides of the convolution along the width and height. Can be a single integer to specify the same value for all spatial dimensions. Specifying any stride value != 1 is incompatible with specifying any dilation_rate value != 1.

padding
Currently only supports "valid" (case-insensitive). "same" may be supported in the future.

data_format
A string, one of channels_last (default) or channels_first. The ordering of the dimensions in the inputs. channels_last corresponds to inputs with shape (batch, width, height, channels) while channels_first corresponds to inputs with shape (batch, channels, width, height). It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json. If you never set it, then it will be "channels_last".

activation
Activation function to use. If you don't specify anything, no activation is applied (ie. "linear" activation: a(x) = x).

use_bias
Boolean, whether the layer uses a bias vector.

kernel_initializer
Initializer for the kernel weights matrix.

bias_initializer
Initializer for the bias vector.

kernel_regularizer
Regularizer function applied to the kernel weights matrix.

bias_regularizer
Regularizer function applied to the bias vector.

activity_regularizer
Regularizer function applied to the output of the layer (its "activation").

kernel_constraint
Constraint function applied to the kernel matrix.

bias_constraint
Constraint function applied to the bias vector.

batch_size
Fixed batch size for layer

name
An optional name string for the layer. Should be unique in a model (do not reuse the same name twice). It will be autogenerated if it isn't provided.

trainable
Whether the layer weights will be updated during training.

weights
Initial weights for layer.

Input shape

4D tensor with shape: (samples, channels, rows, cols) if data_format='channels_first' or 4D tensor with shape: (samples, rows, cols, channels) if data_format='channels_last'.

Output shape

4D tensor with shape: (samples, filters, new_rows, new_cols) if data_format='channels_first' or 4D tensor with shape: (samples, new_rows, new_cols, filters) if data_format='channels_last'. rows and cols values might have changed due to padding.

See also

Other locally connected layers: layer_locally_connected_1d

diff --git a/website/reference/layer_lstm.html b/website/reference/layer_lstm.html
new file mode 100644
index 000000000..16829f22f
--- /dev/null
+++ b/website/reference/layer_lstm.html
@@ -0,0 +1,424 @@

Long-Short Term Memory unit - Hochreiter 1997. — layer_lstm • keras

For a step-by-step description of the algorithm, see this tutorial.

layer_lstm(object, units, activation = "tanh",
  recurrent_activation = "hard_sigmoid", use_bias = TRUE,
  return_sequences = FALSE, return_state = FALSE, go_backwards = FALSE,
  stateful = FALSE, unroll = FALSE, implementation = 0L,
  kernel_initializer = "glorot_uniform",
  recurrent_initializer = "orthogonal", bias_initializer = "zeros",
  unit_forget_bias = TRUE, kernel_regularizer = NULL,
  recurrent_regularizer = NULL, bias_regularizer = NULL,
  activity_regularizer = NULL, kernel_constraint = NULL,
  recurrent_constraint = NULL, bias_constraint = NULL, dropout = 0,
  recurrent_dropout = 0, input_shape = NULL, batch_input_shape = NULL,
  batch_size = NULL, dtype = NULL, name = NULL, trainable = NULL,
  weights = NULL)

    Arguments

object
Model or layer object

units
Positive integer, dimensionality of the output space.

activation
Activation function to use. If you pass NULL, no activation is applied (ie. "linear" activation: a(x) = x).

recurrent_activation
Activation function to use for the recurrent step.

use_bias
Boolean, whether the layer uses a bias vector.

return_sequences
Boolean. Whether to return the last output in the output sequence, or the full sequence.

return_state
Boolean (default FALSE). Whether to return the last state in addition to the output.

go_backwards
Boolean (default FALSE). If TRUE, process the input sequence backwards and return the reversed sequence.

stateful
Boolean (default FALSE). If TRUE, the last state for each sample at index i in a batch will be used as initial state for the sample of index i in the following batch.

unroll
Boolean (default FALSE). If TRUE, the network will be unrolled, else a symbolic loop will be used. Unrolling can speed up an RNN, although it tends to be more memory-intensive. Unrolling is only suitable for short sequences.

implementation
one of 0, 1, or 2. If set to 0, the RNN will use an implementation that uses fewer, larger matrix products, thus running faster on CPU but consuming more memory. If set to 1, the RNN will use more matrix products, but smaller ones, thus running slower (may actually be faster on GPU) while consuming less memory. If set to 2 (LSTM/GRU only), the RNN will combine the input gate, the forget gate and the output gate into a single matrix, enabling more time-efficient parallelization on the GPU.

kernel_initializer
Initializer for the kernel weights matrix, used for the linear transformation of the inputs.

recurrent_initializer
Initializer for the recurrent_kernel weights matrix, used for the linear transformation of the recurrent state.

bias_initializer
Initializer for the bias vector.

unit_forget_bias
Boolean. If TRUE, add 1 to the bias of the forget gate at initialization. Setting it to TRUE will also force bias_initializer = "zeros". This is recommended in Jozefowicz et al.

kernel_regularizer
Regularizer function applied to the kernel weights matrix.

recurrent_regularizer
Regularizer function applied to the recurrent_kernel weights matrix.

bias_regularizer
Regularizer function applied to the bias vector.

activity_regularizer
Regularizer function applied to the output of the layer (its "activation").

kernel_constraint
Constraint function applied to the kernel weights matrix.

recurrent_constraint
Constraint function applied to the recurrent_kernel weights matrix.

bias_constraint
Constraint function applied to the bias vector.

dropout
Float between 0 and 1. Fraction of the units to drop for the linear transformation of the inputs.

recurrent_dropout
Float between 0 and 1. Fraction of the units to drop for the linear transformation of the recurrent state.

input_shape
Dimensionality of the input (integer) not including the samples axis. This argument is required when using this layer as the first layer in a model.

batch_input_shape
Shapes, including the batch size. For instance, batch_input_shape = c(10, 32) indicates that the expected input will be batches of 10 32-dimensional vectors. batch_input_shape = list(NULL, 32) indicates batches of an arbitrary number of 32-dimensional vectors.

batch_size
Fixed batch size for layer

dtype
The data type expected by the input, as a string (float32, float64, int32...)

name
An optional name string for the layer. Should be unique in a model (do not reuse the same name twice). It will be autogenerated if it isn't provided.

trainable
Whether the layer weights will be updated during training.

weights
Initial weights for layer.

Input shapes

3D tensor with shape (batch_size, timesteps, input_dim), (Optional) 2D tensors with shape (batch_size, output_dim).

Output shape

• if return_state: a list of tensors. The first tensor is the output. The remaining tensors are the last states, each with shape (batch_size, units).
• if return_sequences: 3D tensor with shape (batch_size, timesteps, units).
• else, 2D tensor with shape (batch_size, units).
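A minimal sequence-classification sketch (vocabulary and layer sizes are illustrative assumptions); with mask_zero = TRUE the embedding feeds masks to the LSTM as described in the Masking section below:

library(keras)

model <- keras_model_sequential()
model %>%
  layer_embedding(input_dim = 1000, output_dim = 64, mask_zero = TRUE) %>%
  layer_lstm(units = 32) %>%   # default: 2D output (batch_size, 32)
  layer_dense(units = 1, activation = "sigmoid")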

    Masking

This layer supports masking for input data with a variable number of timesteps. To introduce masks to your data, use an embedding layer with the mask_zero parameter set to TRUE.

Statefulness in RNNs

You can set RNN layers to be 'stateful', which means that the states computed for the samples in one batch will be reused as initial states for the samples in the next batch. This assumes a one-to-one mapping between samples in different successive batches.

To enable statefulness:

• Specify stateful = TRUE in the layer constructor.
• Specify a fixed batch size for your model. For sequential models, pass batch_input_shape = c(...) to the first layer in your model. For functional models with 1 or more Input layers, pass batch_shape = c(...) to all the first layers in your model. This is the expected shape of your inputs including the batch size. It should be a vector of integers, e.g. c(32, 10, 100).
• Specify shuffle = FALSE when calling fit().

To reset the states of your model, call reset_states() on either a specific layer, or on your entire model.

Initial State of RNNs

You can specify the initial state of RNN layers symbolically by calling them with the keyword argument initial_state. The value of initial_state should be a tensor or list of tensors representing the initial state of the RNN layer.

You can specify the initial state of RNN layers numerically by calling reset_states with the keyword argument states. The value of states should be an array or list of arrays representing the initial state of the RNN layer.


See also

Other recurrent layers: layer_gru, layer_simple_rnn

diff --git a/website/reference/layer_masking.html b/website/reference/layer_masking.html
new file mode 100644
index 000000000..acc00d595
--- /dev/null
+++ b/website/reference/layer_masking.html
@@ -0,0 +1,227 @@

Masks a sequence by using a mask value to skip timesteps. — layer_masking • keras

For each timestep in the input tensor (dimension #1 in the tensor), if all values in the input tensor at that timestep are equal to mask_value, then the timestep will be masked (skipped) in all downstream layers (as long as they support masking). If any downstream layer does not support masking yet receives such an input mask, an exception will be raised.

layer_masking(object, mask_value = 0, input_shape = NULL,
  batch_input_shape = NULL, batch_size = NULL, dtype = NULL,
  name = NULL, trainable = NULL, weights = NULL)

    Arguments

object
Model or layer object

mask_value
float, mask value

input_shape
Dimensionality of the input (integer) not including the samples axis. This argument is required when using this layer as the first layer in a model.

batch_input_shape
Shapes, including the batch size. For instance, batch_input_shape = c(10, 32) indicates that the expected input will be batches of 10 32-dimensional vectors. batch_input_shape = list(NULL, 32) indicates batches of an arbitrary number of 32-dimensional vectors.

batch_size
Fixed batch size for layer

dtype
The data type expected by the input, as a string (float32, float64, int32...)

name
An optional name string for the layer. Should be unique in a model (do not reuse the same name twice). It will be autogenerated if it isn't provided.

trainable
Whether the layer weights will be updated during training.

weights
Initial weights for layer.
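A minimal sketch (the shapes are illustrative): timesteps whose 16 feature values are all 0 are skipped by the downstream LSTM:

library(keras)

model <- keras_model_sequential()
model %>%
  layer_masking(mask_value = 0, input_shape = c(10, 16)) %>%
  layer_lstm(units = 32)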

See also

Other core layers: layer_activation, layer_activity_regularization, layer_dense, layer_dropout, layer_flatten, layer_input, layer_lambda, layer_permute, layer_repeat_vector, layer_reshape

diff --git a/website/reference/layer_max_pooling_1d.html b/website/reference/layer_max_pooling_1d.html
new file mode 100644
index 000000000..f0826aa5d
--- /dev/null
+++ b/website/reference/layer_max_pooling_1d.html
@@ -0,0 +1,230 @@

Max pooling operation for temporal data. — layer_max_pooling_1d • keras

    Max pooling operation for temporal data.

layer_max_pooling_1d(object, pool_size = 2L, strides = NULL,
  padding = "valid", batch_size = NULL, name = NULL, trainable = NULL,
  weights = NULL)

    Arguments

object
Model or layer object

pool_size
Integer, size of the max pooling windows.

strides
Integer, or NULL. Factor by which to downscale. E.g. 2 will halve the input. If NULL, it will default to pool_size.

padding
One of "valid" or "same" (case-insensitive).

batch_size
Fixed batch size for layer

name
An optional name string for the layer. Should be unique in a model (do not reuse the same name twice). It will be autogenerated if it isn't provided.

trainable
Whether the layer weights will be updated during training.

weights
Initial weights for layer.

Input shape

3D tensor with shape: (batch_size, steps, features).

Output shape

3D tensor with shape: (batch_size, downsampled_steps, features).

See also

Other pooling layers: layer_average_pooling_1d, layer_average_pooling_2d, layer_average_pooling_3d, layer_global_average_pooling_1d, layer_global_average_pooling_2d, layer_global_average_pooling_3d, layer_global_max_pooling_1d, layer_global_max_pooling_2d, layer_global_max_pooling_3d, layer_max_pooling_2d, layer_max_pooling_3d

diff --git a/website/reference/layer_max_pooling_2d.html b/website/reference/layer_max_pooling_2d.html
new file mode 100644
index 000000000..ad9889f10
--- /dev/null
+++ b/website/reference/layer_max_pooling_2d.html
@@ -0,0 +1,249 @@

Max pooling operation for spatial data. — layer_max_pooling_2d • keras

    Max pooling operation for spatial data.

layer_max_pooling_2d(object, pool_size = c(2L, 2L), strides = NULL,
  padding = "valid", data_format = NULL, batch_size = NULL, name = NULL,
  trainable = NULL, weights = NULL)

    Arguments

object
Model or layer object

pool_size
integer or list of 2 integers, factors by which to downscale (vertical, horizontal). (2, 2) will halve the input in both spatial dimensions. If only one integer is specified, the same window length will be used for both dimensions.

strides
Integer, list of 2 integers, or NULL. Strides values. If NULL, it will default to pool_size.

padding
One of "valid" or "same" (case-insensitive).

data_format
A string, one of channels_last (default) or channels_first. The ordering of the dimensions in the inputs. channels_last corresponds to inputs with shape (batch, height, width, channels) while channels_first corresponds to inputs with shape (batch, channels, height, width). It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json. If you never set it, then it will be "channels_last".

batch_size
Fixed batch size for layer

name
An optional name string for the layer. Should be unique in a model (do not reuse the same name twice). It will be autogenerated if it isn't provided.

trainable
Whether the layer weights will be updated during training.

weights
Initial weights for layer.

Input shape

• If data_format='channels_last': 4D tensor with shape: (batch_size, rows, cols, channels)
• If data_format='channels_first': 4D tensor with shape: (batch_size, channels, rows, cols)

Output shape

• If data_format='channels_last': 4D tensor with shape: (batch_size, pooled_rows, pooled_cols, channels)
• If data_format='channels_first': 4D tensor with shape: (batch_size, channels, pooled_rows, pooled_cols)
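A minimal sketch (the image size and filter count are illustrative assumptions):

library(keras)

model <- keras_model_sequential()
model %>%
  layer_conv_2d(filters = 32, kernel_size = c(3, 3), activation = "relu",
                input_shape = c(28, 28, 1)) %>%
  # (batch_size, 26, 26, 32) -> (batch_size, 13, 13, 32)
  layer_max_pooling_2d(pool_size = c(2, 2))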

See also

Other pooling layers: layer_average_pooling_1d, layer_average_pooling_2d, layer_average_pooling_3d, layer_global_average_pooling_1d, layer_global_average_pooling_2d, layer_global_average_pooling_3d, layer_global_max_pooling_1d, layer_global_max_pooling_2d, layer_global_max_pooling_3d, layer_max_pooling_1d, layer_max_pooling_3d

diff --git a/website/reference/layer_max_pooling_3d.html b/website/reference/layer_max_pooling_3d.html
new file mode 100644
index 000000000..7751f57f7
--- /dev/null
+++ b/website/reference/layer_max_pooling_3d.html
@@ -0,0 +1,248 @@

Max pooling operation for 3D data (spatial or spatio-temporal). — layer_max_pooling_3d • keras

    Max pooling operation for 3D data (spatial or spatio-temporal).

layer_max_pooling_3d(object, pool_size = c(2L, 2L, 2L), strides = NULL,
  padding = "valid", data_format = NULL, batch_size = NULL, name = NULL,
  trainable = NULL, weights = NULL)

    Arguments

object
Model or layer object

pool_size
list of 3 integers, factors by which to downscale (dim1, dim2, dim3). (2, 2, 2) will halve the size of the 3D input in each dimension.

strides
list of 3 integers, or NULL. Strides values.

padding
One of "valid" or "same" (case-insensitive).

data_format
A string, one of channels_last (default) or channels_first. The ordering of the dimensions in the inputs. channels_last corresponds to inputs with shape (batch, spatial_dim1, spatial_dim2, spatial_dim3, channels) while channels_first corresponds to inputs with shape (batch, channels, spatial_dim1, spatial_dim2, spatial_dim3). It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json. If you never set it, then it will be "channels_last".

batch_size
Fixed batch size for layer

name
An optional name string for the layer. Should be unique in a model (do not reuse the same name twice). It will be autogenerated if it isn't provided.

trainable
Whether the layer weights will be updated during training.

weights
Initial weights for layer.

    Input shape

    + + +
      +
    • If data_format='channels_last': 5D tensor with shape: (batch_size, spatial_dim1, spatial_dim2, spatial_dim3, channels)

    • +
    • If data_format='channels_first': 5D tensor with shape: (batch_size, channels, spatial_dim1, spatial_dim2, spatial_dim3)

    • +
    + +

    Output shape

    + + +
      +
    • If data_format='channels_last': 5D tensor with shape: (batch_size, pooled_dim1, pooled_dim2, pooled_dim3, channels)

    • +
    • If data_format='channels_first': 5D tensor with shape: (batch_size, channels, pooled_dim1, pooled_dim2, pooled_dim3)

    • +
    + +

    See also

    + +

    Other pooling layers: layer_average_pooling_1d, + layer_average_pooling_2d, + layer_average_pooling_3d, + layer_global_average_pooling_1d, + layer_global_average_pooling_2d, + layer_global_average_pooling_3d, + layer_global_max_pooling_1d, + layer_global_max_pooling_2d, + layer_global_max_pooling_3d, + layer_max_pooling_1d, + layer_max_pooling_2d

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/layer_maximum.html b/website/reference/layer_maximum.html new file mode 100644 index 000000000..67a347403 --- /dev/null +++ b/website/reference/layer_maximum.html @@ -0,0 +1,185 @@ + + + + + + + + +Layer that computes the maximum (element-wise) a list of inputs. — layer_maximum • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    It takes as input a list of tensors, all of the same shape, and returns a +single tensor (also of the same shape).

    + + +
    layer_maximum(inputs)
    + +

    Arguments

    + + + + + + +
    inputs

    A list of input tensors (at least 2).

    + +

    Value

    + +

    A tensor, the element-wise maximum of the inputs.
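For illustration, a minimal sketch of using this layer with the functional API (the input shape of 16 units is an arbitrary choice):

library(keras)

# two inputs of the same shape
input1 <- layer_input(shape = 16)
input2 <- layer_input(shape = 16)

# single output tensor: the element-wise maximum of the two inputs
out <- layer_maximum(list(input1, input2))

model <- keras_model(inputs = list(input1, input2), outputs = out)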

    + +

    See also

    + +

    Other merge layers: layer_add, + layer_average, + layer_concatenate, layer_dot, + layer_multiply

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/layer_multiply.html b/website/reference/layer_multiply.html new file mode 100644 index 000000000..f0c0e0b49 --- /dev/null +++ b/website/reference/layer_multiply.html @@ -0,0 +1,185 @@ + + + + + + + + +Layer that multiplies (element-wise) a list of inputs. — layer_multiply • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    It takes as input a list of tensors, all of the same shape, and returns a +single tensor (also of the same shape).

    + + +
    layer_multiply(inputs)
    + +

    Arguments

    + + + + + + +
    inputs

    A list of input tensors (at least 2).

    + +

    Value

    + +

    A tensor, the element-wise product of the inputs.

    + +

    See also

    + +

    Other merge layers: layer_add, + layer_average, + layer_concatenate, layer_dot, + layer_maximum

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/layer_permute.html b/website/reference/layer_permute.html new file mode 100644 index 000000000..9a8247ac9 --- /dev/null +++ b/website/reference/layer_permute.html @@ -0,0 +1,240 @@ + + + + + + + + +Permute the dimensions of an input according to a given pattern — layer_permute • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Permute the dimensions of an input according to a given pattern

    + + +
    layer_permute(object, dims, input_shape = NULL, batch_input_shape = NULL,
    +  batch_size = NULL, dtype = NULL, name = NULL, trainable = NULL,
    +  weights = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    object

    Model or layer object

    dims

    List of integers. Permutation pattern, does not include the +samples dimension. Indexing starts at 1. For instance, (2, 1) permutes +the first and second dimension of the input.

    input_shape

    Input shape (list of integers, does not include the +samples axis) which is required when using this layer as the first layer in +a model.

    batch_input_shape

    Shapes, including the batch size. For instance, +batch_input_shape=c(10, 32) indicates that the expected input will be +batches of 10 32-dimensional vectors. batch_input_shape=list(NULL, 32) +indicates batches of an arbitrary number of 32-dimensional vectors.

    batch_size

    Fixed batch size for layer

    dtype

    The data type expected by the input, as a string (float32, +float64, int32...)

    name

    An optional name string for the layer. Should be unique in a +model (do not reuse the same name twice). It will be autogenerated if it +isn't provided.

    trainable

    Whether the layer weights will be updated during training.

    weights

    Initial weights for layer.

    + +

    Note

    + +

    Useful for e.g. connecting RNNs and convnets together.

    + +

    Input and Output Shapes

    + + +

    Input shape: Arbitrary

    +

    Output shape: Same as the input shape, but with the dimensions re-ordered +according to the specified pattern.
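As an illustrative sketch (the shapes are arbitrary), permuting the first and second dimensions of a (10, 64) input:

library(keras)

model <- keras_model_sequential()
model %>%
  layer_permute(dims = c(2, 1), input_shape = c(10, 64))
# output shape is now (NULL, 64, 10); NULL is the batch dimension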

    + +

    See also

    + +

    Other core layers: layer_activation, + layer_activity_regularization, + layer_dense, layer_dropout, + layer_flatten, layer_input, + layer_lambda, layer_masking, + layer_repeat_vector, + layer_reshape

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/layer_repeat_vector.html b/website/reference/layer_repeat_vector.html new file mode 100644 index 000000000..d18118084 --- /dev/null +++ b/website/reference/layer_repeat_vector.html @@ -0,0 +1,215 @@ + + + + + + + + +Repeats the input n times. — layer_repeat_vector • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Repeats the input n times.

    + + +
    layer_repeat_vector(object, n, batch_size = NULL, name = NULL,
    +  trainable = NULL, weights = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + +
    object

    Model or layer object

    n

    integer, repetition factor.

    batch_size

    Fixed batch size for layer

    name

    An optional name string for the layer. Should be unique in a +model (do not reuse the same name twice). It will be autogenerated if it +isn't provided.

    trainable

    Whether the layer weights will be updated during training.

    weights

    Initial weights for layer.

    + +

    Input shape

    + +

    2D tensor of shape (num_samples, features).

    + +

    Output shape

    + +

    3D tensor of shape (num_samples, n, features).
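For example, a minimal sketch turning a 2D dense output into a 3D sequence (the unit counts are arbitrary):

library(keras)

model <- keras_model_sequential()
model %>%
  layer_dense(units = 32, input_shape = c(32)) %>%  # output shape: (NULL, 32)
  layer_repeat_vector(3)                            # output shape: (NULL, 3, 32)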

    + +

    See also

    + +

    Other core layers: layer_activation, + layer_activity_regularization, + layer_dense, layer_dropout, + layer_flatten, layer_input, + layer_lambda, layer_masking, + layer_permute, layer_reshape

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/layer_reshape.html b/website/reference/layer_reshape.html new file mode 100644 index 000000000..649fb14f5 --- /dev/null +++ b/website/reference/layer_reshape.html @@ -0,0 +1,233 @@ + + + + + + + + +Reshapes an output to a certain shape. — layer_reshape • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Reshapes an output to a certain shape.

    + + +
    layer_reshape(object, target_shape, input_shape = NULL,
    +  batch_input_shape = NULL, batch_size = NULL, dtype = NULL,
    +  name = NULL, trainable = NULL, weights = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    object

    Model or layer object

    target_shape

    List of integers, does not include the samples dimension +(batch size).

    input_shape

    Input shape (list of integers, does not include the +samples axis) which is required when using this layer as the first layer in +a model.

    batch_input_shape

    Shapes, including the batch size. For instance, +batch_input_shape=c(10, 32) indicates that the expected input will be +batches of 10 32-dimensional vectors. batch_input_shape=list(NULL, 32) +indicates batches of an arbitrary number of 32-dimensional vectors.

    batch_size

    Fixed batch size for layer

    dtype

    The data type expected by the input, as a string (float32, +float64, int32...)

    name

    An optional name string for the layer. Should be unique in a +model (do not reuse the same name twice). It will be autogenerated if it +isn't provided.

    trainable

    Whether the layer weights will be updated during training.

    weights

    Initial weights for layer.

    + +

    Input and Output Shapes

    + + +

Input shape: Arbitrary, although all dimensions in the input shape must be fixed.

    +

    Output shape: (batch_size,) + target_shape.
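A minimal sketch (the shapes are arbitrary) reshaping a flat 12-element input into a 3 x 4 output:

library(keras)

model <- keras_model_sequential()
model %>%
  layer_reshape(target_shape = c(3, 4), input_shape = c(12))
# output shape: (NULL, 3, 4)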

    + +

    See also

    + +

    Other core layers: layer_activation, + layer_activity_regularization, + layer_dense, layer_dropout, + layer_flatten, layer_input, + layer_lambda, layer_masking, + layer_permute, + layer_repeat_vector

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/layer_separable_conv_2d.html b/website/reference/layer_separable_conv_2d.html new file mode 100644 index 000000000..2de8ce114 --- /dev/null +++ b/website/reference/layer_separable_conv_2d.html @@ -0,0 +1,327 @@ + + + + + + + + +Depthwise separable 2D convolution. — layer_separable_conv_2d • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Separable convolutions consist in first performing a depthwise spatial +convolution (which acts on each input channel separately) followed by a +pointwise convolution which mixes together the resulting output channels. The +depth_multiplier argument controls how many output channels are generated +per input channel in the depthwise step. Intuitively, separable convolutions +can be understood as a way to factorize a convolution kernel into two smaller +kernels, or as an extreme version of an Inception block.

    + + +
    layer_separable_conv_2d(object, filters, kernel_size, strides = c(1L, 1L),
    +  padding = "valid", data_format = NULL, depth_multiplier = 1L,
    +  activation = NULL, use_bias = TRUE,
    +  depthwise_initializer = "glorot_uniform",
    +  pointwise_initializer = "glorot_uniform", bias_initializer = "zeros",
    +  depthwise_regularizer = NULL, pointwise_regularizer = NULL,
    +  bias_regularizer = NULL, activity_regularizer = NULL,
    +  depthwise_constraint = NULL, pointwise_constraint = NULL,
    +  bias_constraint = NULL, batch_size = NULL, name = NULL,
    +  trainable = NULL, weights = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    object

    Model or layer object

    filters

Integer, the dimensionality of the output space (i.e. the number of output filters in the convolution).

    kernel_size

    An integer or list of 2 integers, specifying the width and +height of the 2D convolution window. Can be a single integer to specify the +same value for all spatial dimensions.

    strides

    An integer or list of 2 integers, specifying the strides of +the convolution along the width and height. Can be a single integer to +specify the same value for all spatial dimensions. Specifying any stride +value != 1 is incompatible with specifying any dilation_rate value != 1.

    padding

    one of "valid" or "same" (case-insensitive).

    data_format

    A string, one of channels_last (default) or +channels_first. The ordering of the dimensions in the inputs. +channels_last corresponds to inputs with shape (batch, height, width, channels) while channels_first corresponds to inputs with shape (batch, channels, height, width). It defaults to the image_data_format value +found in your Keras config file at ~/.keras/keras.json. If you never set +it, then it will be "channels_last".

    depth_multiplier

The number of depthwise convolution output channels for each input channel. The total number of depthwise convolution output channels will be equal to filters_in * depth_multiplier.

    activation

Activation function to use. If you don't specify anything, no activation is applied (i.e. "linear" activation: a(x) = x).

    use_bias

    Boolean, whether the layer uses a bias vector.

    depthwise_initializer

    Initializer for the depthwise kernel matrix.

    pointwise_initializer

    Initializer for the pointwise kernel matrix.

    bias_initializer

    Initializer for the bias vector.

    depthwise_regularizer

    Regularizer function applied to the depthwise +kernel matrix.

    pointwise_regularizer

Regularizer function applied to the pointwise kernel matrix.

    bias_regularizer

    Regularizer function applied to the bias vector.

    activity_regularizer

Regularizer function applied to the output of the layer (its "activation").

    depthwise_constraint

    Constraint function applied to the depthwise +kernel matrix.

    pointwise_constraint

    Constraint function applied to the pointwise +kernel matrix.

    bias_constraint

    Constraint function applied to the bias vector.

    batch_size

    Fixed batch size for layer

    name

    An optional name string for the layer. Should be unique in a +model (do not reuse the same name twice). It will be autogenerated if it +isn't provided.

    trainable

    Whether the layer weights will be updated during training.

    weights

    Initial weights for layer.

    + +

    Input shape

    + +

    4D tensor with shape: (batch, channels, rows, cols) +if data_format='channels_first' or 4D tensor with shape: (batch, rows, cols, channels) if data_format='channels_last'.

    + +

    Output shape

    + +

    4D tensor with shape: (batch, filters, new_rows, new_cols) if data_format='channels_first' or 4D tensor with shape: +(batch, new_rows, new_cols, filters) if data_format='channels_last'. +rows and cols values might have changed due to padding.
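As an illustrative sketch (the filter count, kernel size, and input shape are arbitrary, and channels_last data format is assumed):

library(keras)

inputs <- layer_input(shape = c(64, 64, 3))
outputs <- inputs %>%
  layer_separable_conv_2d(filters = 32, kernel_size = c(3, 3),
                          activation = "relu")
model <- keras_model(inputs, outputs)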

    + +

    See also

    + +

    Other convolutional layers: layer_conv_1d, + layer_conv_2d_transpose, + layer_conv_2d, + layer_conv_3d_transpose, + layer_conv_3d, + layer_conv_lstm_2d, + layer_cropping_1d, + layer_cropping_2d, + layer_cropping_3d, + layer_upsampling_1d, + layer_upsampling_2d, + layer_upsampling_3d, + layer_zero_padding_1d, + layer_zero_padding_2d, + layer_zero_padding_3d

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/layer_simple_rnn.html b/website/reference/layer_simple_rnn.html new file mode 100644 index 000000000..2cc10cf03 --- /dev/null +++ b/website/reference/layer_simple_rnn.html @@ -0,0 +1,407 @@ + + + + + + + + +Fully-connected RNN where the output is to be fed back to input. — layer_simple_rnn • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Fully-connected RNN where the output is to be fed back to input.

    + + +
    layer_simple_rnn(object, units, activation = "tanh", use_bias = TRUE,
    +  return_sequences = FALSE, return_state = FALSE, go_backwards = FALSE,
    +  stateful = FALSE, unroll = FALSE, implementation = 0L,
    +  kernel_initializer = "glorot_uniform",
    +  recurrent_initializer = "orthogonal", bias_initializer = "zeros",
    +  kernel_regularizer = NULL, recurrent_regularizer = NULL,
    +  bias_regularizer = NULL, activity_regularizer = NULL,
    +  kernel_constraint = NULL, recurrent_constraint = NULL,
    +  bias_constraint = NULL, dropout = 0, recurrent_dropout = 0,
    +  input_shape = NULL, batch_input_shape = NULL, batch_size = NULL,
    +  dtype = NULL, name = NULL, trainable = NULL, weights = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    object

    Model or layer object

    units

    Positive integer, dimensionality of the output space.

    activation

Activation function to use. If you pass NULL, no activation is applied (i.e. "linear" activation: a(x) = x).

    use_bias

    Boolean, whether the layer uses a bias vector.

    return_sequences

    Boolean. Whether to return the last output in the +output sequence, or the full sequence.

    return_state

    Boolean (default FALSE). Whether to return the last state +in addition to the output.

    go_backwards

    Boolean (default FALSE). If TRUE, process the input +sequence backwards and return the reversed sequence.

    stateful

    Boolean (default FALSE). If TRUE, the last state for each +sample at index i in a batch will be used as initial state for the sample +of index i in the following batch.

    unroll

    Boolean (default FALSE). If TRUE, the network will be unrolled, +else a symbolic loop will be used. Unrolling can speed-up a RNN, although +it tends to be more memory-intensive. Unrolling is only suitable for short +sequences.

    implementation

    one of 0, 1, or 2. If set to 0, the RNN will use an +implementation that uses fewer, larger matrix products, thus running faster +on CPU but consuming more memory. If set to 1, the RNN will use more matrix +products, but smaller ones, thus running slower (may actually be faster on +GPU) while consuming less memory. If set to 2 (LSTM/GRU only), the RNN will +combine the input gate, the forget gate and the output gate into a single +matrix, enabling more time-efficient parallelization on the GPU.

    kernel_initializer

Initializer for the kernel weights matrix, used for the linear transformation of the inputs.

    recurrent_initializer

Initializer for the recurrent_kernel weights matrix, used for the linear transformation of the recurrent state.

    bias_initializer

    Initializer for the bias vector.

    kernel_regularizer

    Regularizer function applied to the kernel +weights matrix.

    recurrent_regularizer

    Regularizer function applied to the +recurrent_kernel weights matrix.

    bias_regularizer

    Regularizer function applied to the bias vector.

    activity_regularizer

Regularizer function applied to the output of the layer (its "activation").

    kernel_constraint

    Constraint function applied to the kernel weights +matrix.

    recurrent_constraint

    Constraint function applied to the +recurrent_kernel weights matrix.

    bias_constraint

    Constraint function applied to the bias vector.

    dropout

    Float between 0 and 1. Fraction of the units to drop for the +linear transformation of the inputs.

    recurrent_dropout

    Float between 0 and 1. Fraction of the units to drop +for the linear transformation of the recurrent state.

    input_shape

    Dimensionality of the input (integer) not including the +samples axis. This argument is required when using this layer as the first +layer in a model.

    batch_input_shape

    Shapes, including the batch size. For instance, +batch_input_shape=c(10, 32) indicates that the expected input will be +batches of 10 32-dimensional vectors. batch_input_shape=list(NULL, 32) +indicates batches of an arbitrary number of 32-dimensional vectors.

    batch_size

    Fixed batch size for layer

    dtype

    The data type expected by the input, as a string (float32, +float64, int32...)

    name

    An optional name string for the layer. Should be unique in a +model (do not reuse the same name twice). It will be autogenerated if it +isn't provided.

    trainable

    Whether the layer weights will be updated during training.

    weights

    Initial weights for layer.

    + +

    Input shapes

    + + +

3D tensor with shape (batch_size, timesteps, input_dim). Optionally, 2D tensors with shape (batch_size, output_dim) can be supplied as the initial state.

    + +

    Output shape

    + + +
      +
    • if return_state: a list of tensors. The first tensor is +the output. The remaining tensors are the last states, +each with shape (batch_size, units).

    • +
    • if return_sequences: 3D tensor with shape +(batch_size, timesteps, units).

    • +
    • else, 2D tensor with shape (batch_size, units).

    • +
    + +

    Masking

    + + +

    This layer supports masking for input data with a variable number +of timesteps. To introduce masks to your data, +use an embedding layer with the mask_zero parameter +set to TRUE.

    + +

    Statefulness in RNNs

    + + +

    You can set RNN layers to be 'stateful', which means that the states +computed for the samples in one batch will be reused as initial states +for the samples in the next batch. This assumes a one-to-one mapping +between samples in different successive batches.

    +

    To enable statefulness:

      +
    • Specify stateful=TRUE in the layer constructor.

    • +
    • Specify a fixed batch size for your model. For sequential models, +pass batch_input_shape = c(...) to the first layer in your model. +For functional models with 1 or more Input layers, pass +batch_shape = c(...) to all the first layers in your model. +This is the expected shape of your inputs including the batch size. +It should be a vector of integers, e.g. c(32, 10, 100).

    • +
    • Specify shuffle = FALSE when calling fit().

    • +
    +

    To reset the states of your model, call reset_states() on either +a specific layer, or on your entire model.
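Putting these steps together, a minimal sketch (the batch size of 32, 10 timesteps, and 100 features are arbitrary):

library(keras)

model <- keras_model_sequential()
model %>%
  layer_simple_rnn(units = 32, stateful = TRUE,
                   batch_input_shape = c(32, 10, 100))
model %>% compile(optimizer = "rmsprop", loss = "mse")

# ... train with fit(..., shuffle = FALSE), assuming a one-to-one
# mapping between samples in successive batches ...

reset_states(model)  # clear the accumulated states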

    + +

    Initial State of RNNs

    + + +

    You can specify the initial state of RNN layers symbolically by calling +them with the keyword argument initial_state. The value of +initial_state should be a tensor or list of tensors representing +the initial state of the RNN layer.

    +

You can specify the initial state of RNN layers numerically by calling reset_states with the keyword argument states. The value of states should be an R array or list of R arrays representing the initial state of the RNN layer.

    + +

    References

    + + + + +

    See also

    + +

    Other recurrent layers: layer_gru, + layer_lstm

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/layer_spatial_dropout_1d.html b/website/reference/layer_spatial_dropout_1d.html new file mode 100644 index 000000000..be5bbe52c --- /dev/null +++ b/website/reference/layer_spatial_dropout_1d.html @@ -0,0 +1,224 @@ + + + + + + + + +Spatial 1D version of Dropout. — layer_spatial_dropout_1d • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

This version performs the same function as Dropout; however, it drops entire 1D feature maps instead of individual elements. If adjacent frames within feature maps are strongly correlated (as is normally the case in early convolution layers) then regular dropout will not regularize the activations and will otherwise just result in an effective learning rate decrease. In this case, layer_spatial_dropout_1d will help promote independence between feature maps and should be used instead.

    + + +
    layer_spatial_dropout_1d(object, rate, batch_size = NULL, name = NULL,
    +  trainable = NULL, weights = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + +
    object

    Model or layer object

    rate

    float between 0 and 1. Fraction of the input units to drop.

    batch_size

    Fixed batch size for layer

    name

    An optional name string for the layer. Should be unique in a +model (do not reuse the same name twice). It will be autogenerated if it +isn't provided.

    trainable

    Whether the layer weights will be updated during training.

    weights

    Initial weights for layer.

    + +

    Input shape

    + +

    3D tensor with shape: (samples, timesteps, channels)

    + +

    Output shape

    + +

    Same as input
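For example, a minimal sketch placing the layer after an early convolution (the filter count, kernel size, and input shape are arbitrary):

library(keras)

model <- keras_model_sequential()
model %>%
  layer_conv_1d(filters = 64, kernel_size = 3,
                input_shape = c(100, 16)) %>%
  layer_spatial_dropout_1d(rate = 0.25)  # drops entire 1D feature maps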

    + +

    References

    + +

- Efficient Object Localization Using Convolutional Networks

    + +

    See also

    + +

    Other dropout layers: layer_dropout, + layer_spatial_dropout_2d, + layer_spatial_dropout_3d

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/layer_spatial_dropout_2d.html b/website/reference/layer_spatial_dropout_2d.html new file mode 100644 index 000000000..8cf91b747 --- /dev/null +++ b/website/reference/layer_spatial_dropout_2d.html @@ -0,0 +1,233 @@ + + + + + + + + +Spatial 2D version of Dropout. — layer_spatial_dropout_2d • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

This version performs the same function as Dropout; however, it drops entire 2D feature maps instead of individual elements. If adjacent pixels within feature maps are strongly correlated (as is normally the case in early convolution layers) then regular dropout will not regularize the activations and will otherwise just result in an effective learning rate decrease. In this case, layer_spatial_dropout_2d will help promote independence between feature maps and should be used instead.

    + + +
    layer_spatial_dropout_2d(object, rate, data_format = NULL,
    +  batch_size = NULL, name = NULL, trainable = NULL, weights = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    object

    Model or layer object

    rate

    float between 0 and 1. Fraction of the input units to drop.

    data_format

'channels_first' or 'channels_last'. In 'channels_first' mode, the channels dimension (the depth) is at index 1; in 'channels_last' mode it is at index 3. It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json. If you never set it, then it will be "channels_last".

    batch_size

    Fixed batch size for layer

    name

    An optional name string for the layer. Should be unique in a +model (do not reuse the same name twice). It will be autogenerated if it +isn't provided.

    trainable

    Whether the layer weights will be updated during training.

    weights

    Initial weights for layer.

    + +

    Input shape

    + +

    4D tensor with shape: (samples, channels, rows, cols) +if data_format='channels_first' or 4D tensor with shape: (samples, rows, cols, channels) if data_format='channels_last'.

    + +

    Output shape

    + +

    Same as input

    + +

    References

    + +

- Efficient Object Localization Using Convolutional Networks

    + +

    See also

    + +

    Other dropout layers: layer_dropout, + layer_spatial_dropout_1d, + layer_spatial_dropout_3d

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/layer_spatial_dropout_3d.html b/website/reference/layer_spatial_dropout_3d.html new file mode 100644 index 000000000..8faad1bcb --- /dev/null +++ b/website/reference/layer_spatial_dropout_3d.html @@ -0,0 +1,232 @@ + + + + + + + + +Spatial 3D version of Dropout. — layer_spatial_dropout_3d • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

This version performs the same function as Dropout; however, it drops entire 3D feature maps instead of individual elements. If adjacent voxels within feature maps are strongly correlated (as is normally the case in early convolution layers) then regular dropout will not regularize the activations and will otherwise just result in an effective learning rate decrease. In this case, layer_spatial_dropout_3d will help promote independence between feature maps and should be used instead.

    + + +
    layer_spatial_dropout_3d(object, rate, data_format = NULL,
    +  batch_size = NULL, name = NULL, trainable = NULL, weights = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    object

    Model or layer object

    rate

    float between 0 and 1. Fraction of the input units to drop.

    data_format

'channels_first' or 'channels_last'. In 'channels_first' mode, the channels dimension (the depth) is at index 1; in 'channels_last' mode it is at index 4. It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json. If you never set it, then it will be "channels_last".

    batch_size

    Fixed batch size for layer

    name

    An optional name string for the layer. Should be unique in a +model (do not reuse the same name twice). It will be autogenerated if it +isn't provided.

    trainable

    Whether the layer weights will be updated during training.

    weights

    Initial weights for layer.

    + +

    Input shape

    + +

    5D tensor with shape: (samples, channels, dim1, dim2, dim3) if data_format='channels_first' or 5D tensor with shape: (samples, dim1, dim2, dim3, channels) if data_format='channels_last'.

    + +

    Output shape

    + +

    Same as input

    + +

    References

    + +

- Efficient Object Localization Using Convolutional Networks

    + +

    See also

    + +

    Other dropout layers: layer_dropout, + layer_spatial_dropout_1d, + layer_spatial_dropout_2d

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/layer_upsampling_1d.html b/website/reference/layer_upsampling_1d.html new file mode 100644 index 000000000..cbdb2a829 --- /dev/null +++ b/website/reference/layer_upsampling_1d.html @@ -0,0 +1,224 @@ + + + + + + + + +Upsampling layer for 1D inputs. — layer_upsampling_1d • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Repeats each temporal step size times along the time axis.

    + + +
    layer_upsampling_1d(object, size = 2L, batch_size = NULL, name = NULL,
    +  trainable = NULL, weights = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + +
    object

    Model or layer object

    size

    integer. Upsampling factor.

    batch_size

    Fixed batch size for layer

    name

    An optional name string for the layer. Should be unique in a +model (do not reuse the same name twice). It will be autogenerated if it +isn't provided.

    trainable

    Whether the layer weights will be updated during training.

    weights

    Initial weights for layer.

    + +

    Input shape

    + +

    3D tensor with shape: (batch, steps, features).

    + +

    Output shape

    + +

    3D tensor with shape: (batch, upsampled_steps, features).
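A minimal sketch (the shapes are arbitrary); since the upsampling factor is 2, the temporal dimension doubles:

library(keras)

model <- keras_model_sequential()
model %>%
  layer_conv_1d(filters = 16, kernel_size = 3,
                input_shape = c(10, 8)) %>%  # output shape: (NULL, 8, 16)
  layer_upsampling_1d(size = 2)              # output shape: (NULL, 16, 16)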

    + +

    See also

    + +

    Other convolutional layers: layer_conv_1d, + layer_conv_2d_transpose, + layer_conv_2d, + layer_conv_3d_transpose, + layer_conv_3d, + layer_conv_lstm_2d, + layer_cropping_1d, + layer_cropping_2d, + layer_cropping_3d, + layer_separable_conv_2d, + layer_upsampling_2d, + layer_upsampling_3d, + layer_zero_padding_1d, + layer_zero_padding_2d, + layer_zero_padding_3d

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/layer_upsampling_2d.html b/website/reference/layer_upsampling_2d.html new file mode 100644 index 000000000..2db12a876 --- /dev/null +++ b/website/reference/layer_upsampling_2d.html @@ -0,0 +1,243 @@ + + + + + + + + +Upsampling layer for 2D inputs. — layer_upsampling_2d • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

Repeats the rows and columns of the data by size[[1]] and size[[2]] respectively.

    +


    + + +
    layer_upsampling_2d(object, size = c(2L, 2L), data_format = NULL,
    +  batch_size = NULL, name = NULL, trainable = NULL, weights = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    object

    Model or layer object

    size

    int, or list of 2 integers. The upsampling factors for rows and +columns.

    data_format

    A string, one of channels_last (default) or +channels_first. The ordering of the dimensions in the inputs. +channels_last corresponds to inputs with shape (batch, height, width, channels) while channels_first corresponds to inputs with shape (batch, channels, height, width). It defaults to the image_data_format value +found in your Keras config file at ~/.keras/keras.json. If you never set +it, then it will be "channels_last".

    batch_size

    Fixed batch size for layer

    name

    An optional name string for the layer. Should be unique in a +model (do not reuse the same name twice). It will be autogenerated if it +isn't provided.

    trainable

    Whether the layer weights will be updated during training.

    weights

    Initial weights for layer.

    + +

    Input shape

    + + +

    4D tensor with shape:

      +
    • If data_format is "channels_last": (batch, rows, cols, channels)

    • +
    • If data_format is "channels_first": (batch, channels, rows, cols)

    • +
    + +

    Output shape

    + + +

    4D tensor with shape:

      +
    • If data_format is "channels_last": (batch, upsampled_rows, upsampled_cols, channels)

    • +
    • If data_format is "channels_first": (batch, channels, upsampled_rows, upsampled_cols)

    • +
    + +

    See also

    + +

    Other convolutional layers: layer_conv_1d, + layer_conv_2d_transpose, + layer_conv_2d, + layer_conv_3d_transpose, + layer_conv_3d, + layer_conv_lstm_2d, + layer_cropping_1d, + layer_cropping_2d, + layer_cropping_3d, + layer_separable_conv_2d, + layer_upsampling_1d, + layer_upsampling_3d, + layer_zero_padding_1d, + layer_zero_padding_2d, + layer_zero_padding_3d

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/layer_upsampling_3d.html b/website/reference/layer_upsampling_3d.html new file mode 100644 index 000000000..8e78b4933 --- /dev/null +++ b/website/reference/layer_upsampling_3d.html @@ -0,0 +1,246 @@ + + + + + + + + +Upsampling layer for 3D inputs. — layer_upsampling_3d • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

Repeats the 1st, 2nd and 3rd dimensions of the data by size[[1]], size[[2]] and size[[3]] respectively.

    +


    + + +
    layer_upsampling_3d(object, size = c(2L, 2L, 2L), data_format = NULL,
    +  batch_size = NULL, name = NULL, trainable = NULL, weights = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    object

    Model or layer object

    size

    int, or list of 3 integers. The upsampling factors for dim1, dim2 +and dim3.

    data_format

    A string, one of channels_last (default) or +channels_first. The ordering of the dimensions in the inputs. +channels_last corresponds to inputs with shape (batch, spatial_dim1, spatial_dim2, spatial_dim3, channels) while channels_first corresponds +to inputs with shape (batch, channels, spatial_dim1, spatial_dim2, spatial_dim3). It defaults to the image_data_format value found in your +Keras config file at ~/.keras/keras.json. If you never set it, then it +will be "channels_last".

    batch_size

    Fixed batch size for layer

    name

    An optional name string for the layer. Should be unique in a +model (do not reuse the same name twice). It will be autogenerated if it +isn't provided.

    trainable

    Whether the layer weights will be updated during training.

    weights

    Initial weights for layer.

    + +

    Input shape

    + + +

    5D tensor with shape:

      +
    • If data_format is "channels_last": (batch, dim1, dim2, dim3, channels)

    • +
    • If data_format is "channels_first": (batch, channels, dim1, dim2, dim3)

    • +
    + +

    Output shape

    + + +

    5D tensor with shape:

      +
    • If data_format is "channels_last": (batch, upsampled_dim1, upsampled_dim2, upsampled_dim3, channels)

    • +
    • If data_format is "channels_first": (batch, channels, upsampled_dim1, upsampled_dim2, upsampled_dim3)

    • +
    + +

    See also

    + +

    Other convolutional layers: layer_conv_1d, + layer_conv_2d_transpose, + layer_conv_2d, + layer_conv_3d_transpose, + layer_conv_3d, + layer_conv_lstm_2d, + layer_cropping_1d, + layer_cropping_2d, + layer_cropping_3d, + layer_separable_conv_2d, + layer_upsampling_1d, + layer_upsampling_2d, + layer_zero_padding_1d, + layer_zero_padding_2d, + layer_zero_padding_3d

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/layer_zero_padding_1d.html b/website/reference/layer_zero_padding_1d.html new file mode 100644 index 000000000..e4d1e9e68 --- /dev/null +++ b/website/reference/layer_zero_padding_1d.html @@ -0,0 +1,229 @@ + + + + + + + + +Zero-padding layer for 1D input (e.g. temporal sequence). — layer_zero_padding_1d • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Zero-padding layer for 1D input (e.g. temporal sequence).

    + + +
    layer_zero_padding_1d(object, padding = 1L, batch_size = NULL,
    +  name = NULL, trainable = NULL, weights = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + +
    object

    Model or layer object

    padding

    int, or list of int (length 2)

      +
    • If int: How many zeros to add at the beginning and end of the padding dimension (axis 1).

    • +
    • If list of int (length 2): How many zeros to add at the beginning and at the end of the padding dimension ((left_pad, right_pad)).

    • +
    batch_size

    Fixed batch size for layer

    name

    An optional name string for the layer. Should be unique in a +model (do not reuse the same name twice). It will be autogenerated if it +isn't provided.

    trainable

    Whether the layer weights will be updated during training.

    weights

    Initial weights for layer.

    + +

    Input shape

    + + +

    3D tensor with shape (batch, axis_to_pad, features)

    + +

    Output shape

    + + +

    3D tensor with shape (batch, padded_axis, features)

    + +

    See also

    + +

    Other convolutional layers: layer_conv_1d, + layer_conv_2d_transpose, + layer_conv_2d, + layer_conv_3d_transpose, + layer_conv_3d, + layer_conv_lstm_2d, + layer_cropping_1d, + layer_cropping_2d, + layer_cropping_3d, + layer_separable_conv_2d, + layer_upsampling_1d, + layer_upsampling_2d, + layer_upsampling_3d, + layer_zero_padding_2d, + layer_zero_padding_3d

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/layer_zero_padding_2d.html b/website/reference/layer_zero_padding_2d.html new file mode 100644 index 000000000..1ba4d9e54 --- /dev/null +++ b/website/reference/layer_zero_padding_2d.html @@ -0,0 +1,244 @@ + + + + + + + + +Zero-padding layer for 2D input (e.g. picture). — layer_zero_padding_2d • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    This layer can add rows and columns of zeros at the top, bottom, left and +right side of an image tensor.

    + + +
    layer_zero_padding_2d(object, padding = c(1L, 1L), data_format = NULL,
    +  batch_size = NULL, name = NULL, trainable = NULL, weights = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    object

    Model or layer object

    padding

    int, or list of 2 ints, or list of 2 lists of 2 ints.

      +
    • If int: the same symmetric padding is applied to width and height.

    • +
    • If list of 2 ints: interpreted as two different symmetric padding values for height +and width: (symmetric_height_pad, symmetric_width_pad).

    • +
    • If list of 2 lists of 2 ints: interpreted as ((top_pad, bottom_pad), (left_pad, right_pad))

    • +
    data_format

    A string, one of channels_last (default) or +channels_first. The ordering of the dimensions in the inputs. +channels_last corresponds to inputs with shape (batch, height, width, channels) while channels_first corresponds to inputs with shape (batch, channels, height, width). It defaults to the image_data_format value +found in your Keras config file at ~/.keras/keras.json. If you never set +it, then it will be "channels_last".

    batch_size

    Fixed batch size for layer

    name

    An optional name string for the layer. Should be unique in a +model (do not reuse the same name twice). It will be autogenerated if it +isn't provided.

    trainable

    Whether the layer weights will be updated during training.

    weights

    Initial weights for layer.

    + +

    Input shape

    + +

    4D tensor with shape:

      +
    • If data_format is "channels_last": (batch, rows, cols, channels)

    • +
    • If data_format is "channels_first": (batch, channels, rows, cols)

    • +
    + +

    Output shape

    + +

    4D tensor with shape:

      +
    • If data_format is "channels_last": (batch, padded_rows, padded_cols, channels)

    • +
    • If data_format is "channels_first": (batch, channels, padded_rows, padded_cols)

    • +
    + +

    See also

    + +

    Other convolutional layers: layer_conv_1d, + layer_conv_2d_transpose, + layer_conv_2d, + layer_conv_3d_transpose, + layer_conv_3d, + layer_conv_lstm_2d, + layer_cropping_1d, + layer_cropping_2d, + layer_cropping_3d, + layer_separable_conv_2d, + layer_upsampling_1d, + layer_upsampling_2d, + layer_upsampling_3d, + layer_zero_padding_1d, + layer_zero_padding_3d

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/layer_zero_padding_3d.html b/website/reference/layer_zero_padding_3d.html new file mode 100644 index 000000000..eb073c501 --- /dev/null +++ b/website/reference/layer_zero_padding_3d.html @@ -0,0 +1,244 @@ + + + + + + + + +Zero-padding layer for 3D data (spatial or spatio-temporal). — layer_zero_padding_3d • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Zero-padding layer for 3D data (spatial or spatio-temporal).

    + + +
    layer_zero_padding_3d(object, padding = c(1L, 1L, 1L), data_format = NULL,
    +  batch_size = NULL, name = NULL, trainable = NULL, weights = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    object

    Model or layer object

    padding

    int, or list of 3 ints, or list of 3 lists of 2 ints.

      +
• If int: the same symmetric padding is applied to all three spatial dimensions.

    • +
    • If list of 3 ints: interpreted as three different symmetric padding values: +(symmetric_dim1_pad, symmetric_dim2_pad, symmetric_dim3_pad).

    • +
    • If list of 3 lists of 2 ints: interpreted as ((left_dim1_pad, right_dim1_pad), (left_dim2_pad, right_dim2_pad), (left_dim3_pad, right_dim3_pad))

    • +
    data_format

    A string, one of channels_last (default) or +channels_first. The ordering of the dimensions in the inputs. +channels_last corresponds to inputs with shape (batch, spatial_dim1, spatial_dim2, spatial_dim3, channels) while channels_first corresponds +to inputs with shape (batch, channels, spatial_dim1, spatial_dim2, spatial_dim3). It defaults to the image_data_format value found in your +Keras config file at ~/.keras/keras.json. If you never set it, then it +will be "channels_last".

    batch_size

    Fixed batch size for layer

    name

    An optional name string for the layer. Should be unique in a +model (do not reuse the same name twice). It will be autogenerated if it +isn't provided.

    trainable

    Whether the layer weights will be updated during training.

    weights

    Initial weights for layer.

    + +

    Input shape

    + +

    5D tensor with shape:

      +
    • If data_format is "channels_last": (batch, first_axis_to_pad, second_axis_to_pad, third_axis_to_pad, depth)

    • +
    • If data_format is "channels_first": (batch, depth, first_axis_to_pad, second_axis_to_pad, third_axis_to_pad)

    • +
    + +

    Output shape

    + +

    5D tensor with shape:

      +
    • If data_format is "channels_last": (batch, first_padded_axis, second_padded_axis, third_axis_to_pad, depth)

    • +
    • If data_format is "channels_first": (batch, depth, first_padded_axis, second_padded_axis, third_axis_to_pad)

    • +
    + +

    See also

    + +

    Other convolutional layers: layer_conv_1d, + layer_conv_2d_transpose, + layer_conv_2d, + layer_conv_3d_transpose, + layer_conv_3d, + layer_conv_lstm_2d, + layer_cropping_1d, + layer_cropping_2d, + layer_cropping_3d, + layer_separable_conv_2d, + layer_upsampling_1d, + layer_upsampling_2d, + layer_upsampling_3d, + layer_zero_padding_1d, + layer_zero_padding_2d

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/loss_mean_squared_error.html b/website/reference/loss_mean_squared_error.html new file mode 100644 index 000000000..27c8e7e4c --- /dev/null +++ b/website/reference/loss_mean_squared_error.html @@ -0,0 +1,235 @@ + + + + + + + + +Model loss functions — loss_mean_squared_error • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Model loss functions

    + + +
    loss_mean_squared_error(y_true, y_pred)
    +
    +loss_mean_absolute_error(y_true, y_pred)
    +
    +loss_mean_absolute_percentage_error(y_true, y_pred)
    +
    +loss_mean_squared_logarithmic_error(y_true, y_pred)
    +
    +loss_squared_hinge(y_true, y_pred)
    +
    +loss_hinge(y_true, y_pred)
    +
    +loss_categorical_hinge(y_true, y_pred)
    +
    +loss_logcosh(y_true, y_pred)
    +
    +loss_categorical_crossentropy(y_true, y_pred)
    +
    +loss_sparse_categorical_crossentropy(y_true, y_pred)
    +
    +loss_binary_crossentropy(y_true, y_pred)
    +
    +loss_kullback_leibler_divergence(y_true, y_pred)
    +
    +loss_poisson(y_true, y_pred)
    +
    +loss_cosine_proximity(y_true, y_pred)
    + +

    Arguments

    + + + + + + + + + + +
    y_true

    True labels (Tensor)

    y_pred

    Predictions (Tensor of the same shape as y_true)

    + +

    Details

    + +

    Loss functions are to be supplied in the loss parameter of the +compile() function.

    +

Loss functions can be specified either using the name of a built-in loss function (e.g. loss = 'binary_crossentropy'), a reference to a built-in loss function (e.g. loss = loss_binary_crossentropy), or by passing an arbitrary function that returns a scalar for each data point and takes the following two arguments:

      +
    • y_true True labels (Tensor)

    • +
    • y_pred Predictions (Tensor of the same shape as y_true)

    • +
    +

    The actual optimized objective is the mean of the output array across all +datapoints.
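To make the two styles concrete, a minimal sketch (assuming an existing model object; the optimizer and loss choices are arbitrary):

library(keras)

# by name
model %>% compile(
  optimizer = optimizer_rmsprop(),
  loss = 'binary_crossentropy'
)

# by function reference
model %>% compile(
  optimizer = optimizer_rmsprop(),
  loss = loss_binary_crossentropy
)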

    + +

    Categorical Crossentropy

    + + +

When using the categorical_crossentropy loss, your targets should be in categorical format (e.g. if you have 10 classes, the target for each sample should be a 10-dimensional vector that is all zeros except for a 1 at the index corresponding to the class of the sample). To convert integer targets into categorical targets, you can use the Keras utility function to_categorical():

categorical_labels <- to_categorical(int_labels, num_classes = NULL)

    + +

    See also

    + +

    compile()

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/make_sampling_table.html b/website/reference/make_sampling_table.html new file mode 100644 index 000000000..280d619ae --- /dev/null +++ b/website/reference/make_sampling_table.html @@ -0,0 +1,204 @@ + + + + + + + + +Generates a word rank-based probabilistic sampling table. — make_sampling_table • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

This generates an array where the ith element is the probability that a word of rank i would be sampled, according to the sampling distribution used in word2vec. The word2vec formula is:

p(word) = min(1, sqrt(word.frequency / sampling_factor) / (word.frequency / sampling_factor))

We assume that the word frequencies follow Zipf's law (s = 1) to derive a numerical approximation of frequency(rank):

frequency(rank) ~ 1 / (rank * (log(rank) + gamma) + 1/2 - 1/(12 * rank))

where gamma is the Euler-Mascheroni constant.

    + + +
    make_sampling_table(size, sampling_factor = 1e-05)
    + +

    Arguments

    + + + + + + + + + + +
    size

    int, number of possible words to sample.

    sampling_factor

    the sampling factor in the word2vec formula.

    + +

    Value

    + +

    An array of length size where the ith entry is the +probability that a word of rank i should be sampled.
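For illustration, a sketch pairing the table with skipgrams() (the vocabulary size and the toy index sequence are arbitrary, and the exact skipgrams() arguments used here are an assumption):

library(keras)

vocab_size <- 10000
sampling_table <- make_sampling_table(size = vocab_size)

# toy sequence of word indices
sequence <- sample(1:vocab_size, 1000, replace = TRUE)

pairs <- skipgrams(sequence, vocabulary_size = vocab_size,
                   sampling_table = sampling_table)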

    + +

    Note

    + +

    The word2vec formula is: p(word) = min(1, +sqrt(word.frequency/sampling_factor) / (word.frequency/sampling_factor))

    + +

    See also

    + +

    Other text preprocessing: pad_sequences, + skipgrams, + text_hashing_trick, + text_one_hot, + text_to_word_sequence

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/metric_binary_accuracy.html b/website/reference/metric_binary_accuracy.html new file mode 100644 index 000000000..13c70d5b7 --- /dev/null +++ b/website/reference/metric_binary_accuracy.html @@ -0,0 +1,238 @@ + + + + + + + + +Model performance metrics — metric_binary_accuracy • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Model performance metrics

    + + +
    metric_binary_accuracy(y_true, y_pred)
    +
    +metric_binary_crossentropy(y_true, y_pred)
    +
    +metric_categorical_accuracy(y_true, y_pred)
    +
    +metric_categorical_crossentropy(y_true, y_pred)
    +
    +metric_cosine_proximity(y_true, y_pred)
    +
    +metric_hinge(y_true, y_pred)
    +
    +metric_kullback_leibler_divergence(y_true, y_pred)
    +
    +metric_mean_absolute_error(y_true, y_pred)
    +
    +metric_mean_absolute_percentage_error(y_true, y_pred)
    +
    +metric_mean_squared_error(y_true, y_pred)
    +
    +metric_mean_squared_logarithmic_error(y_true, y_pred)
    +
    +metric_poisson(y_true, y_pred)
    +
    +metric_sparse_categorical_crossentropy(y_true, y_pred)
    +
    +metric_squared_hinge(y_true, y_pred)
    +
    +metric_top_k_categorical_accuracy(y_true, y_pred, k = 5)
    +
    +metric_sparse_top_k_categorical_accuracy(y_true, y_pred, k = 5)
    + +

    Arguments

    + + + + + + + + + + + + + + +
    y_true

    True labels (tensor)

    y_pred

    Predictions (tensor of the same shape as y_true).

    k

    An integer, number of top elements to consider.

    + +

    Note

    + +

    Metric functions are to be supplied in the metrics parameter of the +compile() function.

    + +

    Custom Metrics

    + + +

    You can provide an arbitrary R function as a custom metric. Note that +the y_true and y_pred parameters are tensors, so computations on +them should use backend tensor functions. For example:

    # create metric using backend tensor functions
    +K <- backend()
    +metric_mean_pred <- function(y_true, y_pred) {
    +  K$mean(y_pred) 
    +}
+model %>% compile(
    +  optimizer = optimizer_rmsprop(),
    +  loss = loss_binary_crossentropy,
    +  metrics = c('accuracy', 
    +              'mean_pred' = metric_mean_pred)
    +)
    +
    +

    Note that a name ('mean_pred') is provided for the custom metric +function. This name is used within training progress output.

    +

    Documentation on the available backend tensor functions can be +found at https://rstudio.github.io/keras/articles/backend.html#backend-functions.

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/model_to_json.html b/website/reference/model_to_json.html new file mode 100644 index 000000000..3a817bdc7 --- /dev/null +++ b/website/reference/model_to_json.html @@ -0,0 +1,191 @@ + + + + + + + + +Model configuration as JSON — model_to_json • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

Save and re-load model configurations as JSON. Note that the representation does not include the weights, only the architecture.

    + + +
    model_to_json(object)
    +
    +model_from_json(json, custom_objects = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + +
    object

    Model object to save

    json

    JSON with model configuration

    custom_objects

    Optional named list mapping names to custom classes or +functions to be considered during deserialization.

    + +

    See also

    + +

    Other model persistence: get_weights, + model_to_yaml, + save_model_hdf5, + save_model_weights_hdf5, + serialize_model
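A minimal round-trip sketch (assuming an existing model object):

library(keras)

json <- model_to_json(model)       # architecture only, no weights
model2 <- model_from_json(json)    # rebuild the same architecture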

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/model_to_yaml.html b/website/reference/model_to_yaml.html new file mode 100644 index 000000000..c61e2d4a0 --- /dev/null +++ b/website/reference/model_to_yaml.html @@ -0,0 +1,191 @@ + + + + + + + + +Model configuration as YAML — model_to_yaml • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

Save and re-load model configurations as YAML. Note that the representation does not include the weights, only the architecture.

    + + +
    model_to_yaml(object)
    +
    +model_from_yaml(yaml, custom_objects = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + +
    object

    Model object to save

    yaml

    YAML with model configuration

    custom_objects

    Optional named list mapping names to custom classes or +functions to be considered during deserialization.

    + +

    See also

    + +

    Other model persistence: get_weights, + model_to_json, + save_model_hdf5, + save_model_weights_hdf5, + serialize_model

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/normalize.html b/website/reference/normalize.html new file mode 100644 index 000000000..7d1235699 --- /dev/null +++ b/website/reference/normalize.html @@ -0,0 +1,183 @@ + + + + + + + + +Normalize a matrix or nd-array — normalize • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Normalize a matrix or nd-array

    + + +
    normalize(x, axis = -1, order = 2)
    + +

    Arguments

    + + + + + + + + + + + + + + +
    x

    Matrix or array to normalize

    axis

    Axis along which to normalize

    order

    Normalization order (e.g. 2 for L2 norm)

    + +

    Value

    + +

    A normalized copy of the array.
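For example, L2-normalizing the rows of a matrix (the values are arbitrary):

library(keras)

x <- matrix(1:6, nrow = 2)
x_norm <- normalize(x, axis = -1, order = 2)  # normalize along the last axis
rowSums(x_norm ^ 2)                           # each row now has unit L2 norm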

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/optimizer_adadelta.html b/website/reference/optimizer_adadelta.html new file mode 100644 index 000000000..26c3ab33b --- /dev/null +++ b/website/reference/optimizer_adadelta.html @@ -0,0 +1,210 @@ + + + + + + + + +Adadelta optimizer. — optimizer_adadelta • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

Adadelta optimizer as described in ADADELTA: An Adaptive Learning Rate Method.

    + + +
    optimizer_adadelta(lr = 1, rho = 0.95, epsilon = 1e-08, decay = 0,
    +  clipnorm = NULL, clipvalue = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + +
    lr

    float >= 0. Learning rate.

    rho

    float >= 0. Decay factor.

    epsilon

    float >= 0. Fuzz factor.

    decay

    float >= 0. Learning rate decay over each update.

    clipnorm

    Gradients will be clipped when their L2 norm exceeds this +value.

    clipvalue

    Gradients will be clipped when their absolute value exceeds +this value.

    + +

    Note

    + +

    It is recommended to leave the parameters of this optimizer at their +default values.

    + +

    See also

    + +

    Other optimizers: optimizer_adagrad, + optimizer_adamax, + optimizer_adam, + optimizer_nadam, + optimizer_rmsprop, + optimizer_sgd

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/optimizer_adagrad.html b/website/reference/optimizer_adagrad.html new file mode 100644 index 000000000..6e13e88f9 --- /dev/null +++ b/website/reference/optimizer_adagrad.html @@ -0,0 +1,206 @@ + + + + + + + + +Adagrad optimizer. — optimizer_adagrad • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

Adagrad optimizer as described in Adaptive Subgradient Methods for Online Learning and Stochastic Optimization.

    + + +
    optimizer_adagrad(lr = 0.01, epsilon = 1e-08, decay = 0,
    +  clipnorm = NULL, clipvalue = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + +
    lr

    float >= 0. Learning rate.

    epsilon

    float >= 0. Fuzz factor.

    decay

    float >= 0. Learning rate decay over each update.

    clipnorm

    Gradients will be clipped when their L2 norm exceeds this +value.

    clipvalue

    Gradients will be clipped when their absolute value exceeds +this value.

    + +

    Note

    + +

    It is recommended to leave the parameters of this optimizer at their +default values.

    + +

    See also

    + +

    Other optimizers: optimizer_adadelta, + optimizer_adamax, + optimizer_adam, + optimizer_nadam, + optimizer_rmsprop, + optimizer_sgd

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/optimizer_adam.html b/website/reference/optimizer_adam.html new file mode 100644 index 000000000..756d323a5 --- /dev/null +++ b/website/reference/optimizer_adam.html @@ -0,0 +1,215 @@ + + + + + + + + +Adam optimizer — optimizer_adam • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

Adam optimizer as described in Adam - A Method for Stochastic Optimization.

    + + +
    optimizer_adam(lr = 0.001, beta_1 = 0.9, beta_2 = 0.999,
    +  epsilon = 1e-08, decay = 0, clipnorm = NULL, clipvalue = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    lr

    float >= 0. Learning rate.

    beta_1

    The exponential decay rate for the 1st moment estimates. float, +0 < beta < 1. Generally close to 1.

    beta_2

    The exponential decay rate for the 2nd moment estimates. float, +0 < beta < 1. Generally close to 1.

    epsilon

    float >= 0. Fuzz factor.

    decay

    float >= 0. Learning rate decay over each update.

    clipnorm

    Gradients will be clipped when their L2 norm exceeds this +value.

    clipvalue

    Gradients will be clipped when their absolute value exceeds +this value.

    + +

    Note

    + +

    Default parameters follow those provided in the original paper.

    + +

    See also

    + +

    Other optimizers: optimizer_adadelta, + optimizer_adagrad, + optimizer_adamax, + optimizer_nadam, + optimizer_rmsprop, + optimizer_sgd
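A typical use is to pass the optimizer to compile(); a minimal sketch (the model, loss, and metric are illustrative):

model %>% compile(
  optimizer = optimizer_adam(lr = 0.001),
  loss = "categorical_crossentropy",
  metrics = "accuracy"
)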

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/optimizer_adamax.html b/website/reference/optimizer_adamax.html new file mode 100644 index 000000000..7576af548 --- /dev/null +++ b/website/reference/optimizer_adamax.html @@ -0,0 +1,210 @@ + + + + + + + + +Adamax optimizer — optimizer_adamax • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Adamax optimizer from Section 7 of the Adam paper. +It is a variant of Adam based on the infinity norm.

    + + +
    optimizer_adamax(lr = 0.002, beta_1 = 0.9, beta_2 = 0.999,
    +  epsilon = 1e-08, decay = 0, clipnorm = NULL, clipvalue = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    lr

    float >= 0. Learning rate.

    beta_1

    The exponential decay rate for the 1st moment estimates. float, +0 < beta < 1. Generally close to 1.

    beta_2

    The exponential decay rate for the 2nd moment estimates. float, +0 < beta < 1. Generally close to 1.

    epsilon

    float >= 0. Fuzz factor.

    decay

    float >= 0. Learning rate decay over each update.

    clipnorm

    Gradients will be clipped when their L2 norm exceeds this +value.

    clipvalue

    Gradients will be clipped when their absolute value exceeds +this value.

    + +

    See also

    + +

    Other optimizers: optimizer_adadelta, + optimizer_adagrad, + optimizer_adam, + optimizer_nadam, + optimizer_rmsprop, + optimizer_sgd

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/optimizer_nadam.html b/website/reference/optimizer_nadam.html new file mode 100644 index 000000000..dcf1e2cf9 --- /dev/null +++ b/website/reference/optimizer_nadam.html @@ -0,0 +1,220 @@ + + + + + + + + +Nesterov Adam optimizer — optimizer_nadam • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

Much like Adam is essentially RMSprop with momentum, Nadam is Adam +with Nesterov momentum. See Incorporating Nesterov Momentum into Adam.

    + + +
    optimizer_nadam(lr = 0.002, beta_1 = 0.9, beta_2 = 0.999,
    +  epsilon = 1e-08, schedule_decay = 0.004, clipnorm = NULL,
    +  clipvalue = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    lr

    float >= 0. Learning rate.

    beta_1

    The exponential decay rate for the 1st moment estimates. float, +0 < beta < 1. Generally close to 1.

    beta_2

    The exponential decay rate for the 2nd moment estimates. float, +0 < beta < 1. Generally close to 1.

    epsilon

    float >= 0. Fuzz factor.

    schedule_decay

Schedule decay.

    clipnorm

    Gradients will be clipped when their L2 norm exceeds this +value.

    clipvalue

    Gradients will be clipped when their absolute value exceeds +this value.

    + +

    Details

    + +

    Default parameters follow those provided in the paper. It is +recommended to leave the parameters of this optimizer at their default +values.

    + +

    See also

    + +

On the importance of initialization and momentum in deep learning.

    +

    Other optimizers: optimizer_adadelta, + optimizer_adagrad, + optimizer_adamax, + optimizer_adam, + optimizer_rmsprop, + optimizer_sgd

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/optimizer_rmsprop.html b/website/reference/optimizer_rmsprop.html new file mode 100644 index 000000000..83aa7f59b --- /dev/null +++ b/website/reference/optimizer_rmsprop.html @@ -0,0 +1,211 @@ + + + + + + + + +RMSProp optimizer — optimizer_rmsprop • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    RMSProp optimizer

    + + +
    optimizer_rmsprop(lr = 0.001, rho = 0.9, epsilon = 1e-08, decay = 0,
    +  clipnorm = NULL, clipvalue = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + +
    lr

    float >= 0. Learning rate.

    rho

    float >= 0. Decay factor.

    epsilon

    float >= 0. Fuzz factor.

    decay

    float >= 0. Learning rate decay over each update.

    clipnorm

    Gradients will be clipped when their L2 norm exceeds this +value.

    clipvalue

    Gradients will be clipped when their absolute value exceeds +this value.

    + +

    Note

    + +

    It is recommended to leave the parameters of this optimizer at their +default values (except the learning rate, which can be freely tuned).

    +

    This optimizer is usually a good choice for recurrent neural networks.

    + +

    See also

    + +

    Other optimizers: optimizer_adadelta, + optimizer_adagrad, + optimizer_adamax, + optimizer_adam, + optimizer_nadam, + optimizer_sgd

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/optimizer_sgd.html b/website/reference/optimizer_sgd.html new file mode 100644 index 000000000..90e13eae5 --- /dev/null +++ b/website/reference/optimizer_sgd.html @@ -0,0 +1,210 @@ + + + + + + + + +Stochastic gradient descent optimizer — optimizer_sgd • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Stochastic gradient descent optimizer with support for momentum, learning +rate decay, and Nesterov momentum.

    + + +
    optimizer_sgd(lr = 0.01, momentum = 0, decay = 0, nesterov = FALSE,
    +  clipnorm = NULL, clipvalue = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + +
    lr

    float >= 0. Learning rate.

    momentum

    float >= 0. Parameter updates momentum.

    decay

    float >= 0. Learning rate decay over each update.

    nesterov

    boolean. Whether to apply Nesterov momentum.

    clipnorm

    Gradients will be clipped when their L2 norm exceeds this +value.

    clipvalue

    Gradients will be clipped when their absolute value exceeds +this value.

    + +

    Value

    + +

    Optimizer for use with compile.

    + +

    See also

    + +

    Other optimizers: optimizer_adadelta, + optimizer_adagrad, + optimizer_adamax, + optimizer_adam, + optimizer_nadam, + optimizer_rmsprop
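For example, SGD with momentum and Nesterov updates (a sketch; the surrounding model is assumed):

model %>% compile(
  optimizer = optimizer_sgd(lr = 0.01, momentum = 0.9, nesterov = TRUE),
  loss = "mse"
)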

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/pad_sequences.html b/website/reference/pad_sequences.html new file mode 100644 index 000000000..9b07de348 --- /dev/null +++ b/website/reference/pad_sequences.html @@ -0,0 +1,214 @@ + + + + + + + + +Pads each sequence to the same length (length of the longest sequence). — pad_sequences • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Pads each sequence to the same length (length of the longest sequence).

    + + +
    pad_sequences(sequences, maxlen = NULL, dtype = "int32", padding = "pre",
    +  truncating = "pre", value = 0)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + +
    sequences

    List of lists where each element is a sequence

    maxlen

    int, maximum length

    dtype

Data type to which the resulting sequences are cast.

    padding

    'pre' or 'post', pad either before or after each sequence.

    truncating

'pre' or 'post': remove values from sequences longer than maxlen, either at the beginning or at the end of the sequence.

    value

float, value used to pad the sequences.

    + +

    Value

    + +

    Array with dimensions (number_of_sequences, maxlen)

    + +

    Details

    + +

If maxlen is provided, any sequence longer than maxlen is truncated to maxlen. +Truncation removes values from either the beginning (default) or +the end of the sequence. Supports post-padding and pre-padding (default).

    + +

    See also

    + +

    Other text preprocessing: make_sampling_table, + skipgrams, + text_hashing_trick, + text_one_hot, + text_to_word_sequence
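A small illustration of the defaults (pre-padding and pre-truncation):

seqs <- list(c(1, 2, 3), c(4, 5), c(6))
pad_sequences(seqs, maxlen = 3)                       # pads with 0 at the front
pad_sequences(seqs, maxlen = 2, truncating = "post")  # drops values from the end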

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/pipe.html b/website/reference/pipe.html new file mode 100644 index 000000000..a1be5e3cb --- /dev/null +++ b/website/reference/pipe.html @@ -0,0 +1,159 @@ + + + + + + + + +Pipe operator — %>% • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

See the magrittr documentation for the %>% operator for more details.

    + + +
    lhs %>% rhs
    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/plot.keras_training_history.html b/website/reference/plot.keras_training_history.html new file mode 100644 index 000000000..a045fbc6e --- /dev/null +++ b/website/reference/plot.keras_training_history.html @@ -0,0 +1,195 @@ + + + + + + + + +Plot training history — plot.keras_training_history • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Plots metrics recorded during training.

    + + +
    # S3 method for keras_training_history
    +plot(x, y, metrics = NULL,
    +  method = c("auto", "ggplot2", "base"), smooth = TRUE, ...)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + +
    x

    Training history object returned from fit().

    y

    Unused.

    metrics

    One or more metrics to plot (e.g. c('loss', 'accuracy')). +Defaults to plotting all captured metrics.

    method

    Method to use for plotting. The default "auto" will use +ggplot2 if available, and otherwise will use base graphics.

    smooth

Whether a loess smooth should be added to the plot, only +available for the ggplot2 method. If the number of epochs is smaller +than ten, it is forced to FALSE.

    ...

    Additional parameters to pass to the plot() method.

    + + +
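A minimal sketch, assuming hypothetical training data x_train and y_train:

history <- model %>% fit(x_train, y_train, epochs = 30, validation_split = 0.2)
plot(history)                                   # all captured metrics
plot(history, metrics = "loss", smooth = FALSE) # just the loss, unsmoothed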
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/pop_layer.html b/website/reference/pop_layer.html new file mode 100644 index 000000000..b4876502c --- /dev/null +++ b/website/reference/pop_layer.html @@ -0,0 +1,186 @@ + + + + + + + + +Remove the last layer in a model — pop_layer • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Remove the last layer in a model

    + + +
    pop_layer(object)
    + +

    Arguments

    + + + + + + +
    object

    Keras model object

    + +

    See also

    + +

    Other model functions: compile, + evaluate_generator, evaluate, + fit_generator, fit, + get_config, get_layer, + keras_model_sequential, + keras_model, + predict.keras.engine.training.Model, + predict_generator, + predict_on_batch, + predict_proba, + summary.keras.engine.training.Model, + train_on_batch

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/predict.keras.engine.training.Model.html b/website/reference/predict.keras.engine.training.Model.html new file mode 100644 index 000000000..16724225e --- /dev/null +++ b/website/reference/predict.keras.engine.training.Model.html @@ -0,0 +1,210 @@ + + + + + + + + +Generate predictions from a Keras model — predict.keras.engine.training.Model • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Generates output predictions for the input samples, processing the samples in +a batched way.

    + + +
    # S3 method for keras.engine.training.Model
    +predict(object, x, batch_size = 32,
    +  verbose = 0, ...)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + +
    object

    Keras model

    x

    Input data (vector, matrix, or array)

    batch_size

    Integer

    verbose

    Verbosity mode, 0 or 1.

    ...

    Unused

    + +

    Value

    + +

    vector, matrix, or array of predictions

    + +

    See also

    + +

    Other model functions: compile, + evaluate_generator, evaluate, + fit_generator, fit, + get_config, get_layer, + keras_model_sequential, + keras_model, pop_layer, + predict_generator, + predict_on_batch, + predict_proba, + summary.keras.engine.training.Model, + train_on_batch
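For example (x_test is a hypothetical array shaped like the model's input):

preds <- model %>% predict(x_test, batch_size = 128)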

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/predict_generator.html b/website/reference/predict_generator.html new file mode 100644 index 000000000..fb6bb4fb8 --- /dev/null +++ b/website/reference/predict_generator.html @@ -0,0 +1,217 @@ + + + + + + + + +Generates predictions for the input samples from a data generator. — predict_generator • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    The generator should return the same kind of data as accepted by +predict_on_batch().

    + + +
    predict_generator(object, generator, steps, max_queue_size = 10,
    +  verbose = 0)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + +
    object

    Keras model object

    generator

    Generator yielding batches of input samples.

    steps

    Total number of steps (batches of samples) to yield from +generator before stopping.

    max_queue_size

    Maximum size for the generator queue.

    verbose

    verbosity mode, 0 or 1.

    + +

    Value

    + +

Array(s) of predictions.

    + +

    Raises

    + +

    ValueError: In case the generator yields data in an invalid +format.

    + +

    See also

    + +

    Other model functions: compile, + evaluate_generator, evaluate, + fit_generator, fit, + get_config, get_layer, + keras_model_sequential, + keras_model, pop_layer, + predict.keras.engine.training.Model, + predict_on_batch, + predict_proba, + summary.keras.engine.training.Model, + train_on_batch

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/predict_on_batch.html b/website/reference/predict_on_batch.html new file mode 100644 index 000000000..f9dd3c670 --- /dev/null +++ b/website/reference/predict_on_batch.html @@ -0,0 +1,195 @@ + + + + + + + + +Returns predictions for a single batch of samples. — predict_on_batch • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Returns predictions for a single batch of samples.

    + + +
    predict_on_batch(object, x)
    + +

    Arguments

    + + + + + + + + + + +
    object

    Keras model object

    x

    Input data (vector, matrix, or array)

    + +

    Value

    + +

    array of predictions.

    + +

    See also

    + +

    Other model functions: compile, + evaluate_generator, evaluate, + fit_generator, fit, + get_config, get_layer, + keras_model_sequential, + keras_model, pop_layer, + predict.keras.engine.training.Model, + predict_generator, + predict_proba, + summary.keras.engine.training.Model, + train_on_batch

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/predict_proba.html b/website/reference/predict_proba.html new file mode 100644 index 000000000..970fa546d --- /dev/null +++ b/website/reference/predict_proba.html @@ -0,0 +1,205 @@ + + + + + + + + +Generates probability or class probability predictions for the input samples. — predict_proba • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Generates probability or class probability predictions for the input samples.

    + + +
    predict_proba(object, x, batch_size = 32, verbose = 0)
    +
    +predict_classes(object, x, batch_size = 32, verbose = 0)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + +
    object

    Keras model object

    x

    Input data (vector, matrix, or array)

    batch_size

    Integer

    verbose

    Verbosity mode, 0 or 1.

    + +

    Details

    + +

    The input samples are processed batch by batch.

    + +

    See also

    + +

    Other model functions: compile, + evaluate_generator, evaluate, + fit_generator, fit, + get_config, get_layer, + keras_model_sequential, + keras_model, pop_layer, + predict.keras.engine.training.Model, + predict_generator, + predict_on_batch, + summary.keras.engine.training.Model, + train_on_batch

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/reexports.html b/website/reference/reexports.html new file mode 100644 index 000000000..5cad96769 --- /dev/null +++ b/website/reference/reexports.html @@ -0,0 +1,165 @@ + + + + + + + + +Objects exported from other packages — reexports • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    These objects are imported from other packages. Follow the links +below to see their documentation.

    +
    reticulate

    use_python, use_virtualenv, use_condaenv

    + +
    tensorflow

    tensorboard

    + +
    tfruns

    flags, flag_numeric, flag_integer, flag_string, flag_boolean, run_dir

    +
    + + + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/regularizer_l1.html b/website/reference/regularizer_l1.html new file mode 100644 index 000000000..fb1eb29c5 --- /dev/null +++ b/website/reference/regularizer_l1.html @@ -0,0 +1,181 @@ + + + + + + + + +L1 and L2 regularization — regularizer_l1 • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    L1 and L2 regularization

    + + +
    regularizer_l1(l = 0.01)
    +
    +regularizer_l2(l = 0.01)
    +
    +regularizer_l1_l2(l1 = 0.01, l2 = 0.01)
    + +

    Arguments

    + + + + + + + + + + + + + + +
    l

    Regularization factor.

    l1

    L1 regularization factor.

    l2

    L2 regularization factor.

    + + +
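Regularizers are passed to layers via arguments such as kernel_regularizer; a sketch:

model %>% layer_dense(
  units = 64,
  kernel_regularizer = regularizer_l2(l = 0.001)  # penalizes large kernel weights
)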
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/reset_states.html b/website/reference/reset_states.html new file mode 100644 index 000000000..25ad9638f --- /dev/null +++ b/website/reference/reset_states.html @@ -0,0 +1,177 @@ + + + + + + + + +Reset the states for a layer — reset_states • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Reset the states for a layer

    + + +
    reset_states(object)
    + +

    Arguments

    + + + + + + +
    object

    Model or layer object

    + +

    See also

    + +

    Other layer methods: count_params, + get_config, get_input_at, + get_weights

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/save_model_hdf5.html b/website/reference/save_model_hdf5.html new file mode 100644 index 000000000..fbf2d9435 --- /dev/null +++ b/website/reference/save_model_hdf5.html @@ -0,0 +1,225 @@ + + + + + + + + +Save/Load models using HDF5 files — save_model_hdf5 • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Save/Load models using HDF5 files

    + + +
    save_model_hdf5(object, filepath, overwrite = TRUE,
    +  include_optimizer = TRUE)
    +
    +load_model_hdf5(filepath, custom_objects = NULL, compile = TRUE)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + +
    object

    Model object to save

    filepath

    File path

    overwrite

    Overwrite existing file if necessary

    include_optimizer

    If TRUE, save optimizer's state.

    custom_objects

    Mapping class names (or function names) of custom +(non-Keras) objects to class/functions

    compile

    Whether to compile the model after loading.

    + +

    Details

    + +

    The following components of the model are saved:

      +
• The model architecture, allowing you to re-instantiate the model.

    • +
    • The model weights.

    • +
• The state of the optimizer, allowing you to resume training exactly where you +left off. +This allows you to save the entire state of a model +in a single file.

    • +
    +

Saved models can be reinstantiated via load_model_hdf5(). The model returned by +load_model_hdf5() is a compiled model ready to be used (unless the saved model +was never compiled in the first place or compile = FALSE is specified).
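A minimal save/load round trip (the file path is illustrative):

save_model_hdf5(model, "my_model.h5")
model <- load_model_hdf5("my_model.h5")  # compiled and ready to use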

    + +

    Note

    + +

    The serialize_model() function enables saving Keras models to +R objects that can be persisted across R sessions.

    + +

    See also

    + +

    Other model persistence: get_weights, + model_to_json, model_to_yaml, + save_model_weights_hdf5, + serialize_model

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/save_model_weights_hdf5.html b/website/reference/save_model_weights_hdf5.html new file mode 100644 index 000000000..8c2b99409 --- /dev/null +++ b/website/reference/save_model_weights_hdf5.html @@ -0,0 +1,214 @@ + + + + + + + + +Save/Load model weights using HDF5 files — save_model_weights_hdf5 • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Save/Load model weights using HDF5 files

    + + +
    save_model_weights_hdf5(object, filepath, overwrite = TRUE)
    +
    +load_model_weights_hdf5(object, filepath, by_name = FALSE)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + +
    object

    Model object to save/load

    filepath

    Path to the file

    overwrite

    Whether to silently overwrite any existing +file at the target location

    by_name

    Whether to load weights by name or by topological order.

    + +

    Details

    + +

    The weight file has:

      +
    • layer_names (attribute), a list of strings (ordered names of model layers).

    • +
    • For every layer, a group named layer.name

    • +
    • For every such layer group, a group attribute weight_names, a list of strings +(ordered names of weights tensor of the layer).

    • +
    • For every weight in the layer, a dataset storing the weight value, named after +the weight tensor.

    • +
    +

For load_model_weights_hdf5(), if by_name is FALSE (default) weights are +loaded based on the network's topology, meaning the architecture should be +the same as when the weights were saved. Note that layers that don't have +weights are not taken into account in the topological ordering, so adding +or removing layers is fine as long as they don't have weights.

    +

    If by_name is TRUE, weights are loaded into layers only if they share +the same name. This is useful for fine-tuning or transfer-learning models +where some of the layers have changed.
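A sketch of both modes (new_model is a hypothetical model whose layer names partially match):

save_model_weights_hdf5(model, "weights.h5")
load_model_weights_hdf5(model, "weights.h5")                     # by topology
load_model_weights_hdf5(new_model, "weights.h5", by_name = TRUE) # by layer name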

    + +

    See also

    + +

    Other model persistence: get_weights, + model_to_json, model_to_yaml, + save_model_hdf5, + serialize_model

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/sequences_to_matrix.html b/website/reference/sequences_to_matrix.html new file mode 100644 index 000000000..34f529c06 --- /dev/null +++ b/website/reference/sequences_to_matrix.html @@ -0,0 +1,194 @@ + + + + + + + + +Convert a list of sequences into a matrix. — sequences_to_matrix • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Convert a list of sequences into a matrix.

    + + +
    sequences_to_matrix(tokenizer, sequences, mode = c("binary", "count", "tfidf",
    +  "freq"))
    + +

    Arguments

    + + + + + + + + + + + + + + +
    tokenizer

    Tokenizer

    sequences

    List of sequences (a sequence is a list of integer word indices).

    mode

    one of "binary", "count", "tfidf", "freq".

    + +

    Value

    + +

    A matrix

    + +

    See also

    + +

    Other text tokenization: fit_text_tokenizer, + text_tokenizer, + texts_to_matrix, + texts_to_sequences_generator, + texts_to_sequences

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/serialize_model.html b/website/reference/serialize_model.html new file mode 100644 index 000000000..ffec24bb4 --- /dev/null +++ b/website/reference/serialize_model.html @@ -0,0 +1,210 @@ + + + + + + + + +Serialize a model to an R object — serialize_model • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Model objects are external references to Keras objects which cannot be saved +and restored across R sessions. The serialize_model() and +unserialize_model() functions provide facilities to convert Keras models to +R objects for persistence within R data files.

    + + +
    serialize_model(model, include_optimizer = TRUE)
    +
    +unserialize_model(model, custom_objects = NULL, compile = TRUE)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + +
    model

    Keras model or R "raw" object containing serialized Keras model.

    include_optimizer

    If TRUE, save optimizer's state.

    custom_objects

    Mapping class names (or function names) of custom +(non-Keras) objects to class/functions

    compile

    Whether to compile the model after loading.

    + +

    Value

    + +

    serialize_model() returns an R "raw" object containing an hdf5 +version of the Keras model. unserialize_model() returns a Keras model.

    + +

    Note

    + +

    The save_model_hdf5() function enables saving Keras models to +external hdf5 files.

    + +

    See also

    + +

    Other model persistence: get_weights, + model_to_json, model_to_yaml, + save_model_hdf5, + save_model_weights_hdf5
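For example, to persist a model inside an .rds file (the path is illustrative):

saveRDS(serialize_model(model), "model.rds")
model <- unserialize_model(readRDS("model.rds"))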

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/skipgrams.html b/website/reference/skipgrams.html new file mode 100644 index 000000000..746d7c898 --- /dev/null +++ b/website/reference/skipgrams.html @@ -0,0 +1,232 @@ + + + + + + + + +Generates skipgram word pairs. — skipgrams • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

Takes a sequence (a list of word indices) and returns a list of couples (word_index, +other_word_index) and labels (1s or 0s), where label = 1 if 'other_word' +belongs to the context of 'word', and label = 0 if 'other_word' was randomly +sampled.

    + + +
    skipgrams(sequence, vocabulary_size, window_size = 4, negative_samples = 1,
    +  shuffle = TRUE, categorical = FALSE, sampling_table = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    sequence

a word sequence (sentence), encoded as a list of word indices +(integers). If using a sampling_table, word indices are expected to match +the rank of the words in a reference dataset (e.g. 10 would encode the +10-th most frequently occurring token). Note that index 0 is expected to be +a non-word and will be skipped.

    vocabulary_size

    int. maximum possible word index + 1

    window_size

    int. actually half-window. The window of a word wi will be +[i-window_size, i+window_size+1]

    negative_samples

float >= 0. 0 for no negative (i.e. random) samples, 1 +for the same number as positive samples, and so on.

    shuffle

    whether to shuffle the word couples before returning them.

    categorical

bool. If FALSE, labels will be integers (e.g. [0, 1, 1, ...]); +if TRUE, labels will be categorical, e.g. [[1,0], [0,1], [0,1], ...].

    +


    sampling_table

1D array of size vocabulary_size where entry i +encodes the probability of sampling a word of rank i.

    + +

    Value

    + +

    List of couples, labels where:

      +
    • couples is a list of 2-element integer vectors: [word_index, other_word_index].

    • +
    • labels is an integer vector of 0 and 1, where 1 indicates that other_word_index +was found in the same window as word_index, and 0 indicates that other_word_index +was random.

    • +
    • if categorical is set to TRUE, the labels are categorical, ie. 1 becomes [0,1], +and 0 becomes [1, 0].

    • +
    + + +

    See also

    + +

    Other text preprocessing: make_sampling_table, + pad_sequences, + text_hashing_trick, + text_one_hot, + text_to_word_sequence
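A toy illustration (a 5-token sequence over a vocabulary of size 6; the $couples/$labels access assumes the returned list uses those names):

res <- skipgrams(c(1, 2, 3, 4, 5), vocabulary_size = 6, window_size = 2)
res$couples  # integer pairs (word_index, other_word_index)
res$labels   # 1 = real context pair, 0 = negative sample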

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/summary.keras.engine.training.Model.html b/website/reference/summary.keras.engine.training.Model.html new file mode 100644 index 000000000..b224ceb62 --- /dev/null +++ b/website/reference/summary.keras.engine.training.Model.html @@ -0,0 +1,199 @@ + + + + + + + + +Print a summary of a Keras model — summary.keras.engine.training.Model • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Print a summary of a Keras model

    + + +
    # S3 method for keras.engine.training.Model
    +summary(object,
    +  line_length = getOption("width"), positions = NULL, ...)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + +
    object

    Keras model instance

    line_length

    Total length of printed lines

    positions

    Relative or absolute positions of log elements in each line. +If not provided, defaults to c(0.33, 0.55, 0.67, 1.0).

    ...

    Unused

    + +

    See also

    + +

    Other model functions: compile, + evaluate_generator, evaluate, + fit_generator, fit, + get_config, get_layer, + keras_model_sequential, + keras_model, pop_layer, + predict.keras.engine.training.Model, + predict_generator, + predict_on_batch, + predict_proba, train_on_batch

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/text_hashing_trick.html b/website/reference/text_hashing_trick.html new file mode 100644 index 000000000..1a1bf3a22 --- /dev/null +++ b/website/reference/text_hashing_trick.html @@ -0,0 +1,216 @@ + + + + + + + + +Converts a text to a sequence of indexes in a fixed-size hashing space. — text_hashing_trick • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Converts a text to a sequence of indexes in a fixed-size hashing space.

    + + +
    text_hashing_trick(text, n, hash_function = NULL,
    +  filters = "!\"#$%&()*+,-./:;<=>?@[\\]^_`{|}~\t\n", lower = TRUE,
    +  split = " ")
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + +
    text

    Input text (string).

    n

    Dimension of the hashing space.

    hash_function

if NULL, uses the Python hash function; can be 'md5' or +any function that takes a string as input and returns an integer. Note that +hash is not a stable hashing function, so it is not consistent across +different runs, while 'md5' is a stable hashing function.

    filters

    Sequence of characters to filter out.

    lower

    Whether to convert the input to lowercase.

    split

    Sentence split marker (string).

    + +

    Value

    + +

A list of integer word indices (uniqueness not guaranteed).

    + +

    Details

    + +

    Two or more words may be assigned to the same index, due to possible +collisions by the hashing function.

    + +

    See also

    + +

    Other text preprocessing: make_sampling_table, + pad_sequences, skipgrams, + text_one_hot, + text_to_word_sequence
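For example, hashing into a space of size 10 with the stable 'md5' function:

text_hashing_trick("The quick brown fox", n = 10, hash_function = "md5")
# returns integer indices in [1, 10]; distinct words may collide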

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/text_one_hot.html b/website/reference/text_one_hot.html new file mode 100644 index 000000000..93a441cd8 --- /dev/null +++ b/website/reference/text_one_hot.html @@ -0,0 +1,202 @@ + + + + + + + + +One-hot encode a text into a list of word indexes in a vocabulary of size n. — text_one_hot • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    One-hot encode a text into a list of word indexes in a vocabulary of size n.

    + + +
    text_one_hot(text, n, filters = "!\"#$%&()*+,-./:;<=>?@[\\]^_`{|}~\t\n",
    +  lower = TRUE, split = " ")
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + +
    text

    Input text (string).

    n

    Size of vocabulary (integer)

    filters

    Sequence of characters to filter out.

    lower

    Whether to convert the input to lowercase.

    split

    Sentence split marker (string).

    + +

    Value

    + +

List of integers in [1, n]. Each integer encodes a word (uniqueness +not guaranteed).

    + +

    See also

    + +

    Other text preprocessing: make_sampling_table, + pad_sequences, skipgrams, + text_hashing_trick, + text_to_word_sequence
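For example:

text_one_hot("The cat sat on the mat", n = 50)
# both occurrences of "the" map to the same index in [1, 50]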

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/text_to_word_sequence.html b/website/reference/text_to_word_sequence.html new file mode 100644 index 000000000..c9df8e764 --- /dev/null +++ b/website/reference/text_to_word_sequence.html @@ -0,0 +1,198 @@ + + + + + + + + +Convert text to a sequence of words (or tokens). — text_to_word_sequence • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Convert text to a sequence of words (or tokens).

    + + +
    text_to_word_sequence(text,
    +  filters = "!\"#$%&()*+,-./:;<=>?@[\\]^_`{|}~\t\n", lower = TRUE,
    +  split = " ")
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + +
    text

    Input text (string).

    filters

    Sequence of characters to filter out.

    lower

    Whether to convert the input to lowercase.

    split

    Sentence split marker (string).

    + +

    Value

    + +

    Words (or tokens)

    + +

    See also

    + +

    Other text preprocessing: make_sampling_table, + pad_sequences, skipgrams, + text_hashing_trick, + text_one_hot

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/text_tokenizer.html b/website/reference/text_tokenizer.html new file mode 100644 index 000000000..436509120 --- /dev/null +++ b/website/reference/text_tokenizer.html @@ -0,0 +1,229 @@ + + + + + + + + +Text tokenization utility — text_tokenizer • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

Vectorize a text corpus by turning each text into either a sequence of +integers (each integer being the index of a token in a dictionary) or into a +vector where the coefficient for each token can be binary, based on word +count, or based on tf-idf.

    + + +
    text_tokenizer(num_words = NULL,
    +  filters = "!\"#$%&()*+,-./:;<=>?@[\\]^_`{|}~\t\n", lower = TRUE,
    +  split = " ", char_level = FALSE)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + +
    num_words

    the maximum number of words to keep, based on word +frequency. Only the most common num_words words will be kept.

    filters

    a string where each element is a character that will be +filtered from the texts. The default is all punctuation, plus tabs and line +breaks, minus the ' character.

    lower

    boolean. Whether to convert the texts to lowercase.

    split

    character or string to use for token splitting.

    char_level

    if TRUE, every character will be treated as a token

    + +

    Details

    + +

By default, all punctuation is removed, turning the texts into +space-separated sequences of words (words may include the ' character). +These sequences are then split into lists of tokens. They will then be +indexed or vectorized. 0 is a reserved index that won't be assigned to any +word.

    + +

    Attributes

    + + +

    The tokenizer object has the following attributes:

      +
• word_counts --- named list mapping words to the number of times they appeared +during fit. Only set after fit_text_tokenizer() is called on the tokenizer.

    • +
• word_docs --- named list mapping words to the number of documents/texts they +appeared in during fit. Only set after fit_text_tokenizer() is called on the tokenizer.

    • +
    • word_index --- named list mapping words to their rank/index (int). Only set +after fit_text_tokenizer() is called on the tokenizer.

    • +
    • document_count --- int. Number of documents (texts/sequences) the tokenizer +was trained on. Only set after fit_text_tokenizer() is called on the tokenizer.

    • +
    + +

    See also

    + +

    Other text tokenization: fit_text_tokenizer, + sequences_to_matrix, + texts_to_matrix, + texts_to_sequences_generator, + texts_to_sequences
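A typical fit-then-transform pipeline (the corpus is illustrative):

texts <- c("The cat sat on the mat", "The dog ate my homework")
tokenizer <- text_tokenizer(num_words = 100) %>%
  fit_text_tokenizer(texts)
texts_to_sequences(tokenizer, texts)               # lists of word indices
texts_to_matrix(tokenizer, texts, mode = "binary") # document-term matrix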

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/texts_to_matrix.html b/website/reference/texts_to_matrix.html new file mode 100644 index 000000000..02ebeeaa2 --- /dev/null +++ b/website/reference/texts_to_matrix.html @@ -0,0 +1,194 @@ + + + + + + + + +Convert a list of texts to a matrix. — texts_to_matrix • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Convert a list of texts to a matrix.

    + + +
    texts_to_matrix(tokenizer, texts, mode = c("binary", "count", "tfidf",
    +  "freq"))
    + +

    Arguments

    + + + + + + + + + + + + + + +
    tokenizer

    Tokenizer

    texts

    Vector/list of texts (strings).

    mode

    one of "binary", "count", "tfidf", "freq".

    + +

    Value

    + +

    A matrix

    + +

    See also

    + +

    Other text tokenization: fit_text_tokenizer, + sequences_to_matrix, + text_tokenizer, + texts_to_sequences_generator, + texts_to_sequences

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/texts_to_sequences.html b/website/reference/texts_to_sequences.html new file mode 100644 index 000000000..f78626d56 --- /dev/null +++ b/website/reference/texts_to_sequences.html @@ -0,0 +1,184 @@ + + + + + + + + +Transform each text in texts in a sequence of integers. — texts_to_sequences • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Only top "num_words" most frequent words will be taken into account. +Only words known by the tokenizer will be taken into account.

    + + +
    texts_to_sequences(tokenizer, texts)
    + +

    Arguments

    + + + + + + + + + + +
    tokenizer

    Tokenizer

    texts

    Vector/list of texts (strings).

    + +

    See also

    + +

    Other text tokenization: fit_text_tokenizer, + sequences_to_matrix, + text_tokenizer, + texts_to_matrix, + texts_to_sequences_generator

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/texts_to_sequences_generator.html b/website/reference/texts_to_sequences_generator.html new file mode 100644 index 000000000..7be73939b --- /dev/null +++ b/website/reference/texts_to_sequences_generator.html @@ -0,0 +1,190 @@ + + + + + + + + +Transforms each text in texts in a sequence of integers. — texts_to_sequences_generator • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Only top "num_words" most frequent words will be taken into account. +Only words known by the tokenizer will be taken into account.

    + + +
    texts_to_sequences_generator(tokenizer, texts)
    + +

    Arguments

    + + + + + + + + + + +
    tokenizer

    Tokenizer

    texts

    Vector/list of texts (strings).

    + +

    Value

    + +

    Generator which yields individual sequences

    + +

    See also

    + +

    Other text tokenization: fit_text_tokenizer, + sequences_to_matrix, + text_tokenizer, + texts_to_matrix, + texts_to_sequences

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/time_distributed.html b/website/reference/time_distributed.html new file mode 100644 index 000000000..60ee3f30a --- /dev/null +++ b/website/reference/time_distributed.html @@ -0,0 +1,228 @@ + + + + + + + + +Apply a layer to every temporal slice of an input. — time_distributed • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    The input should be at least 3D, and the dimension of index one will be +considered to be the temporal dimension.

    + + +
    time_distributed(object, layer, input_shape = NULL,
    +  batch_input_shape = NULL, batch_size = NULL, dtype = NULL,
    +  name = NULL, trainable = NULL, weights = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    object

    Model or layer object

    layer

    A layer instance.

    input_shape

    Dimensionality of the input (integer) not including the +samples axis. This argument is required when using this layer as the first +layer in a model.

    batch_input_shape

    Shapes, including the batch size. For instance, +batch_input_shape=c(10, 32) indicates that the expected input will be +batches of 10 32-dimensional vectors. batch_input_shape=list(NULL, 32) +indicates batches of an arbitrary number of 32-dimensional vectors.

    batch_size

    Fixed batch size for layer

    dtype

    The data type expected by the input, as a string (float32, +float64, int32...)

    name

    An optional name string for the layer. Should be unique in a +model (do not reuse the same name twice). It will be autogenerated if it +isn't provided.

    trainable

    Whether the layer weights will be updated during training.

    weights

    Initial weights for layer.

    + +

    Details

    + +

    Consider a batch of 32 samples, where each sample is a sequence of 10 vectors of 16 dimensions. The batch +input shape of the layer is then (32, 10, 16), and the input_shape, not +including the samples dimension, is (10, 16). You can then use +time_distributed to apply a layer_dense to each of the 10 timesteps, +independently.

    + +

    See also

    + +

    Other layer wrappers: bidirectional
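Continuing the example from the details section, a sketch of applying a dense layer to each of the 10 timesteps:

model <- keras_model_sequential() %>%
  time_distributed(layer_dense(units = 8), input_shape = c(10, 16))
# output shape: (batch, 10, 8)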

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/to_categorical.html b/website/reference/to_categorical.html new file mode 100644 index 000000000..40ca11346 --- /dev/null +++ b/website/reference/to_categorical.html @@ -0,0 +1,185 @@ + + + + + + + + +Converts a class vector (integers) to binary class matrix. — to_categorical • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Converts a class vector (integers) to binary class matrix.

    + + +
    to_categorical(y, num_classes = NULL)
    + +

    Arguments

    + + + + + + + + + + +
    y

    Class vector to be converted into a matrix (integers from 0 to num_classes).

    num_classes

    Total number of classes.

    + +

    Value

    + +

    A binary matrix representation of the input.

    + +

    Details

    + +

    E.g. for use with loss_categorical_crossentropy().
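For example:

to_categorical(c(0, 1, 2, 1), num_classes = 3)
# a 4 x 3 binary matrix with a single 1 per row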

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/to_numpy_array.html b/website/reference/to_numpy_array.html new file mode 100644 index 000000000..e0ccc8bb4 --- /dev/null +++ b/website/reference/to_numpy_array.html @@ -0,0 +1,188 @@ + + + + + + + + +Convert to NumPy Array — to_numpy_array • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Convert an object to a NumPy array which has the optimal in-memory layout and +floating point data type for the current Keras backend.

    + + +
    to_numpy_array(x, dtype = NULL, order = "C")
    + +

    Arguments

    + + + + + + + + + + + + + + +
    x

    Object or list of objects to convert

    dtype

    NumPy data type (e.g. float32, float64). If this is unspecified +then R doubles will be converted to the default floating point type for the +current Keras backend.

    order

    In-memory order ('C' or 'F'). Defaults to 'C', which is the +optimal order in nearly every case for Keras backends.

    + +

    Value

    + +

    NumPy array with the specified type and order (or list of NumPy +arrays if a list was passed for x).
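A minimal sketch:

to_numpy_array(matrix(1:4, nrow = 2), dtype = "float32")
# C-ordered float32 NumPy array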

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown.

    +
    + +
    +
    + + + diff --git a/website/reference/train_on_batch.html b/website/reference/train_on_batch.html new file mode 100644 index 000000000..04b05f5b1 --- /dev/null +++ b/website/reference/train_on_batch.html @@ -0,0 +1,213 @@ + + + + + + + + +Single gradient update or model evaluation over one batch of samples. — train_on_batch • keras + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + + +

    Single gradient update or model evaluation over one batch of samples.

    + + +
    train_on_batch(object, x, y, class_weight = NULL, sample_weight = NULL)
    +
    +test_on_batch(object, x, y, sample_weight = NULL)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + + + + + +
    object

    Keras model object

    x

    input data, as an array or list of arrays (if the model has multiple +inputs).

    y

    labels, as an array.

    class_weight

    named list mapping classes to a weight value, used for +scaling the loss function (during training only).

    sample_weight

    sample weights, as an array.

    + +

    Value

    + +

    Scalar training or test loss (if the model has no metrics) or list of scalars +(if the model computes other metrics). The property model$metrics_names +will give you the display labels for the scalar outputs.

    + +

    See also

    + +

    Other model functions: compile, + evaluate_generator, evaluate, + fit_generator, fit, + get_config, get_layer, + keras_model_sequential, + keras_model, pop_layer, + predict.keras.engine.training.Model, + predict_generator, + predict_on_batch, + predict_proba, + summary.keras.engine.training.Model
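A sketch with one hypothetical batch (x_batch, y_batch), assuming a compiled model:

loss <- model %>% train_on_batch(x_batch, y_batch)  # one gradient update
model %>% test_on_batch(x_batch, y_batch)           # evaluation only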

    + + +
    + +
    + + +
    + + +