@@ -92,7 +92,7 @@ After defining the model, we need to define a few more things: our loss, our tra

The loss function is used to calculate how much the output of the network differs from the ground truth. Because the classes in logistic regression are either 0 or 1, we use [SigmoidBinaryCrossEntropyLoss](../../api/gluon/loss/index.rst#mxnet.gluon.loss.SigmoidBinaryCrossEntropyLoss). Notice that we do not specify the `from_sigmoid` attribute in the code: it defaults to `False`, meaning the loss applies the sigmoid to the raw network output internally, so at inference time we have to pass the output through a sigmoid ourselves. You can learn more about cross entropy on [Wikipedia](https://en.wikipedia.org/wiki/Cross_entropy).
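
A minimal sketch of how this fits together (the logits and labels below are made-up values, only meant to show the call signature):

```{.python .input}
from mxnet import nd
from mxnet.gluon import loss as gloss

# from_sigmoid defaults to False: the loss takes the raw network output
# (logits) and applies the sigmoid internally, which is numerically stabler.
loss_fn = gloss.SigmoidBinaryCrossEntropyLoss()

logits = nd.array([[2.0], [-1.5]])   # raw outputs of the single neuron
labels = nd.array([[1.0], [0.0]])    # ground-truth classes, 0 or 1
print(loss_fn(logits, labels))

# At inference time we apply the sigmoid ourselves to get probabilities.
probs = logits.sigmoid()
```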

-The `Trainer` object lets us specify the training method to be used. For this tutorial we use [Stochastic Gradient Descent (SGD)](../../api/optimizer/index.rst#mxnet.optimizer.SGD); for more information on SGD, refer to [the following tutorial](https://gluon.mxnet.io/chapter06_optimization/gd-sgd-scratch.html). We also need to parametrize it with a learning rate, which scales the size of the weight updates, and weight decay, which is used for regularization.
+The `Trainer` object lets us specify the training method to be used. For this tutorial we use [Stochastic Gradient Descent (SGD)](../../api/optimizer/index.rst#mxnet.optimizer.SGD); for more information on SGD, refer to [the following tutorial](https://d2l.ai/chapter_optimization/sgd.html). We also need to parametrize it with a learning rate, which scales the size of the weight updates, and weight decay, which is used for regularization.
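
For example, a `Trainer` for a one-neuron network might be set up like this (the learning rate and weight decay values here are illustrative, not the tutorial's):

```{.python .input}
from mxnet import gluon, init
from mxnet.gluon import nn

net = nn.Dense(1)              # single neuron, as in the model above
net.initialize(init.Xavier())

# 'sgd' selects the optimizer; learning_rate scales each weight update and
# wd (weight decay) adds L2 regularization.
trainer = gluon.Trainer(net.collect_params(), 'sgd',
                        {'learning_rate': 0.1, 'wd': 0.001})
```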

A metric helps us estimate how well the model solves the problem we are trying to address. Where the loss function matters most to the training process itself, a metric is usually the quantity we actually want to improve and push toward its maximum value. We can also use more than one metric to measure different aspects of the model. In our example, we use [Accuracy](../../api/gluon/metric/index.rst#mxnet.gluon.metric.Accuracy) and the [F1 score](../../api/gluon/metric/index.rst#mxnet.gluon.metric.F1) to measure the success of our model.
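
A small sketch of updating both metrics for a binary classifier (the predictions below are made-up sigmoid outputs):

```{.python .input}
from mxnet import nd
from mxnet.gluon import metric

accuracy = metric.Accuracy()
f1 = metric.F1()

probs = nd.array([0.9, 0.2, 0.6, 0.4])  # sigmoid outputs for 4 examples
labels = nd.array([1, 0, 1, 1])

# Accuracy compares hard 0/1 predictions against the labels.
accuracy.update([labels], [probs > 0.5])
# F1 expects per-class probabilities, so we stack P(class=0) and P(class=1).
f1.update([labels], [nd.stack(1 - probs, probs, axis=1)])
print(accuracy.get(), f1.get())
```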

@@ -19,7 +19,7 @@
# Image similarity search with InfoGAN

This notebook shows how to implement an InfoGAN based on Gluon. InfoGAN is an extension of GANs in which the generator input is split into two parts: random noise and a latent code (see the [InfoGAN paper](https://arxiv.org/pdf/1606.03657.pdf)).
-The codes are made meaningful by maximizing the mutual information between the code and the generator output. InfoGAN learns a disentangled representation in a completely unsupervised manner, and it can be used for many applications such as image similarity search. This notebook uses the DCGAN example from the [Straight Dope Book](https://gluon.mxnet.io/chapter14_generative-adversarial-networks/dcgan.html) and extends it to create an InfoGAN.
+The codes are made meaningful by maximizing the mutual information between the code and the generator output. InfoGAN learns a disentangled representation in a completely unsupervised manner, and it can be used for many applications such as image similarity search. This notebook uses the DCGAN example and extends it to create an InfoGAN.
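
As a small sketch of the split generator input (the dimensions below are assumptions for illustration, not the notebook's settings):

```{.python .input}
from mxnet import nd

batch_size, n_noise, n_categories = 4, 62, 10
# Random noise part of the generator input.
z = nd.random.normal(0, 1, shape=(batch_size, n_noise, 1, 1))
# Latent code part, here a one-hot categorical code.
idx = nd.array([0, 3, 7, 9])
c = nd.one_hot(idx, n_categories).reshape((batch_size, n_categories, 1, 1))
# The generator consumes the concatenation of noise and code.
g_input = nd.concat(z, c, dim=1)  # shape (4, 72, 1, 1)
```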


@@ -112,7 +112,7 @@ train_dataloader = gluon.data.DataLoader(train_data, batch_size=batch_size, shuf

## Generator
-Define the Generator model. The architecture is taken from the DCGAN implementation in the [Straight Dope Book](https://gluon.mxnet.io/chapter14_generative-adversarial-networks/dcgan.html). The Generator consists of four layers, where each layer involves a strided convolution, batch normalization, and a rectified nonlinearity. It takes random noise and the latent code as input and produces a `(64,64,3)` output image.
+Define the Generator model. The Generator consists of four layers, where each layer involves a strided convolution, batch normalization, and a rectified nonlinearity. It takes random noise and the latent code as input and produces a `(64,64,3)` output image.
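
A minimal sketch of such a generator in Gluon might look like this (the channel sizes are typical DCGAN values and an assumption here, not necessarily the notebook's exact settings):

```{.python .input}
from mxnet.gluon import nn

netG = nn.HybridSequential()
# 1x1 latent input (noise + code) -> 4x4 feature map
netG.add(nn.Conv2DTranspose(512, 4, strides=1, padding=0, use_bias=False),
         nn.BatchNorm(), nn.Activation('relu'))
# 4x4 -> 8x8 -> 16x16 -> 32x32
for channels in (256, 128, 64):
    netG.add(nn.Conv2DTranspose(channels, 4, strides=2, padding=1, use_bias=False),
             nn.BatchNorm(), nn.Activation('relu'))
# 32x32 -> 64x64 with 3 channels; tanh keeps pixel values in [-1, 1]
netG.add(nn.Conv2DTranspose(3, 4, strides=2, padding=1, use_bias=False),
         nn.Activation('tanh'))
```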


2 changes: 1 addition & 1 deletion python/mxnet/gluon/trainer.py
@@ -48,7 +48,7 @@ class Trainer(object):
The set of parameters to optimize.
optimizer : str or Optimizer
The optimizer to use. See
-`help <https://mxnet.apache.org/api/python/docs/api/optimizer/index.html#mxnet.optimizer.Optimizer>`_
+`help <https://mxnet.apache.org/api/python/docs/api/optimizer/index.html>`_
on Optimizer for a list of available optimizers.
optimizer_params : dict
Keyword arguments to be passed to the optimizer constructor. For example,
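
For reference, the `optimizer` argument can be given either as a string configured through `optimizer_params`, or directly as an `Optimizer` instance; a small usage sketch (the network and values are illustrative):

```python
from mxnet import gluon, init, optimizer
from mxnet.gluon import nn

net = nn.Dense(2)
net.initialize(init.Xavier())

# Optimizer given as a string, configured through optimizer_params.
trainer = gluon.Trainer(net.collect_params(), 'sgd',
                        optimizer_params={'learning_rate': 0.1})

# Equivalently, an Optimizer instance can be passed directly
# (in which case optimizer_params should not be set).
trainer = gluon.Trainer(net.collect_params(), optimizer.SGD(learning_rate=0.1))
```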