This repository has been archived by the owner on Nov 17, 2023. It is now read-only.
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge remote-tracking branch 'upstream/master' into bugfix/dtype-cast
Showing 58 changed files with 2,959 additions and 207 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -17,6 +17,7 @@ | |
|
||
import os | ||
import contextlib | ||
import logging | ||
import requests | ||
|
||
def get_mxnet_root() -> str: | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,86 @@ | ||
# SVRG Optimization in Python Module API | ||
|
||
## Overview | ||
SVRG, which stands for Stochastic Variance Reduced Gradient, is an optimization technique that was first introduced in
the paper _Accelerating Stochastic Gradient Descent using Predictive Variance Reduction_ in 2013. It complements SGD
(Stochastic Gradient Descent), which is widely used for large-scale optimization but converges slowly
asymptotically because of its inherent variance. SGD approximates the full gradient using a small batch of data or
a single data sample, which introduces variance and therefore requires starting with a small learning rate in order to
ensure convergence. SVRG remedies the problem by keeping track of a version of the estimated weights that is close to the
optimal parameter values and by maintaining an average of the full gradients over a full pass of the data. The average of
the full gradients is calculated with respect to the weights saved at the last m-th epoch of training. SVRG uses a
different update rule: the gradient w.r.t. the current parameter values, minus the gradient w.r.t. the parameters saved
at the last m-th epoch, plus the average of the full gradients over all data.
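
The update rule described above can be sketched on a toy one-dimensional least-squares problem. This is a minimal pure-Python illustration under stated assumptions, not MXNet's implementation: the helper names `grad`, `full_grad` and `svrg`, the toy data, and the hyperparameter values are all made up for the example.

```python
import random

def grad(w, x, y):
    """Gradient of the per-sample loss 0.5 * (w * x - y) ** 2 with respect to w."""
    return (w * x - y) * x

def full_grad(w, data):
    """Average gradient over the whole data set."""
    return sum(grad(w, x, y) for x, y in data) / len(data)

def svrg(data, lr=0.1, num_epochs=20, update_freq=2, seed=0):
    rng = random.Random(seed)
    w = 0.0
    w_snapshot, mu = w, 0.0
    for epoch in range(num_epochs):
        if epoch % update_freq == 0:
            # Save a snapshot of the weights and compute the full gradient there.
            w_snapshot = w
            mu = full_grad(w_snapshot, data)
        # Visit every sample once per epoch, in random order.
        for x, y in rng.sample(data, len(data)):
            # Gradient at current weights - gradient at snapshot + full-gradient average.
            g = grad(w, x, y) - grad(w_snapshot, x, y) + mu
            w -= lr * g
    return w

# Toy data generated by y = 3 * x, so the optimal weight is 3.
data = [(k / 10.0, 3.0 * k / 10.0) for k in range(1, 11)]
w_final = svrg(data)
```

In this sketch `update_freq` plays the role of m: the snapshot and its full gradient are refreshed every `update_freq` epochs and reused in between.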
|
||
Key Characteristics of SVRG:
* Employs explicit variance reduction by using a different update rule than SGD.
* Allows a relatively large learning rate, which leads to faster convergence than SGD.
* Guarantees fast convergence for smooth and strongly convex functions.
|
||
SVRG optimization is implemented as `SVRGModule` in `mxnet.contrib.svrg_optimization`, which extends the
existing `mxnet.module.Module` API and encapsulates the SVRG optimization logic within several new functions. For end
users, the API changes in `SVRGModule` compared to the Module API are minimal.
|
||
In distributed training, each worker gets the same special weights from the last m-th epoch and calculates the full
gradients with respect to its own shard of data. Standard SVRG optimization requires a global full gradient,
which is calculated by aggregating the full gradients from all workers and averaging over the number of
workers. The workaround is to keep an additional set of keys in the KVStore that map to the full gradients.
The `_SVRGOptimizer` wraps two optimizers: an `_AssignmentOptimizer`, which is used to accumulate full gradients
in the KVStore, and a regular optimizer that applies the actual update rule to the parameters.
The `_SVRGOptimizer` and `_AssignmentOptimizer` are designed to be used in `SVRGModule` only.
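
The aggregation step described above can be illustrated with a plain dictionary standing in for the key-value store. This is only a sketch under stated assumptions: it does not use the MXNet KVStore API, and the function name `average_full_grads` and the `_full` key suffix are hypothetical.

```python
def average_full_grads(worker_grads):
    """Average per-worker full gradients key by key.

    `worker_grads` holds one dict per worker, mapping a parameter name to
    that worker's full gradient (a list of floats) on its own data shard.
    """
    num_workers = len(worker_grads)
    store = {}  # stands in for the extra full-gradient keys in a key-value store
    for grads in worker_grads:
        for name, g in grads.items():
            key = name + "_full"  # hypothetical naming scheme for the extra keys
            acc = store.setdefault(key, [0.0] * len(g))
            for i, value in enumerate(g):
                acc[i] += value  # accumulate the gradients pushed by each worker
    # Divide the accumulated sums by the number of workers.
    return {key: [v / num_workers for v in acc] for key, acc in store.items()}

worker_grads = [
    {"fc_weight": [1.0, 2.0]},  # full gradient computed by worker 0
    {"fc_weight": [3.0, 6.0]},  # full gradient computed by worker 1
]
global_full_grads = average_full_grads(worker_grads)
print(global_full_grads)  # {'fc_weight_full': [2.0, 4.0]}
```

Keeping the accumulated full gradients under separate keys means they never overwrite the entries used for the regular parameter updates.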
|
||
```eval_rst | ||
.. warning:: This package contains experimental APIs and may change in the near future. | ||
``` | ||
|
||
This document lists the SVRGModule APIs in the MXNet/Contrib package:
|
||
```eval_rst | ||
.. autosummary:: | ||
:nosignatures: | ||
mxnet.contrib.svrg_optimization.svrg_module | ||
``` | ||
|
||
### Intermediate Level API for SVRGModule | ||
|
||
The only extra step in using an SVRGModule compared to using a Module is to check whether the current epoch should
update the full gradients over all data. The code snippet below demonstrates the suggested usage of SVRGModule with the
intermediate level APIs.
|
||
```python
>>> from mxnet.contrib.svrg_optimization.svrg_module import SVRGModule
>>> # `model` (a Symbol), `di` (a DataIter) and `num_epochs` are assumed to be defined already.
>>> mod = SVRGModule(symbol=model, update_freq=2, data_names=['data'], label_names=['lin_reg_label'])
>>> mod.bind(data_shapes=di.provide_data, label_shapes=di.provide_label)
>>> mod.init_params()
>>> mod.init_optimizer(optimizer='sgd', optimizer_params=(('learning_rate', 0.01), ), kvstore='local')
>>> for epoch in range(num_epochs):
...     if epoch % mod.update_freq == 0:
...         mod.update_full_grads(di)
...     di.reset()
...     for batch in di:
...         mod.forward_backward(data_batch=batch)
...         mod.update()
```
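
To make the role of `update_freq` concrete, the following pure-Python sketch lists the epochs at which the `epoch % update_freq == 0` check in the loop above is true, that is, the epochs in which the full gradients are recomputed. The helper name `full_grad_epochs` is hypothetical and is not part of the SVRGModule API.

```python
def full_grad_epochs(num_epochs, update_freq):
    """Return the epochs for which `epoch % update_freq == 0` holds."""
    return [epoch for epoch in range(num_epochs) if epoch % update_freq == 0]

schedule = full_grad_epochs(num_epochs=6, update_freq=2)
print(schedule)  # [0, 2, 4]
```

With `update_freq=2`, every second epoch starts with an extra full pass over the data, so a larger value trades fresher full gradients for less computation.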
|
||
### High Level API for SVRGModule | ||
|
||
The high level API usage of SVRGModule remains exactly the same as that of the Module API. The code snippet below gives
an example of the suggested usage of the high level API.
|
||
```python | ||
>>> mod = SVRGModule(symbol=model, update_freq=2, data_names=['data'], label_names=['lin_reg_label']) | ||
>>> mod.fit(di, num_epochs=100, optimizer='sgd', optimizer_params=(('learning_rate', 0.01), )) | ||
``` | ||
|
||
## API reference | ||
|
||
<script type="text/javascript" src='../../../_static/js/auto_module_index.js'></script> | ||
|
||
```eval_rst | ||
.. automodule:: mxnet.contrib.svrg_optimization.svrg_module | ||
.. autoclass:: mxnet.contrib.svrg_optimization.svrg_module.SVRGModule | ||
:members: init_optimizer, bind, forward, backward, reshape, update, update_full_grads, fit, prepare | ||
``` | ||
<script>auto_index("api-reference");</script> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,27 @@ | ||
# Test Utilities | ||
|
||
This module has a variety of tools that help with using and testing MXNet.
|
||
```eval_rst | ||
.. currentmodule:: mxnet.test_utils | ||
``` | ||
|
||
```eval_rst | ||
.. autosummary:: | ||
:nosignatures: | ||
mxnet.test_utils | ||
``` | ||
|
||
## API Reference | ||
|
||
<script type="text/javascript" src='../../../_static/js/auto_module_index.js'></script> | ||
|
||
```eval_rst | ||
.. automodule:: mxnet.test_utils | ||
:members: | ||
``` | ||
|
||
<script>auto_index("api-reference");</script> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,27 @@ | ||
# Visualization | ||
|
||
This module contains visualization features. | ||
|
||
```eval_rst | ||
.. currentmodule:: mxnet.visualization | ||
``` | ||
|
||
```eval_rst | ||
.. autosummary:: | ||
:nosignatures: | ||
mxnet.visualization | ||
``` | ||
|
||
## API Reference | ||
|
||
<script type="text/javascript" src='../../../_static/js/auto_module_index.js'></script> | ||
|
||
```eval_rst | ||
.. automodule:: mxnet.visualization | ||
:members: | ||
``` | ||
|
||
<script>auto_index("api-reference");</script> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,44 @@ | ||
# Spectral Normalization GAN | ||
|
||
This example implements [Spectral Normalization for Generative Adversarial Networks](https://arxiv.org/abs/1802.05957) on the [CIFAR10](https://www.cs.toronto.edu/~kriz/cifar.html) dataset.
|
||
## Usage | ||
|
||
Example runs and the results: | ||
|
||
```bash
python train.py --use-gpu --data-path=data
```
|
||
* Note that the program downloads the CIFAR10 dataset for you.
|
||
`python train.py --help` gives the following arguments: | ||
|
||
```bash
optional arguments:
  -h, --help            show this help message and exit
  --data-path DATA_PATH
                        path of data.
  --batch-size BATCH_SIZE
                        training batch size. default is 64.
  --epochs EPOCHS       number of training epochs. default is 100.
  --lr LR               learning rate. default is 0.0001.
  --lr-beta LR_BETA     learning rate for the beta in margin based loss.
                        default is 0.5.
  --use-gpu             use gpu for training.
  --clip_gr CLIP_GR     Clip the gradient by projecting onto the box. default
                        is 10.0.
  --z-dim Z_DIM         dimension of the latent z vector. default is 100.
```
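
For readers who want to see how options like these are typically declared, here is a minimal `argparse` sketch that mirrors the flags and defaults listed above. It is an assumption-based reconstruction: `train.py` itself is not shown here, so the function name `build_parser`, the argument types, and the description string are hypothetical.

```python
import argparse

def build_parser():
    """Declare the command-line options listed in the help output above."""
    parser = argparse.ArgumentParser(description="hypothetical SN-GAN training options")
    parser.add_argument("--data-path", type=str, help="path of data.")
    parser.add_argument("--batch-size", type=int, default=64,
                        help="training batch size. default is 64.")
    parser.add_argument("--epochs", type=int, default=100,
                        help="number of training epochs. default is 100.")
    parser.add_argument("--lr", type=float, default=0.0001,
                        help="learning rate. default is 0.0001.")
    parser.add_argument("--lr-beta", type=float, default=0.5,
                        help="learning rate for the beta in margin based loss. default is 0.5.")
    parser.add_argument("--use-gpu", action="store_true", help="use gpu for training.")
    parser.add_argument("--clip_gr", type=float, default=10.0,
                        help="Clip the gradient by projecting onto the box. default is 10.0.")
    parser.add_argument("--z-dim", type=int, default=100,
                        help="dimension of the latent z vector. default is 100.")
    return parser

# Parse the same flags as the example run shown earlier.
args = build_parser().parse_args(["--use-gpu", "--data-path=data"])
print(args.use_gpu, args.data_path, args.batch_size)  # True data 64
```

Note that `argparse` turns dashes in option names into underscores, so `--data-path` is read back as `args.data_path`.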
## Result | ||
![SN-GAN](sn_gan_output.png) | ||
## Learned Spectral Normalization | ||
![alt text](https://github.com/taki0112/Spectral_Normalization-Tensorflow/blob/master/assests/sn.png) | ||
## Reference | ||
[Simple Tensorflow Implementation](https://github.com/taki0112/Spectral_Normalization-Tensorflow) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,42 @@ | ||
# Licensed to the Apache Software Foundation (ASF) under one | ||
# or more contributor license agreements. See the NOTICE file | ||
# distributed with this work for additional information | ||
# regarding copyright ownership. The ASF licenses this file | ||
# to you under the Apache License, Version 2.0 (the | ||
# "License"); you may not use this file except in compliance | ||
# with the License. You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, | ||
# software distributed under the License is distributed on an | ||
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
# KIND, either express or implied. See the License for the | ||
# specific language governing permissions and limitations | ||
# under the License. | ||
|
||
# This example is inspired by https://github.com/jason71995/Keras-GAN-Library, | ||
# https://github.com/kazizzad/DCGAN-Gluon-MxNet/blob/master/MxnetDCGAN.ipynb | ||
# https://github.com/apache/incubator-mxnet/blob/master/example/gluon/dcgan.py | ||
|
||
import numpy as np | ||
|
||
import mxnet as mx | ||
from mxnet import gluon | ||
from mxnet.gluon.data.vision import CIFAR10 | ||
|
||
IMAGE_SIZE = 64 | ||
|
||
def transformer(data, label):
    """ data preparation """
    # Resize the HWC image to IMAGE_SIZE x IMAGE_SIZE.
    data = mx.image.imresize(data, IMAGE_SIZE, IMAGE_SIZE)
    # Move the channel axis first (HWC -> CHW).
    data = mx.nd.transpose(data, (2, 0, 1))
    # Scale pixel values from [0, 255] to roughly [-1, 1).
    data = data.astype(np.float32) / 128.0 - 1
    return data, label


def get_training_data(batch_size):
    """ helper function to get dataloader"""
    # Shuffle the CIFAR10 training set and drop the last incomplete batch.
    return gluon.data.DataLoader(
        CIFAR10(train=True, transform=transformer),
        batch_size=batch_size, shuffle=True, last_batch='discard')