This repository has been archived by the owner on Nov 17, 2023. It is now read-only.
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
SVRG optimization in python/contrib package, this version supports si…
…ngle machine single cpu, single gpu and multi-gpus
1 parent
6fdfd89
commit 4a2b644
Showing 23 changed files with 792 additions and 579 deletions.
contrib/svrg_optimization_python/tests/test_svrg_module.py (116 changes: 0 additions & 116 deletions). This file was deleted.
contrib/svrg_optimization_python/tests/test_svrg_optimizer.py (96 changes: 0 additions & 96 deletions). This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,80 @@ | ||
# SVRG Optimization in Python Module API | ||
|
||
## Overview | ||
SVRG, which stands for Stochastic Variance Reduced Gradient, is an optimization technique that complements SGD. It | ||
employs explicit variance reduction and, for smooth and strongly convex functions, converges much faster than SGD. | ||
|
||
SVRG optimization is implemented as `SVRGModule` in `mxnet.contrib.svrg_optimization`. It extends the | ||
existing `mxnet.module.Module` API and encapsulates the SVRG optimization logic in several new functions. For end users, | ||
the `SVRGModule` API differs only minimally from the `Module` API. | ||
|
||
The current `SVRGModule` implements the standard SVRG optimization technique described in _Accelerating Stochastic | ||
Gradient Descent using Predictive Variance Reduction_: it computes the gradients over all data | ||
every `update_freq` epochs during training. The `SVRGModule` update rule is: gradients w.r.t. the current parameters minus gradients w.r.t. the parameters | ||
from the most recent full-gradient epoch (taken every `update_freq` epochs), plus the average of the gradients over all data. | ||
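
The update rule described above can be illustrated with a minimal pure-Python sketch on a one-parameter least-squares problem. This is not the `SVRGModule` implementation: the helper names (`grad_sample`, `full_grad`, `svrg_gradient`, `train_svrg`), the toy data, and the learning rate are all assumptions made for illustration. Only the rule itself and the `update_freq` schedule come from the text.

```python
def grad_sample(w, x, y):
    # Gradient of 0.5 * (w * x - y) ** 2 with respect to w for one sample.
    return (w * x - y) * x


def full_grad(w, data):
    # Average gradient over all data, evaluated at the given parameter.
    return sum(grad_sample(w, x, y) for x, y in data) / len(data)


def svrg_gradient(g_current, g_snapshot, mu):
    # SVRG rule: gradient at the current parameters, minus the gradient at
    # the snapshot parameters, plus the average gradient over all data.
    return g_current - g_snapshot + mu


def train_svrg(data, lr=0.05, num_epochs=20, update_freq=2):
    w = 0.0
    w_snapshot, mu = w, 0.0
    for epoch in range(num_epochs):
        if epoch % update_freq == 0:
            # Refresh the snapshot and its full gradient every update_freq epochs.
            w_snapshot = w
            mu = full_grad(w_snapshot, data)
        for x, y in data:
            g = svrg_gradient(grad_sample(w, x, y),
                              grad_sample(w_snapshot, x, y), mu)
            w -= lr * g
    return w


# Toy data generated from y = 3 * x (hypothetical example values).
data = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]
w = train_svrg(data)
```

On this toy problem `w` approaches the true slope of 3. Samples are visited in a fixed order here to keep the sketch deterministic; a stochastic implementation would normally shuffle or sample them.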
|
||
`SVRGOptimizer` wraps two optimizers: an `AssignmentOptimizer`, which is used to accumulate full gradients in the KVStore, and | ||
a regular optimizer, which is specified as a parameter to `mod.init_optimizer()`. | ||
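
The idea of one optimizer wrapping two others can be sketched in plain Python. Everything in this block is hypothetical: the class names (`AssignmentRule`, `SgdRule`, `TwoOptimizerWrapper`), the key names, and the way updates are routed by key are assumptions for illustration and are not taken from the `SVRGOptimizer` source. The sketch only shows an "assignment" rule that stores the incoming gradient as-is sitting next to a regular update rule.

```python
class AssignmentRule:
    """Hypothetical stand-in: overwrites the stored value with the gradient."""

    def update(self, index, weight, grad):
        weight[:] = grad


class SgdRule:
    """Plain SGD step, standing in for the regular optimizer."""

    def __init__(self, learning_rate):
        self.learning_rate = learning_rate

    def update(self, index, weight, grad):
        for i, g in enumerate(grad):
            weight[i] -= self.learning_rate * g


class TwoOptimizerWrapper:
    """Routes each update to the assignment rule or to the regular rule."""

    def __init__(self, default_rule, full_grad_keys):
        self.default_rule = default_rule
        self.assign_rule = AssignmentRule()
        # Keys whose stored values hold accumulated full gradients (assumed).
        self.full_grad_keys = set(full_grad_keys)

    def update(self, index, weight, grad):
        if index in self.full_grad_keys:
            self.assign_rule.update(index, weight, grad)
        else:
            self.default_rule.update(index, weight, grad)


# A dict of lists stands in for the KVStore in this sketch.
store = {"fc_weight": [1.0, 2.0], "fc_weight_full": [0.0, 0.0]}
opt = TwoOptimizerWrapper(SgdRule(0.5), full_grad_keys=["fc_weight_full"])
opt.update("fc_weight_full", store["fc_weight_full"], [0.25, -0.5])
opt.update("fc_weight", store["fc_weight"], [2.0, 2.0])
```

After these two calls, the full-gradient slot holds the gradient it was given, while the regular weights have taken one SGD step.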
|
||
```eval_rst | ||
.. warning:: This package contains experimental APIs and may change in the near future. | ||
``` | ||
|
||
This document lists the svrg_optimization APIs in mxnet: | ||
|
||
```eval_rst | ||
.. autosummary:: | ||
:nosignatures: | ||
mxnet.contrib.svrg_optimization.SVRGModule | ||
mxnet.contrib.svrg_optimization.SVRGOptimizer | ||
``` | ||
|
||
### Intermediate Level API for SVRGModule | ||
|
||
The only extra step in using an `SVRGModule` compared to using a `Module` is checking whether the current epoch should update the | ||
full gradients over all data. The code snippet below demonstrates the suggested usage of `SVRGModule` with the intermediate | ||
level APIs. | ||
|
||
```python | ||
>>> mod = SVRGModule(symbol=model, update_freq=2, data_names=['data'], label_names=['lin_reg_label']) | ||
>>> mod.bind(data_shapes=di.provide_data, label_shapes=di.provide_label) | ||
>>> mod.init_params() | ||
>>> mod.init_optimizer(optimizer='sgd', optimizer_params=(('learning_rate', 0.01), )) | ||
>>> for epoch in range(num_epochs): | ||
... if epoch % mod.update_freq == 0: | ||
... mod.update_full_grads(di) | ||
... di.reset() | ||
... for batch in di: | ||
... mod.forward_backward(data_batch=batch) | ||
... mod.update() | ||
``` | ||
|
||
### High Level API for SVRGModule | ||
|
||
The high level API usage of `SVRGModule` is exactly the same as that of the `Module` API. The code snippet below gives an example of | ||
the suggested usage of the high level API. | ||
|
||
```python | ||
>>> mod = SVRGModule(symbol=model, update_freq=2, data_names=['data'], label_names=['lin_reg_label']) | ||
>>> mod.fit(di, num_epoch=100, optimizer='sgd', optimizer_params=(('learning_rate', 0.01), )) | ||
``` | ||
|
||
## API reference | ||
|
||
<script type="text/javascript" src='../../../_static/js/auto_module_index.js'></script> | ||
|
||
```eval_rst | ||
.. automodule:: mxnet.contrib.svrg_optimization.svrg_module | ||
:members: init_optimizer, _create_optimizer, bind, forward, backward, update, update_full_grads, | ||
_accumulate_kvstore, _allocate_gradients, _svrg_grads_update_rule, update_svrg_gradients, fit, prepare | ||
.. autoclass:: mxnet.contrib.svrg_optimization.svrg_optimizer.SVRGOptimizer | ||
:members: _check_params, update, create_state, _check_index | ||
.. autoclass:: mxnet.contrib.svrg_optimization.svrg_optimizer.AssignmentOptimizer | ||
:members: update | ||
``` | ||
<script>auto_index("api-reference");</script> |