From 6e94e59afd8ff518c4f793b985d66193e0ffdc06 Mon Sep 17 00:00:00 2001
From: Fredrik Bagge Carlson
Date: Tue, 3 Dec 2019 15:27:44 +0800
Subject: [PATCH 1/2] Improve docs for decay optimisers

---
 src/optimise/optimisers.jl | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/optimise/optimisers.jl b/src/optimise/optimisers.jl
index c9c40764be..888d308774 100644
--- a/src/optimise/optimisers.jl
+++ b/src/optimise/optimisers.jl
@@ -444,7 +444,7 @@ end
 """
     InvDecay(γ)
 
-Applies inverse time decay to an optimiser
+Applies inverse time decay to an optimiser, i.e., the step effective step size at iteration `n` is `eta / (1 + γ * n)` where `eta` is the initial step size. The wrapped optimisers step size is not modified.
 
 ## Parameters
   - gamma (γ): Defaults to `0.001`
@@ -472,7 +472,7 @@ end
 """
     ExpDecay(eta, decay, decay_step, clip)
 
-Discount the learning rate `eta` by `decay` every `decay_step` till a minimum of `clip`.
+Discount the learning rate `eta` by `decay` every `decay_step` till a minimum of `clip`. The wrapped optimisers step size is being modified by the outer optimiser.
 
 ## Parameters
   - Learning Rate (eta): Defaults to `0.001`.

From e67f09c06d73bc8e0b0702732f63a77eee26e151 Mon Sep 17 00:00:00 2001
From: Fredrik Bagge Carlson
Date: Tue, 3 Dec 2019 15:32:23 +0800
Subject: [PATCH 2/2] Correct some comments in decay docs

---
 src/optimise/optimisers.jl | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/optimise/optimisers.jl b/src/optimise/optimisers.jl
index 888d308774..fb3b9fc534 100644
--- a/src/optimise/optimisers.jl
+++ b/src/optimise/optimisers.jl
@@ -444,7 +444,7 @@ end
 """
     InvDecay(γ)
 
-Applies inverse time decay to an optimiser, i.e., the step effective step size at iteration `n` is `eta / (1 + γ * n)` where `eta` is the initial step size. The wrapped optimisers step size is not modified.
+Applies inverse time decay to an optimiser, i.e., the effective step size at iteration `n` is `eta / (1 + γ * n)` where `eta` is the initial step size. The wrapped optimiser's step size is not modified.
 
 ## Parameters
   - gamma (γ): Defaults to `0.001`
@@ -472,7 +472,7 @@ end
 """
     ExpDecay(eta, decay, decay_step, clip)
 
-Discount the learning rate `eta` by `decay` every `decay_step` till a minimum of `clip`. The wrapped optimisers step size is being modified by the outer optimiser.
+Discount the learning rate `eta` by a multiplicative factor `decay` every `decay_step` until a minimum of `clip`.
 
 ## Parameters
   - Learning Rate (eta): Defaults to `0.001`.
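The two schedules these docstrings describe can be sketched independently of Flux. Below is a minimal Python illustration of the formulas only; the function names `inv_decay_eta` and `exp_decay_eta` are made up for this sketch and are not part of the Flux API:

```python
def inv_decay_eta(eta, gamma, n):
    """Effective step size of InvDecay at iteration n: eta / (1 + γ·n)."""
    return eta / (1 + gamma * n)

def exp_decay_eta(eta, decay, decay_step, clip, n):
    """Effective step size of ExpDecay at iteration n, per the docstring:
    eta is discounted by `decay` once every `decay_step` iterations,
    never falling below `clip`."""
    return max(eta * decay ** (n // decay_step), clip)

# InvDecay with γ = 0.001: the step size shrinks like 1/n.
print(inv_decay_eta(0.1, 0.001, 0))     # 0.1
print(inv_decay_eta(0.1, 0.001, 1000))  # 0.05

# ExpDecay: 0.1 is halved every 1000 steps, floored at 1e-4.
print(exp_decay_eta(0.1, 0.5, 1000, 1e-4, 2500))  # 0.025
```

As the corrected docstrings note, InvDecay applies its factor on top of the wrapped optimiser's own step size, whereas ExpDecay modifies `eta` itself; the sketch above only reproduces the resulting schedules.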