Add LARS support #10374
Conversation
def append_LARS(params_grads, learning_rate, weight_decay):
    """Applies LARS (LAYER-WISE ADAPTIVE RATE SCALING) to learning rate for
Maybe we can add the link to the paper here.
Please add a unit test for this learning rate scheduler strategy.
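The paper being referred to is "Large Batch Training of Convolutional Networks" (You et al., arXiv:1708.03888). As a rough sketch of the layer-wise scaling a function like `append_LARS` is meant to apply (the names below are illustrative, not taken from this diff), the global learning rate is rescaled per layer by the ratio of the parameter norm to the gradient norm:

```python
# Minimal NumPy sketch of the core LARS scaling (the paper's trust
# coefficient is omitted for brevity); not the PR's implementation.
import numpy as np

def lars_scaled_lr(param, grad, base_lr, weight_decay):
    """Scale the global learning rate for one layer."""
    param_norm = np.linalg.norm(param)
    grad_norm = np.linalg.norm(grad)
    # local_lr = base_lr * ||w|| / (||g|| + weight_decay * ||w||)
    return base_lr * param_norm / (grad_norm + weight_decay * param_norm)

# A unit test for the scheduler could check this ratio directly, e.g.:
# assert lars_scaled_lr(np.ones(4), np.ones(4), 1.0, 0.0) == 1.0
```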
def __init__(self,
             learning_rate,
             regularization=None,
             LARS_weight_decay=0.0):
According to the paper, I think the default value of LARS_weight_decay may be 1.0.
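For context on how the new argument is expected to be used, here is a hedged sketch of constructing an optimizer with it; the module path and optimizer class are assumptions, and only `LARS_weight_decay` itself comes from this diff:

```python
import paddle.fluid as fluid

# With the default LARS_weight_decay=0.0 the optimizer behaves as before;
# a value greater than 0 enables the layer-wise scaling (see the PR notes).
optimizer = fluid.optimizer.SGD(
    learning_rate=0.1,
    LARS_weight_decay=0.0005)  # example value, not a recommendation
```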
The program cannot be transpiled to the distributed version correctly when LARS is used; still debugging.
… lars_scheduler
@jacquesqiao Can we merge this for now, so I can test it with NCCL2 distributed training?
LGTM!
Fix #6811
Related: #7788
To use, add LARS_weight_decay=[some value greater than 0] to enable LARS. LARS also works together with the current learning rate schedulers, e.g. "polynomial_decay", as sketched below.
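A hedged sketch of that combination, assuming the fluid learning-rate scheduler and optimizer APIs of the time (exact module paths may differ):

```python
import paddle.fluid as fluid

# Decay the global learning rate with an existing scheduler...
decayed_lr = fluid.layers.polynomial_decay(
    learning_rate=0.1,
    decay_steps=10000,
    end_learning_rate=0.01,
    power=1.0)

# ...and let LARS rescale it per layer on top of that decay.
optimizer = fluid.optimizer.Momentum(
    learning_rate=decayed_lr,
    momentum=0.9,
    LARS_weight_decay=0.0005)  # any value > 0 enables LARS
```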