Make Average Model support for 'moving mean' and 'moving variance' of batch_normal op #9459
1. Add attr 'average' to ParamAttr.
2. Make 'params_grads' optional for ModelAverage.
3. Add options 'average_mean' and 'average_variance' to batch_norm.
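To illustrate the idea behind item 1, here is a minimal sketch of a per-parameter flag that opts a parameter in or out of model averaging. It is plain Python, not Paddle code: `Param` and `ToyModelAverage` are hypothetical stand-ins, and the flag name `average` simply follows the description above.

```python
from dataclasses import dataclass


@dataclass
class Param:
    """Hypothetical stand-in for a framework parameter."""
    name: str
    value: float
    average: bool = True  # assumed per-parameter flag, as in the description


class ToyModelAverage:
    """Toy running average of parameter values; not Paddle's ModelAverage."""

    def __init__(self, params):
        # Only parameters that opted in are tracked.
        self.params = [p for p in params if p.average]
        self.sums = {p.name: 0.0 for p in self.params}
        self.steps = 0

    def accumulate(self):
        # Add the current value of every tracked parameter to its sum.
        for p in self.params:
            self.sums[p.name] += p.value
        self.steps += 1

    def averaged(self):
        if self.steps == 0:
            raise ValueError("accumulate() must be called at least once")
        return {name: total / self.steps for name, total in self.sums.items()}
```

A parameter created with `average=False` is simply never tracked, so it keeps its latest value when averaged values are applied.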
python/paddle/fluid/layers/nn.py
Outdated
moving_variance_name=None):
moving_variance_name=None,
average_mean=True,
average_variance=True):
- Default to False.
- average_mean, average_variance -> do_model_average_for_mean_and_var?
They can share the same flag.
Done.
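A rough sketch of what sharing one flag could look like, for readers of this thread. The helper `moving_stat_attrs` and the dictionary layout are hypothetical and only illustrate the suggestion; the flag name and its `False` default come from the review comment above.

```python
def moving_stat_attrs(moving_mean_name=None,
                      moving_variance_name=None,
                      do_model_average_for_mean_and_var=False):
    """Hypothetical helper: build attribute dicts for both moving statistics."""
    # One flag drives both statistics, so they cannot get out of sync.
    shared = bool(do_model_average_for_mean_and_var)
    return {
        "moving_mean": {
            "name": moving_mean_name,
            "do_model_average": shared,
        },
        "moving_variance": {
            "name": moving_variance_name,
            "do_model_average": shared,
        },
    }
```

With a single flag there is no way to average the mean but not the variance, which appears to be the point of the suggestion.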
python/paddle/fluid/framework.py
Outdated
@@ -1137,6 +1137,8 @@ def __init__(self, block, shape, dtype, **kwargs):

self.gradient_clip_attr = kwargs.get('gradient_clip_attr', None)

self.average = kwargs.get('average', True)
average -> do_model_average?
The meaning of average is not clear.
Done.
python/paddle/fluid/optimizer.py
Outdated
@@ -840,33 +840,45 @@ class ModelAverage(Optimizer):
"""

def __init__(self,
params_grads,
average_window_rate,
average_window_rate=0.15,
Do not set 0.15 as a default value. 0.15 is not a generally applicable rate.
Done.
python/paddle/fluid/param_attr.py
Outdated
@@ -28,13 +28,15 @@ def __init__(self,
learning_rate=1.0,
regularizer=None,
trainable=True,
gradient_clip=None):
gradient_clip=None,
average=True):
average=True -> do_model_average=None.
Done.
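The thread does not say how a `None` default should be interpreted. One common pattern, sketched here purely as an assumption, is to treat `None` as "not specified" and fall back to a default chosen elsewhere. The helper `resolve_do_model_average` and its `fallback` parameter are hypothetical.

```python
def resolve_do_model_average(do_model_average=None, fallback=True):
    """Hypothetical helper: None means 'not specified', so use the fallback."""
    if do_model_average is None:
        return fallback
    # An explicit True or False from the caller always wins.
    return bool(do_model_average)
```

This keeps an explicit `False` distinguishable from "the user never set it", which a plain boolean default cannot express.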
python/paddle/fluid/param_attr.py
Outdated
self.name = name
self.initializer = initializer
self.learning_rate = learning_rate
self.regularizer = regularizer
self.trainable = trainable
self.gradient_clip = gradient_clip
self.average = average
average -> model_average
Done.
fix #9458