Refine code (momentum_op, rmsprop_op) #6000
Conversation
```diff
@@ -50,8 +50,7 @@ class MomentumOpKernel : public framework::OpKernel<T> {
       v_out.device(place) = v * mu + g;
       if (use_nesterov) {
-        p_out.device(place) = p - g * lr.broadcast(grad_dsize) +
-                              v_out * mu * lr.broadcast(grad_dsize);
+        p_out.device(place) = p - (g - v_out * mu) * lr.broadcast(grad_dsize);
```
Could `framework::EigenScalar` be used for `lr` instead of `broadcast`?
I've tried it, but the result is wrong.
This implementation is almost the same; it would be better to remove the `v_out.device` call.
Great, I also think it's better to remove `v_out.device`.
I find that `v_out` and `v` point to the same memory; that is, the writer uses an in-place method in the Python interface. Since `v_out` still needs to be updated in momentum_op, `v_out.device` can't be removed.
I find that the implementation of momentum_op seems to be wrong.
Code implementation: Paddle/paddle/operators/momentum_op.h, lines 51 to 57 at 728e8b1.
The writer uses an in-place method in the Python interface, that is: Paddle/python/paddle/v2/fluid/optimizer.py, lines 265 to 278 at 21053c1.
fix #6002