Fix https://github.com/PaddlePaddle/Paddle/issues/3655

Add a survey of the optimizer designs in TensorFlow, Caffe2, and PyTorch. After the refactoring, all computation in Paddle will be described by operators, so the optimizer will also be designed in terms of operators.
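
To illustrate what "the optimizer is built from operators" could look like, here is a minimal sketch in plain Python. It is not Paddle's API: `Op`, `KERNELS`, `SGDOptimizer`, `minimize_ops`, `run`, and the `w@GRAD` variable name are all hypothetical names chosen for this example. The only idea it shows is that the optimizer does not update parameters itself; it emits update operators that the same executor runs like any other operator.

```python
from dataclasses import dataclass, field


@dataclass
class Op:
    """Hypothetical operator description: a type plus named inputs/outputs."""
    type: str
    inputs: dict
    outputs: dict
    attrs: dict = field(default_factory=dict)


def sgd_kernel(scope, op):
    # Plain SGD update: param_out = param - learning_rate * grad
    param = scope[op.inputs["Param"]]
    grad = scope[op.inputs["Grad"]]
    lr = op.attrs["learning_rate"]
    scope[op.outputs["ParamOut"]] = [p - lr * g for p, g in zip(param, grad)]


# Hypothetical registry mapping operator types to their kernels.
KERNELS = {"sgd": sgd_kernel}


class SGDOptimizer:
    """Hypothetical optimizer that only emits operators; it computes nothing."""

    def __init__(self, learning_rate):
        self.learning_rate = learning_rate

    def minimize_ops(self, params_and_grads):
        # One "sgd" operator per (parameter, gradient) pair, updating in place.
        return [
            Op("sgd",
               inputs={"Param": p, "Grad": g},
               outputs={"ParamOut": p},
               attrs={"learning_rate": self.learning_rate})
            for p, g in params_and_grads
        ]


def run(ops, scope):
    # The executor treats optimizer operators like any other operator.
    for op in ops:
        KERNELS[op.type](scope, op)


scope = {"w": [1.0, 2.0], "w@GRAD": [1.0, -2.0]}
ops = SGDOptimizer(learning_rate=0.5).minimize_ops([("w", "w@GRAD")])
run(ops, scope)
```

Under this assumed design, adding another update rule would mean registering another kernel and emitting a different operator type, with no change to the executor.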