
Optimizer C_API #2168

Closed

@dzhwinter

Description

As mentioned in the previous discussion:

Model Optimization Using Gradients

There are two ways to perform model optimization using gradients:

  • On Client
    The client performs multiple steps of forward and backward updates. In each step, the gradients are calculated and a new model is generated. After some steps, the client computes the difference between the newest model and the model at step 0, and sends that difference to the parameter servers. The parameter servers simply apply the difference to the parameters, without any gradient-based optimization (such as Adam or L1 regularization). A minimal sketch of this scheme follows the list.
  • On Parameter Server
    The client sends accumulated gradients to the parameter servers, and the parameter servers perform the optimization using those gradients.

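For concreteness, here is a minimal C sketch (hypothetical names, not the actual trainer code) of the On Client scheme: local SGD steps, followed by a model diff that the server applies verbatim.

    #include <stddef.h>

    /* Run several plain SGD steps on the client. The gradient is held
     * fixed here for brevity; in reality it is recomputed each step. */
    void local_steps(float *param, const float *grad, size_t n,
                     float lr, int steps) {
      for (int s = 0; s < steps; ++s)
        for (size_t i = 0; i < n; ++i)
          param[i] -= lr * grad[i];
    }

    /* diff = newest model - snapshot of the model at step 0;
     * this diff is what gets sent to the parameter servers. */
    void compute_diff(const float *param, const float *snapshot,
                      float *diff, size_t n) {
      for (size_t i = 0; i < n; ++i)
        diff[i] = param[i] - snapshot[i];
    }

    /* The server just adds the diff: no Adam, no regularization. */
    void server_apply_diff(float *server_param, const float *diff, size_t n) {
      for (size_t i = 0; i < n; ++i)
        server_param[i] += diff[i];
    }
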
We plan to support both parameter-update methods. The current v1 only supports method 1 (On Client). Since both methods need the optimizer's update strategy, we chose to package the Optimizer as a standalone library.

The ParameterServer is implemented in Go, so it needs a C interface to the Optimizer, defined as follows:

    // Supported data types, consistent with @helin's client design doc.
    typedef enum {
      PADDLE_ELEMENT_TYPE_INT32   = 0,
      PADDLE_ELEMENT_TYPE_UINT32  = 1,
      PADDLE_ELEMENT_TYPE_INT64   = 2,
      PADDLE_ELEMENT_TYPE_UINT64  = 3,
      PADDLE_ELEMENT_TYPE_FLOAT32 = 4,
      PADDLE_ELEMENT_TYPE_FLOAT64 = 5,
    } paddle_element_type;

    /*
     * @brief Update interface of the optimizer. It is used in the
     *        Trainer process, and in the ParameterServer process to
     *        support On Parameter Server optimization.
     * @param buffer         : array of parameters
     * @param datatype       : data type of the parameters and gradients
     * @param optimizer_name : algorithm id, e.g. "SGD" or "Adam"
     * @param gradient       : array of gradients to be applied to the parameters
     */
    void updateParameter(void *buffer, paddle_element_type datatype,
                         const char *optimizer_name, const void *gradient);
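
To make the intended use concrete, here is a self-contained sketch with a toy stand-in implementation (plain float32 SGD). Note that the declared interface carries no buffer length, so this sketch fixes the element count with a macro; a real implementation would need the length passed in or registered beforehand.

    #include <stdio.h>
    #include <string.h>

    typedef enum {
      PADDLE_ELEMENT_TYPE_FLOAT32 = 4,
    } paddle_element_type;

    #define NUM_ELEMENTS 4      /* stand-in for the missing length argument */
    #define LEARNING_RATE 0.1f  /* hard-coded for the sketch */

    /* Toy stand-in for updateParameter: handles only float32 + "SGD". */
    void updateParameter(void *buffer, paddle_element_type datatype,
                         const char *optimizer_name, const void *gradient) {
      if (datatype == PADDLE_ELEMENT_TYPE_FLOAT32 &&
          strcmp(optimizer_name, "SGD") == 0) {
        float *param = (float *)buffer;
        const float *grad = (const float *)gradient;
        for (int i = 0; i < NUM_ELEMENTS; ++i)
          param[i] -= LEARNING_RATE * grad[i];  /* w = w - lr * g */
      }
    }

    int main(void) {
      float param[NUM_ELEMENTS] = {1.0f, 2.0f, 3.0f, 4.0f};
      float grad[NUM_ELEMENTS]  = {0.5f, 0.5f, 0.5f, 0.5f};
      updateParameter(param, PADDLE_ELEMENT_TYPE_FLOAT32, "SGD", grad);
      for (int i = 0; i < NUM_ELEMENTS; ++i)
        printf("%.2f ", param[i]);  /* prints: 0.95 1.95 2.95 3.95 */
      printf("\n");
      return 0;
    }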

1. Can sparseUpdate and denseUpdate share this single interface?

SparseUpdate is stored as a SparseRowMatrix, so this interface can be reused. A rough illustration follows.
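
The layout below is hypothetical (not the actual SparseRowMatrix definition), but it shows why reuse works: a sparse update only touches a subset of rows, and each touched row is itself a dense buffer, so the same per-buffer update applies row by row.

    #include <stddef.h>

    /* Hypothetical sparse-row layout: only the rows listed in row_ids
     * carry gradient data. */
    typedef struct {
      size_t num_rows;   /* number of non-empty rows       */
      size_t width;      /* elements per row               */
      int   *row_ids;    /* global row indices             */
      float *rows;       /* num_rows * width dense payload */
    } sparse_rows;

    void sparse_sgd_update(float *param /* full dense table */,
                           const sparse_rows *g, float lr) {
      for (size_t r = 0; r < g->num_rows; ++r) {
        float *p = param + (size_t)g->row_ids[r] * g->width;
        const float *grad = g->rows + r * g->width;
        for (size_t c = 0; c < g->width; ++c)
          p[c] -= lr * grad[c];  /* each touched row is a dense update */
      }
    }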

2. Can the Regularizer also be packaged into this library?

The On Client update path is already coupled with communication, especially for SparseUpdate: since the update process is lazy and the client iterates locally many times, the Regularizer must record how many rounds have elapsed and trigger the pending update when a row is next read. We have not found a good way to decouple this communication state; a sketch of the lazy catch-up follows.
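
The following hypothetical sketch shows the state the Regularizer would have to carry: with lazy sparse updates, a row can skip several rounds, so each row remembers the last round it was touched and catches up on the missed regularization when it is read again.

    #include <stddef.h>

    /* Hypothetical lazy L2 decay over a sparse-row parameter table. */
    typedef struct {
      float *rows;        /* dense payload, num_rows * width          */
      int   *last_round;  /* round at which each row was last updated */
      size_t width;
      float  decay;       /* per-round multiplicative L2 decay factor */
    } lazy_table;

    /* Reading a row triggers the regularization it missed while idle. */
    float *read_row(lazy_table *t, size_t row, int current_round) {
      float *p = t->rows + row * t->width;
      int missed = current_round - t->last_round[row];
      for (int k = 0; k < missed; ++k)        /* catch up missed rounds  */
        for (size_t c = 0; c < t->width; ++c)
          p[c] *= (1.0f - t->decay);
      t->last_round[row] = current_round;     /* state that must persist */
      return p;
    }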

The Optimizer is planned to wrap the low-level operations such as applySGD in the math library; see the code at:

https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/math/TrainingAlgorithmOp.cu#L25

Later, when Majel is integrated, this part of the code can be migrated.
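
For reference, the kind of low-level operation being wrapped looks roughly like the following: SGD with momentum and weight decay written as a plain C loop. This is only an approximation of the shape of applySGD, not the actual signature in TrainingAlgorithmOp.cu (which runs as a vectorized/GPU op).

    #include <stddef.h>

    /* Approximate shape of an applySGD-style kernel. */
    void apply_sgd(float *value, float *momentum, const float *grad,
                   size_t n, float lr, float mom, float decay) {
      for (size_t i = 0; i < n; ++i) {
        momentum[i] = mom * momentum[i]
                    - lr * (grad[i] + decay * value[i]);  /* velocity   */
        value[i]   += momentum[i];                        /* param step */
      }
    }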
