As mentioned in the previous discussion:
Model Optimization Using Gradients
There are two ways to perform model optimization using gradients:
- On Client
The client does multiple steps of forward and backward update. In each step, the gradients are calculated and a new model is generated. After some steps, the client calculates the difference between the newest model and the old model at step 0. The difference is then sent to the parameter servers, which simply apply it to the parameters without any gradient-based optimization (such as Adam or L1 regularization).
- On Parameter Server
The client sends accumulated gradients to the parameter servers, and the parameter servers perform the optimization using those gradients.
We plan to support both parameter update methods. The current v1 version only supports the first one (the On Client approach). Since both methods need the Optimizer's update strategy, we choose to encapsulate the Optimizer as a library.
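To make the difference concrete, below is a minimal sketch in plain C of the two flows for a single float32 parameter block. All function and variable names here are hypothetical and are not part of the proposed interface.

#include <stdlib.h>
#include <string.h>

/* On Client: the client runs several local SGD steps, then sends only the
   accumulated model difference; the server adds the difference in without
   doing any gradient-based optimization itself. */
void on_client_round(float *model, const float *grads[], int num_steps,
                     int len, float lr, float *diff_out) {
  float *old_model = malloc(len * sizeof(float));
  memcpy(old_model, model, len * sizeof(float));
  for (int s = 0; s < num_steps; ++s)
    for (int i = 0; i < len; ++i)
      model[i] -= lr * grads[s][i];              /* local optimization step */
  for (int i = 0; i < len; ++i)
    diff_out[i] = model[i] - old_model[i];       /* difference sent to the PS */
  free(old_model);
  /* parameter server side: param[i] += diff_out[i]; */
}

/* On Parameter Server: the client only accumulates gradients; the server
   applies the optimization algorithm itself (plain SGD here for simplicity). */
void on_server_round(float *param, const float *accumulated_grad, int len,
                     float lr) {
  for (int i = 0; i < len; ++i)
    param[i] -= lr * accumulated_grad[i];
}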
The ParameterServer is implemented in Go, so a C interface for the Optimizer is needed, defined as follows:
// supported data types, same as in @helin's client design doc
typedef enum {
  PADDLE_ELEMENT_TYPE_INT32   = 0,
  PADDLE_ELEMENT_TYPE_UINT32  = 1,
  PADDLE_ELEMENT_TYPE_INT64   = 2,
  PADDLE_ELEMENT_TYPE_UINT64  = 3,
  PADDLE_ELEMENT_TYPE_FLOAT32 = 4,
  PADDLE_ELEMENT_TYPE_FLOAT64 = 5,
} paddle_element_type;
/*
 * @brief Update interface of the optimizer, which will be used in
 *        the Trainer process, and in the ParameterServer process to
 *        support On Parameter Server optimization.
 * @param buffer         : array of parameters
 * @param datatype       : data type of the parameters and gradients
 * @param optimizer_name : name of the algorithm, e.g. "SGD", "Adam"
 * @param gradient       : array of gradients, which will be applied to the parameters
 */
void updateParameter(void *buffer, paddle_element_type datatype, const char* optimizer_name, const void* gradient);
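A hypothetical call site for this interface might look like the sketch below. Note that the signature carries no element count, so the buffer length is assumed to be tracked by the caller (for example the Go parameter server) out of band.

/* Hypothetical usage sketch (not part of the design): a dense float32
   parameter block on the parameter server updated with SGD through the
   interface above.  The element count is known to the caller, since the
   signature itself carries no length. */
void handle_gradient(float *param, const float *grad /* same length as param */) {
  updateParameter(param, PADDLE_ELEMENT_TYPE_FLOAT32, "SGD", grad);
}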
1. Can both sparseUpdate and denseUpdate use this single interface?
SparseUpdate is stored as a SparseRowMatrix, so this interface can be reused.
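One possible way to reuse the interface for the sparse case is sketched below: each row of the SparseRowMatrix that actually carries gradients is handed to updateParameter as an independent dense buffer. The struct layout and names are assumptions for illustration, not the real SparseRowMatrix API.

/* Hypothetical sparse update sketch: iterate over the stored rows and update
   each corresponding parameter row through the same interface. */
typedef struct {
  int    num_rows;   /* number of rows that carry gradients */
  int    width;      /* elements per row */
  int   *row_ids;    /* global row index of each stored row */
  float *values;     /* num_rows * width gradient values, row-major */
} sparse_rows;

void sparse_update(float *param, const sparse_rows *g) {
  for (int r = 0; r < g->num_rows; ++r) {
    float       *param_row = param + g->row_ids[r] * g->width;
    const float *grad_row  = g->values + r * g->width;
    updateParameter(param_row, PADDLE_ELEMENT_TYPE_FLOAT32, "SGD", grad_row);
  }
}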
2. Can the Regularizer also be encapsulated in this library?
The On Client parameter update is already coupled with communication, especially in the SparseUpdate case: because the update process is lazy and the client iterates locally multiple times, the Regularizer has to record how many rounds have been computed and trigger the update on a later read. We have not yet found a good way to decouple this from the communication state.
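For reference, the round-counting bookkeeping described above could look roughly like the sketch below, assuming L2 weight decay: each row remembers the last round it was regularized, and the skipped rounds are applied lazily when the row is next read. The names and the decay formula are illustrative assumptions, not the actual Regularizer implementation.

#include <math.h>

/* Hypothetical catch-up regularization on read: apply the decay the row
   missed during the lazy local iterations, then mark it as up to date. */
void regularize_on_read(float *row, int width, int current_round,
                        int *last_round, float lr, float l2_decay) {
  int skipped = current_round - *last_round;     /* rounds this row missed */
  if (skipped <= 0) return;
  float factor = powf(1.0f - lr * l2_decay, (float)skipped);
  for (int i = 0; i < width; ++i)
    row[i] *= factor;                            /* w *= (1 - lr*decay)^skipped */
  *last_round = current_round;
}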
The Optimizer is planned to wrap the low-level operations in the math library, such as applySGD; see the code at:
https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/math/TrainingAlgorithmOp.cu#L25
This code can be migrated later when Majel is integrated.
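As a rough illustration of the kind of kernel that would be wrapped, below is a plain-C momentum SGD step; the actual signature and formula of applySGD in TrainingAlgorithmOp.cu may differ.

/* Illustrative momentum SGD step of the kind the Optimizer library would
   wrap (CPU version, for illustration only). */
void apply_sgd_sketch(float *value, float *momentum, const float *grad,
                      int len, float lr, float mom) {
  for (int i = 0; i < len; ++i) {
    momentum[i] = mom * momentum[i] - lr * grad[i];  /* velocity update */
    value[i]   += momentum[i];                       /* parameter update */
  }
}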