Project
https://github.com/PaddlePaddle/Paddle/projects/61
Design
- Add async update design doc (#9932)
- Add distributed training overview doc (#9937)
Operators
- VariableResponse: support deserializing a var into a local scope (#10060)
- Refine listen_and_serv op: separate RunSyncLoop into its own method, in preparation for RunAsyncLoop (#10080)
- Split optimization ops on the pserver into independent blocks (#10123)
- Create a sub-scope when necessary (#10124)
- Add RunAsyncUpdate (no barrier and no lock) to listen_and_serv_op (#9997); a minimal sketch of this per-parameter queue/worker pattern follows this list.
  - Prepare an optimization block and its PrepareContext for each parameter.
  - Add a BlockQueue for each parameter block. The queue stores the gradient VariableMessages for that parameter received from trainers.
  - Add one thread per parameter to run its optimization block.
  - Each thread reads a gradient from its BlockQueue, creates a sub-scope to deserialize it into, and then runs the optimization block with that sub-scope.
  - Add one thread to serve parameter get requests for trainers from the global scope. (We may need a thread pool to speed up the get path, but the gRPC interface seems to work in only one thread; this needs a test.)
  - Trainers send_vars to and read_vars from the pserver without send_barrier and get_barrier.
- Use multiple threads to do the update (#10228)
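The #9997 items above describe the no-barrier update path on the pserver. Below is a minimal Python sketch of that per-parameter queue/worker pattern; the real implementation lives in the C++ listen_and_serv_op, and `AsyncParamServer`, `push_gradient`, and `apply_gradient` are hypothetical names standing in for deserializing a gradient into a sub-scope and running that parameter's optimization block.

```python
import queue
import threading


class AsyncParamServer:
    """Sketch: one gradient queue and one worker thread per parameter."""

    def __init__(self, param_names, apply_gradient):
        # apply_gradient(param_name, grad) stands in for deserializing the
        # gradient into a fresh sub-scope and running that parameter's
        # optimization block (done in C++ inside listen_and_serv_op).
        self._apply_gradient = apply_gradient
        self._queues = {name: queue.Queue() for name in param_names}
        for name in param_names:
            threading.Thread(
                target=self._update_loop, args=(name,), daemon=True).start()

    def push_gradient(self, param_name, grad):
        # Called for every gradient received from a trainer; returns at once,
        # so there is no send_barrier and no global lock on the pserver.
        self._queues[param_name].put(grad)

    def _update_loop(self, param_name):
        q = self._queues[param_name]
        while True:
            grad = q.get()  # block until some trainer pushes a gradient
            self._apply_gradient(param_name, grad)
```

Parameter get requests are served from the global scope by a separate thread, so a trainer may read a parameter while another trainer's gradient is still being applied; that staleness is the trade-off the no-barrier, no-lock design accepts.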
Transpiler (#9997)
- Dist-transpile the async trainer program: no need to add the .trainer_n suffix to gradient blocks in async mode.
- Dist-transpile the async pserver program: no need to aggregate gradient blocks (a usage sketch follows this list).
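A hedged usage sketch of the async transpile path, assuming the Fluid DistributeTranspiler of this era exposes async mode through a sync_mode=False argument; the endpoints, environment variables, and role handling below are illustrative assumptions, and a model plus optimizer are assumed to have already been built in the default main program.

```python
import os

import paddle.fluid as fluid

# Illustrative cluster settings; adapt to the actual deployment.
pserver_endpoints = "127.0.0.1:6174"
current_endpoint = "127.0.0.1:6174"
trainers = 2
trainer_id = int(os.getenv("PADDLE_TRAINER_ID", "0"))
role = os.getenv("PADDLE_TRAINING_ROLE", "TRAINER")

t = fluid.DistributeTranspiler()
t.transpile(
    trainer_id,
    pservers=pserver_endpoints,
    trainers=trainers,
    sync_mode=False)  # async: no .trainer_n suffix, no gradient aggregation

if role == "PSERVER":
    pserver_prog = t.get_pserver_program(current_endpoint)
    pserver_startup = t.get_startup_program(current_endpoint, pserver_prog)
    # run pserver_startup once, then serve pserver_prog via listen_and_serv
else:
    trainer_prog = t.get_trainer_program()
    # train with trainer_prog; gradients are sent as they are produced,
    # without send_barrier/get_barrier between trainers
```

With sync_mode=False the trainer program sends each raw gradient as soon as it is produced, and the pserver program applies it immediately instead of waiting to aggregate one copy per trainer.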
Considerations
- Need to consider how to add learning rate decay in asynchronous training. Do we need lr_decay at all?
Benchmark
- Benchmark of Fluid async training (#10180)