How to implement DataParallelEngine #2749

Closed
@QiJune

Description

We should support running a Net on multiple GPUs: users just define a Net and set GPU ids, and parallel execution across the GPUs happens automatically.

In caffe2, NCCL and gloo are used to support multiple GPUs across multiple servers, and the operations in both NCCL and gloo are represented as Operators.

In Paddle, we have already implemented MultiGradientMachine and pserver. We may use NCCL to merge gradients across multiple GPUs in the new version. Should we also represent NCCL operations as Operators?

If an NCCL operation is an Operator, then one Net may correspond to multiple GPUs. Alternatively, if we treat an NCCL operation as a plain function, then one Net corresponds to one GPU.
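To make the trade-off concrete, here is a minimal sketch of the two designs. All names (`AllReduceOp`, `all_reduce_fn`) are hypothetical illustrations, not actual Paddle or NCCL APIs, and plain Python lists stand in for per-GPU gradient buffers; a real implementation would call NCCL's allreduce on device memory.

```python
class AllReduceOp:
    """Option A: the NCCL all-reduce is itself an Operator inside the Net,
    so a single Net spans all GPUs and owns every replica's gradients."""

    def __init__(self, grads):
        # grads: one gradient buffer (list of floats) per GPU replica.
        self.grads = grads

    def run(self):
        n = len(self.grads)
        # Average element-wise across replicas, then broadcast back.
        merged = [sum(vals) / n for vals in zip(*self.grads)]
        self.grads = [list(merged) for _ in range(n)]


def all_reduce_fn(grads):
    """Option B: the NCCL call is a plain function outside the Net,
    so each Net maps to one GPU and a coordinator merges gradients."""
    n = len(grads)
    merged = [sum(vals) / n for vals in zip(*grads)]
    return [list(merged) for _ in grads]


# Four replicas holding different gradients for the same parameter.
grads = [[float(i)] * 3 for i in range(4)]

op = AllReduceOp([list(g) for g in grads])
op.run()
print(op.grads[0])          # every replica now holds the averaged gradient

print(all_reduce_fn(grads)[0])
```

Either path computes the same merged gradient; the difference is where the cross-device communication lives: inside the Net's operator graph (Option A) or in the engine that schedules one Net per GPU (Option B).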
