How to do model parallelism? #27

vvictor-lee · 2020-03-11T01:06:45Z

Great work!
I found that the code works well for multi-GPU training, but only with data parallelism.
However, it is hard to train a model when the number of classes reaches 10 million or more. In that case,
model parallelism should solve the problem. I have been trying to implement it, but the diam softmax seems to seriously interfere with model parallelism. Could you suggest a solution or share any ideas?
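
For context, model parallelism for a very large classification layer usually means splitting the class weight matrix across devices, so that each device holds only a slice of the classes. The sketch below is a plain-Python illustration of that idea for one sample and is not code from this repository. The function name `sharded_softmax_cross_entropy` and the list-based "shards" are assumptions made for the example. Each shard stands in for one GPU, and the comments mark where a real implementation would need a cross-device reduction.

```python
import math

def sharded_softmax_cross_entropy(feature, weight_shards, label):
    """Cross-entropy for one sample with the classes split across shards.

    feature:       list of floats, the embedding of one sample
    weight_shards: list of shards; each shard is a list of class weight rows
                   (hypothetical layout: shard k would live on device k)
    label:         global class index of the sample
    """
    # Step 1: every shard computes logits for its own classes only.
    shard_logits = [
        [sum(f * w for f, w in zip(feature, row)) for row in shard]
        for shard in weight_shards
    ]

    # Step 2: global maximum for numerical stability
    # (an all-reduce with MAX across devices in a real setup).
    global_max = max(max(logits) for logits in shard_logits)

    # Step 3: each shard sums its exponentials locally, then the partial
    # sums are added up (an all-reduce with SUM in a real setup).
    global_sum = sum(
        sum(math.exp(z - global_max) for z in logits)
        for logits in shard_logits
    )

    # Step 4: only the shard that owns the label supplies the target logit.
    offset = 0
    for logits in shard_logits:
        if label < offset + len(logits):
            target = logits[label - offset]
            break
        offset += len(logits)
    else:
        raise ValueError("label out of range")

    # Negative log-probability of the target class.
    return -(target - global_max - math.log(global_sum))
```

With this layout, only two scalars per sample (the maximum and the sum of exponentials) cross device boundaries, rather than the full logit vector. A loss that modifies only the target-class logit would, under this scheme, be applied on the shard that owns the label. Whether that covers the softmax variant used in this repository is not established here.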
