Parallelism with neon on Nervana Cloud
Currently, neon supports several flavors of data parallelism on the Nervana Cloud (a sketch of the synchronous and asynchronous update patterns follows this list):
- multi-GPU within a node, synchronous, peer-to-peer (p2p-sync)
- multi-GPU across nodes, synchronous, peer-to-peer (p2p-sync)
- multi-GPU across nodes, synchronous, parameter-server based (ps-sync)
- multi-GPU across nodes, asynchronous, parameter-server based (ps-async)
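To make the flavors concrete, here is a minimal, framework-agnostic sketch of the synchronous and asynchronous update patterns in plain NumPy. The function names and the toy least-squares model are illustrative assumptions, not neon's or the Nervana Cloud's actual API.

```python
import numpy as np

# Illustrative only: these helpers sketch the update patterns above,
# not neon's or the Nervana Cloud's real implementation.

def local_gradient(weights, x_shard, y_shard):
    """Least-squares gradient on one replica's data shard (toy model)."""
    err = x_shard.dot(weights) - y_shard
    return x_shard.T.dot(err) / len(x_shard)

def sync_step(weights, shards, lr=0.1):
    """p2p-sync / ps-sync: average all replicas' gradients, then apply
    one identical update everywhere (an allreduce over peer-to-peer
    links, or a gather/scatter through a parameter server)."""
    grads = [local_gradient(weights, x, y) for x, y in shards]
    return weights - lr * np.mean(grads, axis=0)

def async_step(server_weights, shard, lr=0.1):
    """ps-async: one replica reads the (possibly stale) server weights
    and pushes its update back without waiting for other replicas."""
    x, y = shard
    return server_weights - lr * local_gradient(server_weights, x, y)

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
X = rng.normal(size=(64, 2))
y = X.dot(true_w)
shards = list(zip(np.split(X, 4), np.split(y, 4)))  # 4 simulated replicas

w = np.zeros(2)
for _ in range(200):
    w = sync_step(w, shards)       # all replicas stay in lockstep
print(w)                           # converges toward [2.0, -1.0]

w = np.zeros(2)
for _ in range(200):
    for shard in shards:           # replicas update one at a time here;
        w = async_step(w, shard)   # on real hardware they would race
print(w)
```

In the synchronous flavors every replica applies the same averaged update each step, so the replicas' weights never diverge; the asynchronous variant trades that consistency (replicas may compute gradients against stale weights) for not having to wait on the slowest replica.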
In addition, Nervana's custom hardware will support model parallelism much better than GPUs do, thanks to higher-speed interconnects between chips (the sketch below shows where model parallelism exercises the interconnect).
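As a rough illustration, the sketch below splits a single layer's weight matrix by output columns across two simulated devices; assembling the full activation then requires a cross-device exchange on every forward pass. This is a NumPy simulation under assumed names, not Nervana hardware or neon code.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=(8, 32))       # one minibatch of input activations
W = rng.normal(size=(32, 64))      # the full layer's weights

# Model parallelism: each simulated device holds half of the layer.
W_dev0, W_dev1 = np.split(W, 2, axis=1)

# Each device computes its half of the output locally...
out0 = np.maximum(x.dot(W_dev0), 0)   # ReLU over device 0's columns
out1 = np.maximum(x.dot(W_dev1), 0)   # ReLU over device 1's columns

# ...and the partial activations must cross the interconnect to form
# the full layer output, once per layer per step, which is why faster
# chip-to-chip links make model parallelism more practical.
out = np.concatenate([out0, out1], axis=1)
assert out.shape == (8, 64)
```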
These approaches are described in the following papers:
- J. Dean, Large Scale Distributed Deep Networks (ps-sync and ps-async)
- N. Strom, Scalable Distributed DNN Training Using Commodity GPU Cloud Computing (p2p-sync)
- M. Feng, Distributed Deep Learning for Answer Selection (all)
- F. Niu, Hogwild!: A Lock-Free Approach to Parallelizing SGD (ps-async)
- C. De Sa, Taming the Wild: A Unified Analysis of Hogwild!-Style Algorithms (ps-async)
- S. Zhang, Deep Learning with Elastic Averaging SGD (ps-sync and ps-async)
- O. Yadan, Multi-GPU Training of ConvNets (p2p-sync)
- S. Gupta, Model Accuracy and Runtime Tradeoff in Distributed Deep Learning (ps-sync and ps-async)