@panyx0718 panyx0718 commented Apr 1, 2018

SE-ResNeXt on 4 Titan X devices goes from ~1.25 to ~0.98.

Some changes in the PR are due to pre-commit cpplint

The main issue is that the nccl_all_reduce stream cannot overlap with the computation streams.
Some slow nccl_all_reduce calls block the other GPUs' streams.
[image: slow_nccl]
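
The stall described above can be illustrated with a toy timing model. This is a sketch, not code from this PR: the function names `CollectiveStart` and `StallPerGpu` are hypothetical, the durations are made-up units, and the only assumption is that a collective all-reduce cannot proceed until its slowest participant has reached it.

```cpp
#include <algorithm>
#include <vector>

// Toy model only (hypothetical helpers, not PaddlePaddle or NCCL APIs).
// ready_times[i] is the time at which GPU i reaches the all-reduce.
// Precondition: ready_times is non-empty.

// A collective cannot start before its slowest participant arrives.
double CollectiveStart(const std::vector<double>& ready_times) {
  return *std::max_element(ready_times.begin(), ready_times.end());
}

// Time each GPU spends waiting for the slowest peer. If communication
// cannot overlap with computation, this wait is pure idle time.
std::vector<double> StallPerGpu(const std::vector<double>& ready_times) {
  const double start = CollectiveStart(ready_times);
  std::vector<double> stall;
  stall.reserve(ready_times.size());
  for (double t : ready_times) {
    stall.push_back(start - t);
  }
  return stall;
}
```

With four devices where one reaches the all-reduce later than the rest, the three faster devices each accumulate the difference as idle time, which is the blocking pattern the description refers to.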

@panyx0718 panyx0718 force-pushed the group_nccl_all_reduce branch 2 times, most recently from 37257cf to 836f069 Compare April 1, 2018 09:44
@panyx0718 panyx0718 requested review from chengduoZH and reyoung and removed request for reyoung April 1, 2018 10:12
@panyx0718 panyx0718 force-pushed the group_nccl_all_reduce branch 2 times, most recently from 114a6b2 to 4a76d1d Compare April 2, 2018 08:10
chengduoZH previously approved these changes Apr 2, 2018

@chengduoZH chengduoZH left a comment

LGTM

@panyx0718 panyx0718 force-pushed the group_nccl_all_reduce branch 4 times, most recently from be193ab to cb5d752 Compare April 3, 2018 00:38
chengduoZH previously approved these changes Apr 3, 2018

@chengduoZH chengduoZH left a comment

LGTM!

@panyx0718 panyx0718 merged commit 49313d4 into PaddlePaddle:develop Apr 3, 2018