Commit 4983397
Better documentation and warning (pytorch#13946)
Summary:
This is to address pytorch#12603
Pull Request resolved: pytorch#13946

Differential Revision: D13055254

Pulled By: teng-li

fbshipit-source-id: 20a206ebd3456eac9dc50584664c4bca3ee955d1
teng-li authored and facebook-github-bot committed Nov 14, 2018
1 parent 143ba72 commit 4983397
Showing 1 changed file with 11 additions and 0 deletions.
11 changes: 11 additions & 0 deletions torch/nn/parallel/distributed.py
@@ -122,6 +122,17 @@ class DistributedDataParallel(Module):
won't be invoked anymore, unless the hooks are initialized in the
:meth:`forward` method.
.. warning::
You should never change your model's parameters after wrapping
the model with DistributedDataParallel. The DistributedDataParallel
constructor registers gradient reduction functions on all of the
model's parameters at construction time. If you add or replace
parameters after construction, this is not supported and can lead to
unexpected behavior, since the gradient reduction functions for those
parameters might never be called.
.. note::
Parameters are never broadcast between processes. The module performs
an all-reduce step on gradients and assumes that they will be modified
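For context, a minimal sketch of the pattern the new warning describes, assuming a single process per GPU with an already-initialized process group; `MyModel` and the layer sizes are hypothetical and only for illustration, not part of the patch:

```python
import torch
import torch.nn as nn
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel

# Assumes each worker process has already called something like
# dist.init_process_group(backend="nccl", ...) and selected its GPU.

class MyModel(nn.Module):  # hypothetical model, for illustration only
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(10, 10)

    def forward(self, x):
        return self.fc(x)

model = MyModel().cuda()

# Finish any parameter changes BEFORE wrapping: the DistributedDataParallel
# constructor registers gradient reduction functions on the parameters that
# exist at this moment.
ddp_model = DistributedDataParallel(
    model, device_ids=[torch.cuda.current_device()])

# Unsupported: adding or replacing parameters after wrapping. A parameter
# added here has no gradient reduction function registered, so its gradients
# would not be synchronized across processes.
# model.extra = nn.Linear(10, 10)  # do NOT do this after wrapping
```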