I use SyncBN to train Mask R-CNN, but get an error when testing the model. Can you give me some suggestions on using SyncBN? #847

I use SyncBN to train Mask R-CNN. I followed the GN configs and only changed
`norm_cfg = dict(type='GN', requires_grad=True)` to `norm_cfg = dict(type='SyncBN', requires_grad=True)`. This works well in the training phase.

But when I test the model, I get the following error:
`File "/home/xaserver/anaconda3/envs/rssrai_mmdetection/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/xaserver/anaconda3/envs/rssrai_mmdetection/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 150, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/home/xaserver/anaconda3/envs/rssrai_mmdetection/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/media/xaserver/DATA/swl/Projects/RSSRAI2019/mmdetection/mmdet/models/detectors/base.py", line 87, in forward
    return self.forward_test(img, img_meta, **kwargs)
  File "/media/xaserver/DATA/swl/Projects/RSSRAI2019/mmdetection/mmdet/models/detectors/base.py", line 79, in forward_test
    return self.simple_test(imgs[0], img_metas[0], **kwargs)
  File "/media/xaserver/DATA/swl/Projects/RSSRAI2019/mmdetection/mmdet/models/detectors/cascade_rcnn.py", line 241, in simple_test
    x = self.extract_feat(img)
  File "/media/xaserver/DATA/swl/Projects/RSSRAI2019/mmdetection/mmdet/models/detectors/cascade_rcnn.py", line 115, in extract_feat
    x = self.backbone(img)
  File "/home/xaserver/anaconda3/envs/rssrai_mmdetection/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/media/xaserver/DATA/swl/Projects/RSSRAI2019/mmdetection/mmdet/models/backbones/resnet.py", line 509, in forward
    x = self.norm1(x)
  File "/home/xaserver/anaconda3/envs/rssrai_mmdetection/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/xaserver/anaconda3/envs/rssrai_mmdetection/lib/python3.6/site-packages/torch/nn/modules/batchnorm.py", line 455, in forward
    world_size = torch.distributed.get_world_size(process_group)
  File "/home/xaserver/anaconda3/envs/rssrai_mmdetection/lib/python3.6/site-packages/torch/distributed/distributed_c10d.py", line 584, in get_world_size
    return _get_group_size(group)
  File "/home/xaserver/anaconda3/envs/rssrai_mmdetection/lib/python3.6/site-packages/torch/distributed/distributed_c10d.py", line 200, in _get_group_size
    _check_default_pg()
  File "/home/xaserver/anaconda3/envs/rssrai_mmdetection/lib/python3.6/site-packages/torch/distributed/distributed_c10d.py", line 191, in _check_default_pg
    "Default process group is not initialized"
AssertionError: Default process group is not initialized`

The test code works well when I do not use SyncBN. Can you give me some suggestions about this problem?
Thanks a lot.
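
Reading the traceback, the failure seems to come from SyncBN's `forward` calling `torch.distributed.get_world_size`, which asserts that a default process group exists, while the test run goes through `DataParallel` and apparently never initializes one. One possible workaround, sketched below and not confirmed as the recommended approach, is to swap `SyncBN` for plain `BN` in the config for testing only. As far as I know both layer types keep the same parameters and running statistics, so the trained checkpoint should still load, but that is an assumption to verify. The helper `swap_norm_type` and the trimmed `model` dict are hypothetical and only illustrate the idea:

```python
import copy


def swap_norm_type(cfg, old="SyncBN", new="BN"):
    """Return a deep copy of cfg where every dict with type == old uses new instead.

    Hypothetical helper, not part of mmdetection.
    """
    cfg = copy.deepcopy(cfg)  # leave the training config untouched

    def _walk(node):
        if isinstance(node, dict):
            if node.get("type") == old:
                node["type"] = new
            for value in node.values():
                _walk(value)
        elif isinstance(node, (list, tuple)):
            for item in node:
                _walk(item)

    _walk(cfg)
    return cfg


# Heavily trimmed, made-up model config; only norm_cfg matches the post above.
norm_cfg = dict(type="SyncBN", requires_grad=True)
model = dict(
    type="MaskRCNN",
    backbone=dict(type="ResNet", depth=50, norm_cfg=norm_cfg),
    neck=dict(type="FPN", norm_cfg=norm_cfg),
)

test_model = swap_norm_type(model)
print(test_model["backbone"]["norm_cfg"])  # {'type': 'BN', 'requires_grad': True}
```

The copy is deliberate: the original config keeps `SyncBN` for distributed training, and only the test-time config is changed.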
