Change default value of eps in FrozenBatchNorm to match BatchNorm #2599

Closed
@juyunsang

❓ Questions and Help

Hello
A "Loss is nan" error occurs when I train Faster R-CNN with a ResNeXt-101 backbone.
My code is as follows:

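from torchvision.models.detection import FasterRCNN
from torchvision.models.detection.backbone_utils import resnet_fpn_backbone
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

backbone = resnet_fpn_backbone('resnext101_32x8d', pretrained=True)
model = FasterRCNN(backbone, num_classes)
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)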

Error message:

Epoch: [0]  [   0/7208]  eta: 1:27:42  lr: 0.000040  loss: 40613806080.0000 (40613806080.0000)  loss_box_reg: 7979147264.0000 (7979147264.0000)  loss_classifier: 11993160704.0000 (11993160704.0000)  loss_objectness: 9486380032.0000 (9486380032.0000)  loss_rpn_box_reg: 11155118080.0000 (11155118080.0000)  time: 0.7301  data: 0.4106  max mem: 1241
Loss is nan, stopping training

When I change the backbone to resnet50 or resnet152, no error occurs.
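
The issue title points at the `eps` default of FrozenBatchNorm as the likely cause. Below is a minimal sketch of how that could produce exploding activations, assuming that at least one channel in the pretrained ResNeXt-101 weights stores a near-zero running variance (an assumption, not verified here). The helper `frozen_bn_scale` and the value `tiny_var` are hypothetical and written in plain Python for illustration; this is not torchvision code.

```python
import math


def frozen_bn_scale(weight, running_var, eps):
    """Per-channel multiplier of a batch norm layer with fixed statistics.

    Such a layer computes y = (x - mean) * weight / sqrt(var + eps) + bias,
    so this multiplier is the factor applied to every centered activation.
    """
    return weight / math.sqrt(running_var + eps)


# Hypothetical channel whose stored running variance is almost zero.
tiny_var = 1e-20

scale_no_eps = frozen_bn_scale(1.0, tiny_var, eps=0.0)    # nothing guards the division
scale_with_eps = frozen_bn_scale(1.0, tiny_var, eps=1e-5)  # eps bounds the multiplier

print(f"eps=0.0  -> scale {scale_no_eps:.3e}")
print(f"eps=1e-5 -> scale {scale_with_eps:.3e}")
```

With `eps=0.0` the multiplier is limited only by the stored variance, so a near-zero variance yields a very large factor, while a small positive `eps` caps it at roughly `weight / sqrt(eps)`. If this is what happens here, it would also be consistent with other backbones training normally when their stored variances are not close to zero.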
