
Incorrect Inference.infer results when running with multiple GPUs. #3073

@xinghai-sun

Description


Inference.infer produces incorrect results when running with multiple GPUs.

Below is the output of the first convolution layer for 4 example instances, with trainer_count=1 (first screenshot) and trainer_count=2 (second screenshot).

[Screenshot: first-convolution-layer output with trainer_count=1]

[Screenshot: first-convolution-layer output with trainer_count=2]

The outputs are wrong whenever trainer_count > 1 (with use_gpu=True).

If we print only the input data layer, there is no difference between the two cases, which suggests the problem lies in the model computation rather than in how data is allocated across GPUs.
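For reference, the invariant being violated here can be sketched with a small NumPy stand-in (conv_forward below is a hypothetical per-sample layer, not Paddle's actual convolution): sharding a batch across trainers and concatenating the per-shard outputs should reproduce the single-trainer result exactly, since a forward pass has no cross-sample interaction.

```python
import numpy as np

# Hypothetical stand-in for the first convolution layer: any deterministic,
# per-sample function should give identical results regardless of how the
# batch is sharded across trainers/GPUs.
def conv_forward(batch):
    # fixed "kernel": a per-sample linear transform, no cross-sample mixing
    kernel = np.arange(6, dtype=float).reshape(2, 3)
    return batch @ kernel

batch = np.random.RandomState(0).rand(4, 2)  # 4 example instances

# trainer_count=1: run the whole batch at once
single = conv_forward(batch)

# trainer_count=2: split the batch into shards, run each shard,
# then concatenate -- mimics splitting data across two GPUs
shards = np.array_split(batch, 2)
multi = np.vstack([conv_forward(s) for s in shards])

# Expected invariant: per-sample outputs identical either way
print("outputs match across sharding:", np.allclose(single, multi))
```

In this issue the input layer satisfies the invariant (the data shards match) but the convolution output does not, which is why the report points at the model path rather than the data dispatcher.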
