nccl_op_test fails #9367

@luotao1

#9359 only fixes a typo, but the nccl_op_test unit test fails on TeamCity:
https://paddleci.ngrok.io/viewLog.html?buildId=31543&buildTypeId=Paddle_PrCi&tab=buildLog&_focus=11520

[11:00:17]	142/215 Test #135: nccl_op_test ....................................***Failed    8.90 sec
[11:00:17]	[==========] Running 4 tests from 1 test case.
[11:00:17]	[----------] Global test environment set-up.
[11:00:17]	[----------] 4 tests from NCCLTester
[11:00:17]	[ RUN      ] NCCLTester.ncclInitOp
[11:00:17]	[       OK ] NCCLTester.ncclInitOp (746 ms)
[11:00:17]	[ RUN      ] NCCLTester.ncclAllReduceOp
[11:00:17]	/paddle/paddle/fluid/operators/nccl_op_test.cu.cc:183: Failure
[11:00:17]	The difference between ct[j] and expected_result is 85, which exceeds 1e-5, where
[11:00:17]	ct[j] evaluates to 0,
[11:00:17]	expected_result evaluates to 85, and
[11:00:17]	1e-5 evaluates to 1.0000000000000001e-05.
[11:00:17]	[  FAILED  ] NCCLTester.ncclAllReduceOp (15 ms)
[11:00:17]	[ RUN      ] NCCLTester.ncclReduceOp
[11:00:17]	[       OK ] NCCLTester.ncclReduceOp (3 ms)
[11:00:17]	[ RUN      ] NCCLTester.ncclBcastOp
[11:00:17]	/paddle/paddle/fluid/operators/nccl_op_test.cu.cc:281: Failure
[11:00:17]	The difference between ct[j] and result is 43, which exceeds 1e-5, where
[11:00:17]	ct[j] evaluates to 85,
[11:00:17]	result evaluates to 42, and
[11:00:17]	1e-5 evaluates to 1.0000000000000001e-05.
[11:00:17]	[  FAILED  ] NCCLTester.ncclBcastOp (5 ms)
[11:00:17]	[----------] 4 tests from NCCLTester (769 ms total)
[11:00:17]	
[11:00:17]	[----------] Global test environment tear-down
[11:00:17]	[==========] 4 tests from 1 test case ran. (769 ms total)
[11:00:17]	[  PASSED  ] 2 tests.
[11:00:17]	[  FAILED  ] 2 tests, listed below:
[11:00:17]	[  FAILED  ] NCCLTester.ncclAllReduceOp
[11:00:17]	[  FAILED  ] NCCLTester.ncclBcastOp
[11:00:17]	
[11:00:17]	 2 FAILED TESTS
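
For readers of the log, here is a minimal sketch of the comparison that both failure messages report: the absolute difference between the value read back (`ct[j]`) and the expected value must not exceed `1e-5`. The helper names `abs_difference`, `within_tolerance`, and `replay_logged_failures` are hypothetical and are not part of Paddle or googletest. Only the numbers (0 vs 85, 85 vs 42, tolerance 1e-5) come from the log above. This does not reproduce the NCCL ops or explain why the buffers hold the wrong values.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: the absolute difference that the log prints in
// "The difference between ... is N".
double abs_difference(double actual, double expected) {
  return std::fabs(actual - expected);
}

// Hypothetical helper: true when the two values differ by at most abs_error,
// which is the condition a near-equality assertion checks.
bool within_tolerance(double actual, double expected, double abs_error) {
  return abs_difference(actual, expected) <= abs_error;
}

// Replays the two comparisons from the log; both are out of tolerance.
void replay_logged_failures() {
  // NCCLTester.ncclAllReduceOp: ct[j] = 0, expected_result = 85
  assert(!within_tolerance(0.0, 85.0, 1e-5));
  // NCCLTester.ncclBcastOp: ct[j] = 85, result = 42
  assert(!within_tolerance(85.0, 42.0, 1e-5));
}
```

With the logged values, the differences are 85 and 43, matching the two failure messages, so neither comparison is a borderline floating-point mismatch.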
