
test_memopt_fit_a_line fails sometimes #8754

Closed
@luotao1

Description

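test_memopt_fit_a_line fails intermittently in TeamCity with a segmentation fault. The stack trace below shows the SIGSEGV raised inside ncclCommInitAll, called from paddle::operators::NCCLInitOp::RunImpl():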

[20:22:10] :	 [Step 1/1] 210/212 Test #212: test_memopt_fit_a_line ..........................***Exception: SegFault  4.48 sec
[20:22:10] :	 [Step 1/1] Hit Cache !!!! cache pool index is 0, var name is tmp_0, cached var name is mean_1.tmp_0@GRAD, var shape is [1L] 
[20:22:10] :	 [Step 1/1] Hit Cache !!!! cache pool index is 0, var name is tmp_2, cached var name is fc_0.b_0@GRAD, var shape is [1L] 
[20:22:10] :	 [Step 1/1] Hit Cache !!!! cache pool index is 0, var name is tmp_3, cached var name is tmp_1, var shape is [1L] 
[20:22:10] :	 [Step 1/1] Hit Cache !!!! cache pool index is 1, var name is y@GRAD, cached var name is square_error_cost_0.tmp_1@GRAD, var shape is [-1L, 1L] 
[20:22:10] :	 [Step 1/1] Hit Cache !!!! cache pool index is 1, var name is fc_0.tmp_1@GRAD, cached var name is square_error_cost_0.tmp_1, var shape is [-1L, 1L] 
[20:22:10] :	 [Step 1/1] Hit Cache !!!! cache pool index is 1, var name is fc_0.tmp_0@GRAD, cached var name is y, var shape is [-1L, 1L] 
[20:22:10] :	 [Step 1/1] Hit Cache !!!! cache pool index is 0, var name is fc_0.b_0@GRAD__nccl_all_reduce__, cached var name is mean_0.tmp_0@GRAD, var shape is [1L] 
[20:22:10] :	 [Step 1/1] Unexpected end of /proc/mounts line `overlay / overlay rw,relatime,lowerdir=/var/local/osd0/docker/overlay2/l/57UUJRTVYZZJSGEPFQTZIQT566:/var/local/osd0/docker/overlay2/l/4GTILHVASP77BF3RBT5SX56QVM:/var/local/osd0/docker/overlay2/l/S2BPF2AZ4RTVXCDUEOJAEHE3OU:/var/local/osd0/docker/overlay2/l/T5XD5VFB7I27WNADWAUUE37YOJ:/var/local/osd0/docker/overlay2/l/X2PYK5OI4RNGYCPU3466VDUQUE:/var/local/osd0/docker/overlay2/l/OWW3JCBEPHGSYF5B5IYGERLVZR:/var/local/osd0/docker/overlay2/l/FYPR6LFDNWX2Y6OAHIZSG4RYDJ:/var/local/osd0/docker/overlay2/l/4NEBWKXEANB'
[20:22:10] :	 [Step 1/1] Unexpected end of /proc/mounts line `7RJYJ5CT67CIWWN:/var/local/osd0/docker/overlay2/l/5J7FEOMK62OCVTOVC5WCHP3GUJ:/var/local/osd0/docker/overlay2/l/QXC4XLCM6NCWHTYT3GXQOM3JL7:/var/local/osd0/docker/overlay2/l/637Q3XP6EF7Z2XM3WFRVKI4D7M:/var/local/osd0/docker/overlay2/l/2CLIZII3MP36UCBCAUYSB6Q2PY:/var/local/osd0/docker/overlay2/l/G7Y4IOQ53HYUYIN4ZXFJCD3XZJ:/var/local/osd0/docker/overlay2/l/U4NJ4AYO2MMWSKYOQIYRGBDP2F:/var/local/osd0/docker/overlay2/l/ZSN3FHLUTFGQLLPDXA3V4YEZUU:/var/local/osd0/docker/overlay2/l/BDENEWPLH4KXFWMYCADAVYVVB2:/var/lo'
[20:22:10] :	 [Step 1/1] Unexpected end of /proc/mounts line `cal/osd0/docker/overlay2/l/CGTT62IBWXDUQYINZ42QNYTON2:/var/local/osd0/docker/overlay2/l/OXGGPNLIB6G4FG4FMPHRKO4KRI:/var/local/osd0/docker/overlay2/l/JAJBAWPNEO3NGU3QBPPLQDFQD6:/var/local/osd0/docker/overlay2/l/CUDU55SQDNKM3JEXH6FQMC4RVA:/var/local/osd0/docker/overlay2/l/HMVNJ6MXZNSTAPURADYRHKS7G2:/var/local/osd0/docker/overlay2/l/6BHH3DD42JPHKCBISOCGZ6O4IU:/var/local/osd0/docker/overlay2/l/HTKK6BPZ2DVGNTVZIC4TNXGMSS:/var/local/osd0/docker/overlay2/l/VDAQQFSLU34AYOONXFMWC3ZBVR:/var/local/osd0/docker/overlay'
[20:22:10] :	 [Step 1/1] Unexpected end of /proc/mounts line `2/l/5WOXCTHWCLFPB2LMPN543BNON3:/var/local/osd0/docker/overlay2/l/5SQBC2PC3M7AXGZONOXQN2ZXEI:/var/local/osd0/docker/overlay2/l/6HPK6CT4QNWRQFB6RHM6DZ3B4T:/var/local/osd0/docker/overlay2/l/AYKRGLPOTKQ3HG2OCQUUNXXJDV:/var/local/osd0/docker/overlay2/l/JMTOMIQFIBBS65D5AQA4AYXZ2J:/var/local/osd0/docker/overlay2/l/AMBVE24EBLT3UNYZN7QKDEXQEX,upperdir=/var/local/osd0/docker/overlay2/fcdab3ea871d6a5179eca7471e27a5477eb18804f7b443b30642e6cc260ce18f/diff,workdir=/var/local/osd0/docker/overlay2/fcdab3ea871d6a5179eca747'
[20:22:10] :	 [Step 1/1] 603.817
[20:22:10] :	 [Step 1/1] 726.9469
[20:22:10] :	 [Step 1/1] 67.22336
[20:22:10] :	 [Step 1/1] 551.19086
[20:22:10] :	 [Step 1/1] 671.1267
[20:22:10] :	 [Step 1/1] 52.88642
[20:22:10] :	 [Step 1/1] 503.82404
[20:22:10] :	 [Step 1/1] *** Aborted at 1520252529 (unix time) try "date -d @1520252529" if you are using GNU date ***
[20:22:10] :	 [Step 1/1] PC: @                0x0 (unknown)
[20:22:10] :	 [Step 1/1] *** SIGSEGV (@0x30) received by PID 14642 (TID 0x7f495ef1e700) from PID 48; stack trace: ***
[20:22:10] :	 [Step 1/1]     @     0x7f495eaf9390 (unknown)
[20:22:10] :	 [Step 1/1]     @     0x7f480fe01078 (unknown)
[20:22:10] :	 [Step 1/1]     @     0x7f480fe05405 ncclCommInitAll
[20:22:10] :	 [Step 1/1]     @     0x7f4919802d04 paddle::operators::NCCLInitOp::RunImpl()
[20:22:10] :	 [Step 1/1]     @     0x7f4918da502a paddle::framework::Executor::Run()
[20:22:10] :	 [Step 1/1]     @     0x7f4918d1c993 _ZZN8pybind1112cpp_function10initializeIZNS0_C4IvN6paddle9framework8ExecutorEIRKNS4_11ProgramDescEPNS4_5ScopeEibbEINS_4nameENS_9is_methodENS_7siblingEEEEMT0_FT_DpT1_EDpRKT2_EUlPS5_S8_SA_ibbE_vISO_S8_SA_ibbEISB_SC_SD_EEEvOSF_PFSE_SH_ESN_ENUlRNS_6detail13function_callEE1_4_FUNESV_
[20:22:10] :	 [Step 1/1]     @     0x7f4918d1a6d4 pybind11::cpp_function::dispatcher()
[20:22:10] :	 [Step 1/1]     @           0x4c37ed PyEval_EvalFrameEx
[20:22:10] :	 [Step 1/1]     @           0x4b9ab6 PyEval_EvalCodeEx
[20:22:10] :	 [Step 1/1]     @           0x4c16e7 PyEval_EvalFrameEx
[20:22:10] :	 [Step 1/1]     @           0x4b9ab6 PyEval_EvalCodeEx
[20:22:10] :	 [Step 1/1]     @           0x4eb30f (unknown)
[20:22:10] :	 [Step 1/1]     @           0x4e5422 PyRun_FileExFlags
[20:22:10] :	 [Step 1/1]     @           0x4e3cd6 PyRun_SimpleFileExFlags
[20:22:10] :	 [Step 1/1]     @           0x493ae2 Py_Main
[20:22:10] :	 [Step 1/1]     @     0x7f495e73e830 __libc_start_main
[20:22:10] :	 [Step 1/1]     @           0x4933e9 _start
[20:22:10] :	 [Step 1/1]     @                0x0 (unknown)
[20:22:10] :	 [Step 1/1] 
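
For triage, the named frames and the abort time can be pulled out of a log like the one above with a small script. This is only a minimal sketch, not part of Paddle or TeamCity: the `summarize` helper, its regular expressions, and the trimmed `SAMPLE` excerpt are assumptions made for illustration.

```python
import re
from datetime import datetime, timezone

# TeamCity line prefix, e.g. "[20:22:10] :  [Step 1/1] " (assumed shape, taken from the log above)
PREFIX = re.compile(r"^\[\d{2}:\d{2}:\d{2}\]\s*:\s*\[Step \d+/\d+\]\s*")
# glog stack frame, e.g. "@     0x7f480fe05405 ncclCommInitAll"
FRAME = re.compile(r"^@\s+(0x[0-9a-f]+)\s+(.+)$")
# glog abort banner carrying the unix timestamp
ABORT = re.compile(r"\*\*\* Aborted at (\d+) \(unix time\)")


def summarize(log_text):
    """Return (abort time in UTC or None, list of named stack frames, innermost first)."""
    aborted_at = None
    frames = []
    for raw in log_text.splitlines():
        line = PREFIX.sub("", raw).strip()
        abort = ABORT.search(line)
        if abort:
            aborted_at = datetime.fromtimestamp(int(abort.group(1)), tz=timezone.utc)
            continue
        frame = FRAME.match(line)
        # Skip frames that glog could not symbolize.
        if frame and frame.group(2) != "(unknown)":
            frames.append(frame.group(2))
    return aborted_at, frames


# Trimmed excerpt of the log in this issue (hypothetical sample for the sketch).
SAMPLE = """\
[20:22:10] : [Step 1/1] *** Aborted at 1520252529 (unix time) try "date -d @1520252529" if you are using GNU date ***
[20:22:10] : [Step 1/1] PC: @                0x0 (unknown)
[20:22:10] : [Step 1/1]     @     0x7f495eaf9390 (unknown)
[20:22:10] : [Step 1/1]     @     0x7f480fe05405 ncclCommInitAll
[20:22:10] : [Step 1/1]     @     0x7f4919802d04 paddle::operators::NCCLInitOp::RunImpl()
[20:22:10] : [Step 1/1]     @     0x7f4918da502a paddle::framework::Executor::Run()
"""

if __name__ == "__main__":
    when, named = summarize(SAMPLE)
    print(when.isoformat())
    for name in named:
        print(name)
```

On the excerpt, the abort timestamp decodes to 2018-03-05 12:22:09 UTC, and the innermost named frame is ncclCommInitAll, reached from NCCLInitOp::RunImpl() via Executor::Run().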
