Closed
Labels
bug: Something isn't working correctly
Description
What happened?
I tried to enable threading in a CAM simulation (F2000climo compset, ne30pg3 resolution). I used one compute node on Derecho with 64 MPI tasks and 2 threads per MPI task. The case built successfully, but the run failed with many runtime errors (a subset is listed below):
dec2481.hsn.de.hpc.ucar.edu 16: munmap_chunk(): invalid pointer
dec2481.hsn.de.hpc.ucar.edu 43: free(): invalid pointer
dec2481.hsn.de.hpc.ucar.edu 51: munmap_chunk(): invalid pointer
dec2481.hsn.de.hpc.ucar.edu 44: munmap_chunk(): invalid pointer
dec2481.hsn.de.hpc.ucar.edu 62: munmap_chunk(): invalid pointer
dec2481.hsn.de.hpc.ucar.edu 39: free(): invalid pointer
dec2481.hsn.de.hpc.ucar.edu 40: free(): invalid pointer
dec2481.hsn.de.hpc.ucar.edu 45: free(): invalid pointer
dec2481.hsn.de.hpc.ucar.edu 15: free(): invalid pointer
dec2481.hsn.de.hpc.ucar.edu 24: munmap_chunk(): invalid pointer
dec2481.hsn.de.hpc.ucar.edu 17: free(): invalid pointer
dec2481.hsn.de.hpc.ucar.edu 59: munmap_chunk(): invalid pointer
dec2481.hsn.de.hpc.ucar.edu 42: free(): invalid pointer
dec2481.hsn.de.hpc.ucar.edu 57: free(): invalid pointer
dec2481.hsn.de.hpc.ucar.edu 18: free(): invalid pointer
dec2481.hsn.de.hpc.ucar.edu 27: munmap_chunk(): invalid pointer
dec2481.hsn.de.hpc.ucar.edu 32: free(): invalid pointer
dec2481.hsn.de.hpc.ucar.edu 33: free(): invalid pointer
dec2481.hsn.de.hpc.ucar.edu 35: free(): invalid pointer
dec2481.hsn.de.hpc.ucar.edu 38: free(): invalid pointer
dec2481.hsn.de.hpc.ucar.edu 55: free(): invalid pointer
dec2481.hsn.de.hpc.ucar.edu 50: free(): invalid pointer
dec2481.hsn.de.hpc.ucar.edu 54: free(): invalid pointer
dec2481.hsn.de.hpc.ucar.edu 2: free(): invalid pointer
dec2481.hsn.de.hpc.ucar.edu 8: free(): invalid pointer
dec2481.hsn.de.hpc.ucar.edu 10: free(): invalid pointer
dec2481.hsn.de.hpc.ucar.edu 13: free(): invalid pointer
dec2481.hsn.de.hpc.ucar.edu 19: free(): invalid pointer
dec2481.hsn.de.hpc.ucar.edu 23: free(): invalid pointer
dec2481.hsn.de.hpc.ucar.edu 41: free(): invalid pointer
dec2481.hsn.de.hpc.ucar.edu 36: free(): invalid pointer
dec2481.hsn.de.hpc.ucar.edu 12: free(): invalid pointer
dec2481.hsn.de.hpc.ucar.edu 6: forrtl: error (76): Abort trap signal
dec2481.hsn.de.hpc.ucar.edu 6: Image PC Routine Line Source
dec2481.hsn.de.hpc.ucar.edu 6: libpthread-2.31.s 000014A3E75D48C0 Unknown Unknown Unknown
dec2481.hsn.de.hpc.ucar.edu 6: libc-2.31.so 000014A3E2BEBCBB gsignal Unknown Unknown
dec2481.hsn.de.hpc.ucar.edu 6: libc-2.31.so 000014A3E2BED355 abort Unknown Unknown
dec2481.hsn.de.hpc.ucar.edu 6: libc-2.31.so 000014A3E2C31AE7 Unknown Unknown Unknown
dec2481.hsn.de.hpc.ucar.edu 6: libc-2.31.so 000014A3E2C39B6A Unknown Unknown Unknown
dec2481.hsn.de.hpc.ucar.edu 6: libc-2.31.so 000014A3E2C3B614 Unknown Unknown Unknown
dec2481.hsn.de.hpc.ucar.edu 6: cesm.exe 000000000112BFFB fvm_consistent_se 163 fvm_consistent_se_cslam.F90
dec2481.hsn.de.hpc.ucar.edu 6: libiomp5.so 000014A3E30F6053 __kmp_invoke_micr Unknown Unknown
dec2481.hsn.de.hpc.ucar.edu 6: libiomp5.so 000014A3E30642F3 Unknown Unknown Unknown
dec2481.hsn.de.hpc.ucar.edu 6: libiomp5.so 000014A3E3063232 Unknown Unknown Unknown
dec2481.hsn.de.hpc.ucar.edu 6: libiomp5.so 000014A3E30F6DC1 Unknown Unknown Unknown
dec2481.hsn.de.hpc.ucar.edu 6: libpthread-2.31.s 000014A3E75C86EA Unknown Unknown Unknown
dec2481.hsn.de.hpc.ucar.edu 6: libc-2.31.so 000014A3E2CB8A6F clone Unknown Unknown
dec2481.hsn.de.hpc.ucar.edu 29: forrtl: error (76): Abort trap signal
dec2481.hsn.de.hpc.ucar.edu 29: Image PC Routine Line Source
dec2481.hsn.de.hpc.ucar.edu 29: libpthread-2.31.s 000014C84D0B88C0 Unknown Unknown Unknown
dec2481.hsn.de.hpc.ucar.edu 29: libc-2.31.so 000014C8486CFCBB gsignal Unknown Unknown
dec2481.hsn.de.hpc.ucar.edu 29: libc-2.31.so 000014C8486D1355 abort Unknown Unknown
dec2481.hsn.de.hpc.ucar.edu 29: libc-2.31.so 000014C848715AE7 Unknown Unknown Unknown
dec2481.hsn.de.hpc.ucar.edu 29: libc-2.31.so 000014C84871DB6A Unknown Unknown Unknown
dec2481.hsn.de.hpc.ucar.edu 29: libc-2.31.so 000014C84871F614 Unknown Unknown Unknown
dec2481.hsn.de.hpc.ucar.edu 29: cesm.exe 000000000112BFFB fvm_consistent_se 163 fvm_consistent_se_cslam.F90
dec2481.hsn.de.hpc.ucar.edu 29: libiomp5.so 000014C848BDA053 __kmp_invoke_micr Unknown Unknown
dec2481.hsn.de.hpc.ucar.edu 29: libiomp5.so 000014C848B482F3 Unknown Unknown Unknown
dec2481.hsn.de.hpc.ucar.edu 29: libiomp5.so 000014C848B47232 Unknown Unknown Unknown
dec2481.hsn.de.hpc.ucar.edu 29: libiomp5.so 000014C848BDADC1 Unknown Unknown Unknown
dec2481.hsn.de.hpc.ucar.edu 29: libpthread-2.31.s 000014C84D0AC6EA Unknown Unknown Unknown
dec2481.hsn.de.hpc.ucar.edu 29: libc-2.31.so 000014C84879CA6F clone Unknown Unknown
The complete list of errors can be found on Derecho at /glade/derecho/scratch/sunjian/cam6_run/F2000climo.ne30pg3_ne30pg3_mg17.derecho.intel.gpu00_pcols00016_mpi0064_thread002_rrtmgp/run/cesm.log.2648024.desched1.231212-143239.
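Since the full log is long, a short script can help show how widespread the heap errors are. The following is a minimal sketch, not part of CAM or CIME: it assumes every log line has the form `<hostname> <number>: <message>` as in the excerpt above, and that the number is the MPI rank (an assumption based on the 64-task layout). The names `ranks_by_message` and `heap_errors` are hypothetical helpers introduced only for illustration.

```python
import re
from collections import defaultdict

# Assumed line shape, taken from the excerpt: "<hostname> <rank>: <message>".
LINE_RE = re.compile(r"^(?P<host>\S+)\s+(?P<rank>\d+):\s*(?P<msg>.*)$")


def ranks_by_message(lines):
    """Map each distinct message to the sorted list of ranks that printed it."""
    ranks = defaultdict(set)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip lines that do not carry the host/rank prefix
        ranks[match.group("msg")].add(int(match.group("rank")))
    return {msg: sorted(found) for msg, found in ranks.items()}


def heap_errors(lines):
    """Keep only the glibc messages that end in 'invalid pointer'."""
    return {
        msg: found
        for msg, found in ranks_by_message(lines).items()
        if msg.endswith("invalid pointer")
    }


if __name__ == "__main__":
    # A few lines copied from the excerpt; point this at the cesm.log file
    # (for example via open(path)) to tally the whole run.
    sample = [
        "dec2481.hsn.de.hpc.ucar.edu 16: munmap_chunk(): invalid pointer",
        "dec2481.hsn.de.hpc.ucar.edu 43: free(): invalid pointer",
        "dec2481.hsn.de.hpc.ucar.edu 51: munmap_chunk(): invalid pointer",
        "dec2481.hsn.de.hpc.ucar.edu 6: forrtl: error (76): Abort trap signal",
    ]
    for msg, found in heap_errors(sample).items():
        print(f"{msg}: {len(found)} rank(s) {found}")
```

If the tally shows the errors on most ranks rather than a few, that would suggest the problem is not tied to a particular rank's share of the domain, which is consistent with both tracebacks above pointing at line 163 of fvm_consistent_se_cslam.F90 inside an OpenMP region.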
What are the steps to reproduce the bug?
To reproduce the error on Derecho, run the following commands:
- ./create_newcase --case /glade/derecho/scratch/sunjian/cam6/F2000climo.ne30pg3_ne30pg3_mg17.derecho.intel --mach derecho --res ne30pg3_ne30pg3_mg17 --compset F2000climo --compiler intel
- cd /glade/derecho/scratch/sunjian/cam6/F2000climo.ne30pg3_ne30pg3_mg17.derecho.intel
- ./xmlchange --file env_mach_pes.xml --id NTASKS --val 64
- ./xmlchange --file env_mach_pes.xml --id NTHRDS --val 2
- ./case.setup
- ./case.build
- ./case.submit
What CAM tag were you using?
cam6_3_139
What machine were you running CAM on?
CISL machine (e.g. cheyenne)
What compiler were you using?
Intel
Path to a case directory, if applicable
/glade/derecho/scratch/sunjian/cam6/F2000climo.ne30pg3_ne30pg3_mg17.derecho.intel.gpu00_pcols00016_mpi0064_thread002_rrtmgp
Will you be addressing this bug yourself?
No
Extra info
No response