
master: MPI_Gather BTL hang (ob1 problem) #4795

Closed
@jsquyres

Description

Git bisect shows that 409638b (from @bosilca and @thananon; it changed how ob1 handles out-of-order receives) is the first bad commit: it causes MPI_Gather() in IMB to hang for me with both the TCP and usNIC BTLs.
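
For reference, a bisect like this can be driven roughly as follows (a sketch, not the exact commands I ran; gather-test.sh is a hypothetical wrapper that rebuilds the tree and runs the Gather case under a timeout):

$ git bisect start
$ git bisect bad master
$ git bisect good <last-known-good-commit>
$ git bisect run ./gather-test.sh

$ cat gather-test.sh
#!/bin/sh
# Hypothetical bisect helper: rebuild, then run the Gather case under a timeout.
make -j install > /dev/null 2>&1 || exit 125    # 125 tells "git bisect run" to skip unbuildable commits
timeout 300 mpirun --mca btl tcp,vader,self IMB-MPI1 -npmin 32 Gather
# "timeout" exits 124 when the run hangs, which "git bisect run" counts as a bad commit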

This is 100% reproducible for me. When I run IMB across 2 servers (with ppn=16), it will hang in Gather -- I'm pretty sure it hangs once we transition into the long protocol (i.e., after the 64K results are shown for TCP and after the 16K results are shown for usNIC):

$ mpirun --mca btl usnic,vader,self IMB-MPI1 -npmin 32 Gather
 benchmarks to run Gather 
#---------------------------------------------------
#    Intel (R) MPI Benchmark Suite V3.2.4, MPI-1 part    
#---------------------------------------------------
...
#----------------------------------------------------------------
# Benchmarking Gather 
# #processes = 32 
#----------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
            0         1000         0.04         0.04         0.04
            1         1000        14.78        14.83        14.80
            2         1000        14.96        15.01        14.99
            4         1000        15.05        15.12        15.09
            8         1000        15.32        15.38        15.35
           16         1000        15.65        15.71        15.68
           32         1000        16.18        16.24        16.21
           64         1000        18.18        18.24        18.21
          128         1000        20.81        20.87        20.84
          256         1000        24.71        24.80        24.76
          512         1000        34.46        34.62        34.51
         1024         1000        13.87        14.19        14.04
         2048         1000        17.38        17.83        17.62
         4096         1000        49.83        50.23        50.02
         8192         1000       269.86       270.38       270.16
        16384         1000       315.06       315.69       315.43
<hang>
$ mpirun --mca btl tcp,vader,self IMB-MPI1 -npmin 32 Gather
 benchmarks to run Gather 
#---------------------------------------------------
#    Intel (R) MPI Benchmark Suite V3.2.4, MPI-1 part    
#---------------------------------------------------
...
#----------------------------------------------------------------
# Benchmarking Gather 
# #processes = 32 
#----------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
            0         1000         0.04         0.07         0.04
            1         1000        46.73        46.90        46.82
            2         1000        46.63        46.80        46.72
            4         1000        46.98        47.16        47.06
            8         1000        48.44        48.61        48.52
           16         1000        51.35        51.57        51.46
           32         1000        53.16        53.43        53.33
           64         1000        55.42        55.66        55.54
          128         1000        59.01        59.20        59.10
          256         1000        65.72        65.96        65.83
          512         1000        79.08        79.52        79.26
         1024         1000        67.23        68.24        67.73
         2048         1000        73.87        75.06        74.50
         4096         1000       113.16       114.54       114.16
         8192         1000      1018.50      1020.58      1019.66
        16384         1000      1039.11      1041.27      1040.34
        32768         1000      1285.78      1288.46      1287.30
        65536          640      1881.22      1884.78      1883.41
<hang>

Note that I have 3 usNIC interfaces and 4 IP interfaces, so receiving frags out of order is highly likely; that may be what is needed to reproduce the issue.
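
If out-of-order delivery really is required to trigger this, one way to test that hypothesis would be to pin each BTL to a single interface and see whether the hang disappears (a sketch, assuming the usual if_include selection parameters; the interface names are placeholders):

$ mpirun --mca btl tcp,vader,self --mca btl_tcp_if_include <one-IP-interface> IMB-MPI1 -npmin 32 Gather
$ mpirun --mca btl usnic,vader,self --mca btl_usnic_if_include <one-usNIC-device> IMB-MPI1 -npmin 32 Gather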

Also note that this is only happening on master -- I checked the timeline: 409638b was committed to master after v3.1 branched, and was not PR'ed over.
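
(A quick way to double-check that: the following should show 409638b reachable from master but not from the v3.1 branches. A sketch; remote branch names may differ.)

$ git branch -r --contains 409638b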

@bosilca @thananon What additional information can I get to you to help diagnose what is going wrong?
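
For example, I could attach gdb to the hung ranks on both nodes and send the full backtraces, along the lines of the sketch below (<pid> stands for each stuck IMB-MPI1 process):

$ gdb -p <pid> -batch -ex 'thread apply all bt'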
