Default binding policy #4799

Closed
@artpol84

Description

OMPI version: v2.1

I was recently investigating PMIx_Get latency with the dstore. I was running on 1 node and saw the numbers grow as the PPN count increased. I was using the default binding policy, assuming that it defaults to bind-to core.
The bottleneck was attributed to the thread-shift part:
openpmix/openpmix#665 (comment).

Debugging the scheduler showed that the PMIx service thread was assigned to a different core than the main thread, which was causing the perf issues. You can see on the plot that starting from 4 procs the performance degrades noticeably. This is because, IIRC, mpirun binds to core for up to 2 processes and binds to socket beyond that.
perf confirmed that guess:

  • the CPU number is enclosed in brackets: [0004];
  • pmix_intra_perf[164802] is the main thread;
  • pmix_intra_perf[164807/164802] is a service thread.
$ perf sched timehist
...
  648540.416283 [0004]  pmix_intra_perf[164802]             0.005      0.000      0.005.                                                                                                                                                      
  648540.416289 [0008]  pmix_intra_perf[164807/164802]      0.003      0.000      0.007.                                                                                                                                                      
  648540.416294 [0004]  pmix_intra_perf[164802]             0.004      0.000      0.006.                                                                                                                                                      
  648540.416299 [0008]  pmix_intra_perf[164807/164802]      0.003      0.000      0.006.  
...

In the 4 PPN case the main and service threads remained on their CPUs the whole time (cpu4 and cpu8). But starting from 16 PPN they began to actively migrate, which caused more rapid growth:

$ perf sched timehist
...
  649086.369911 [0019]  pmix_intra_perf[165820/165811]      0.004      0.001      0.016.                                                                                                                                                      
  649086.369914 [0017]  pmix_intra_perf[165811]             0.012      0.000      0.006.                                                                                                                                                      
  649086.369921 [0019]  pmix_intra_perf[165820/165811]      0.001      0.000      0.007.                                                                                                                                                      
  649086.369925 [0017]  pmix_intra_perf[165811]             0.005      0.000      0.005.                                                                                                                                                      
  649086.369933 [0019]  pmix_intra_perf[165820/165811]      0.003      0.000      0.008.                                                                                                                                                      
  649086.369941 [0023]  pmix_intra_perf[165811]             0.006      0.000      0.009.                                                                                                                                                      
  649086.369948 [0019]  pmix_intra_perf[165820/165811]      0.006      0.001      0.008.                                                                                                                                                      
  649086.369953 [0023]  pmix_intra_perf[165811]             0.005      0.000      0.006.                                                                                                                                                      
  649086.369961 [0019]  pmix_intra_perf[165820/165811]      0.004      0.001      0.008.                                                                                                                                                      
  649086.369966 [0023]  pmix_intra_perf[165811]             0.005      0.000      0.007.                                                                                                                                                      
  649086.369984 [0019]  pmix_intra_perf[165820/165811]      0.012      0.009      0.010.                                                                                                                                                      
  649086.369994 [0027]  pmix_intra_perf[165811]             0.016      0.001      0.011.                                                                                                                                                      
  649086.369999 [0019]  pmix_intra_perf[165820/165811]      0.008      0.000      0.007.                                                                                                                                                      
  649086.370004 [0027]  pmix_intra_perf[165811]             0.004      0.000      0.006.                                                                                                                                                      
  649086.370012 [0019]  pmix_intra_perf[165820/165811]      0.004      0.000      0.008.                
...
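
The migration pattern in the traces above can also be counted automatically instead of being read by eye. The following is a minimal sketch, not part of the original investigation: the helper name `count_migrations` and the regular expression are my own, and it assumes the column layout shown in the excerpts (timestamp, bracketed CPU, task name with `[tid]` or `[tid/pid]`). The sample data is copied from the 16 PPN excerpt.

```python
import re
from collections import defaultdict

# Matches the leading columns of a `perf sched timehist` line as shown above:
# timestamp, [cpu], task[tid] or task[tid/pid]. Remaining columns are ignored.
LINE_RE = re.compile(r"^\s*(\d+\.\d+)\s+\[(\d+)\]\s+(\S+\[\d+(?:/\d+)?\])")

def count_migrations(lines):
    """Return {task: (number of CPU changes, CPUs in order of first use)}."""
    last_cpu = {}
    migrations = defaultdict(int)
    cpus_seen = defaultdict(list)
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue  # skip "...", headers, and anything without a [tid]
        cpu, task = int(m.group(2)), m.group(3)
        if task in last_cpu and last_cpu[task] != cpu:
            migrations[task] += 1
        last_cpu[task] = cpu
        if cpu not in cpus_seen[task]:
            cpus_seen[task].append(cpu)
    return {task: (migrations[task], cpus_seen[task]) for task in last_cpu}

# First 12 lines of the 16 PPN excerpt from this issue.
SAMPLE = """\
  649086.369911 [0019]  pmix_intra_perf[165820/165811]      0.004      0.001      0.016.
  649086.369914 [0017]  pmix_intra_perf[165811]             0.012      0.000      0.006.
  649086.369921 [0019]  pmix_intra_perf[165820/165811]      0.001      0.000      0.007.
  649086.369925 [0017]  pmix_intra_perf[165811]             0.005      0.000      0.005.
  649086.369933 [0019]  pmix_intra_perf[165820/165811]      0.003      0.000      0.008.
  649086.369941 [0023]  pmix_intra_perf[165811]             0.006      0.000      0.009.
  649086.369948 [0019]  pmix_intra_perf[165820/165811]      0.006      0.001      0.008.
  649086.369953 [0023]  pmix_intra_perf[165811]             0.005      0.000      0.006.
  649086.369961 [0019]  pmix_intra_perf[165820/165811]      0.004      0.001      0.008.
  649086.369966 [0023]  pmix_intra_perf[165811]             0.005      0.000      0.007.
  649086.369984 [0019]  pmix_intra_perf[165820/165811]      0.012      0.009      0.010.
  649086.369994 [0027]  pmix_intra_perf[165811]             0.016      0.001      0.011.
"""

if __name__ == "__main__":
    for task, (moves, cpus) in count_migrations(SAMPLE.splitlines()).items():
        print(f"{task}: {moves} migration(s), CPUs {cpus}")
```

On this excerpt the main thread is counted with two migrations (cpu17, then cpu23, then cpu27), while the service thread stays on cpu19.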

After forcing bind-to core, performance stabilized (yellow dashed curve):
openpmix/openpmix#665 (comment)
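
One way to check which binding a process actually received is to look at its CPU affinity mask from inside the process. The sketch below is an illustration only, not something used in this investigation: `allowed_cpus` and `looks_core_bound` are hypothetical helper names, the check is a rough heuristic, and `os.sched_getaffinity` is available on Linux only. The `hw_threads_per_core` value is an assumption the caller must supply for the machine in question.

```python
import os

def allowed_cpus(pid=0):
    """Return the sorted CPU ids the process may run on (Linux only; 0 = self)."""
    return sorted(os.sched_getaffinity(pid))

def looks_core_bound(cpus, hw_threads_per_core=1):
    """Heuristic: a core-bound process is allowed on at most one core's
    worth of hardware threads. A socket-bound process is allowed on many
    more CPUs, so its threads can be placed on different cores."""
    return 0 < len(cpus) <= hw_threads_per_core

if __name__ == "__main__":
    cpus = allowed_cpus()
    kind = "core-like" if looks_core_bound(cpus) else "wider than one core"
    print(f"allowed CPUs: {cpus} ({kind} binding)")
```

Threads created by a process inherit its affinity mask on Linux, so a mask wider than one core leaves the scheduler free to place a service thread on a different core than the main thread, as seen in the traces above.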

I think this is additional input on the impact that the default binding policy may have. The suggestion is to consider this at the next OMPI dev meeting.
