Description
This is a placeholder for fixing the performance regressions seen on the 2.0.x branch relative to 1.10 that affect the MTLs (tested with OFI and PSM2). The degradation mostly affects latency at small message sizes, with some impact on bandwidth.
Built with:
./configure CFLAGS=-O3 --prefix=<install path> --with-libfabric=no \
--with-psm2=/usr --disable-oshmem --with-devel-headers --disable-debug \
--disable-mem-profile --disable-mem-debug
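The tests below use the same system setup; the only change is swapping OMPI 1.10 for 2.0.x. Each percentage is the change seen with 2.0.x relative to 1.10, and negative values indicate degraded performance.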
Two ranks on different nodes running osu_latency over PSM2 (columns: message size in bytes, change):
1 -10%
2 -10%
4 -12%
8 -12%
16 -10%
32 -11%
64 -8%
128 -10%
256 -12%
512 -10%
1024 -12%
2048 -5%
4096 -18%
8192 0%
16384 -6%
32768 -4%
65536 -5%
131072 -3%
262144 -2%
524288 8%
1048576 0%
2097152 0%
4194304 18%
Two ranks on the same node running osu_latency over PSM2 (columns: message size in bytes, change):
1 -19%
2 -16%
4 -16%
8 -16%
16 -19%
32 -4%
64 -6%
128 -6%
256 -4%
512 -6%
1024 -14%
2048 -31%
4096 -1%
8192 -8%
16384 -24%
32768 -18%
65536 -12%
131072 -6%
262144 -8%
524288 -5%
Two ranks on different nodes running osu_bw over PSM2 (columns: message size in bytes, change):
1 -7%
2 -9%
4 -7%
8 -7%
16 -5%
32 -4%
64 -6%
128 -2%
256 -5%
512 -7%
1024 -7%
2048 -5%
4096 -2%
8192 -2%
16384 -1%
32768 2%
65536 0%
131072 1%
262144 0%
524288 0%
1048576 0%
2097152 0%
4194304 0%
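
For anyone wanting to reproduce these comparisons, here is a minimal sketch of how the percentages above could be derived from two raw OSU output files. The exact formula used for the tables is not stated in this issue, so the sign convention (negative means the newer build is worse) and the rounding to whole percent are assumptions, and the helper names `parse_osu` and `percent_change` are hypothetical.

```python
def parse_osu(text):
    """Parse OSU benchmark output into {message_size_bytes: value}.

    Assumes the usual two-column layout: '#' header lines followed by
    'size value' rows (latency in us or bandwidth in MB/s).
    """
    results = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.split()
        if len(fields) < 2:
            continue
        try:
            results[int(fields[0])] = float(fields[1])
        except ValueError:
            continue  # ignore anything that is not a data row
    return results


def percent_change(baseline, candidate, lower_is_better):
    """Signed percent change per message size.

    Assumed convention: negative means the candidate is worse than the
    baseline. For latency (lower is better) a higher value is worse; for
    bandwidth (higher is better) a lower value is worse.
    """
    changes = {}
    for size in sorted(baseline.keys() & candidate.keys()):
        old, new = baseline[size], candidate[size]
        if old == 0:
            continue  # avoid division by zero on degenerate rows
        delta = (old - new) if lower_is_better else (new - old)
        changes[size] = round(100.0 * delta / old)
    return changes


if __name__ == "__main__":
    # Hypothetical sample data, not measurements from this issue.
    v110 = parse_osu("# Size  Latency (us)\n1 1.00\n2 1.00\n")
    v20x = parse_osu("# Size  Latency (us)\n1 1.10\n2 1.12\n")
    for size, pct in percent_change(v110, v20x, lower_is_better=True).items():
        print(f"{size} {pct}%")
```

Pass `lower_is_better=True` for osu_latency output and `lower_is_better=False` for osu_bw output so that a regression is negative in both cases.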