Auto tuning related limitations

Hüseyin Tuğrul BÜYÜKIŞIK edited this page Feb 8, 2021 · 6 revisions

Currently, the constructor of VirtualMultiArray has only two auto-tuning related parameters: memMult and mem.

If the memMult parameter is given the output of benchmarking:

PcieBandwidthBenchmarker bench;

VirtualMultiArray<Obj> data(... bench.bestBandwidth(2),VirtualMultiArray<Obj>::MemMult::UseDefault);

it reduces the combined capacity of the virtual array. Similarly, when the distribution is optimized for maximum capacity, the overall bandwidth drops. Even the default distribution (when the last two parameters are not given by the user) does not handle any overflow or slow-card cases.

Virtual Array = 10 GB

Card                   pci-e bridge spec   maximized bandwidth   maximized capacity   default
GTX1080ti(11GB-vram)   2.0 16x             8GB                   8.5GB                5GB
GT1030   (2GB-vram)    2.0 4x              2GB (overflow?)       1.5GB                5GB (error)
combined capacity                          error                 10GB                 error
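
The numbers in the table above can be reproduced with a small standalone sketch. It is not part of VirtualMultiArray: splitByWeights and overflowingCards are hypothetical helper names, and the sketch assumes that each card receives a share of the virtual array proportional to a per-card weight (1:1 for the default, 4:1 for a 16x versus 4x bridge, 8.5:1.5 for the capacity-oriented split). The usable VRAM per card is passed in by the caller, since the amount actually free on a card is usually lower than its nominal size.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper (not a VirtualMultiArray API): splits totalGB across
// cards in proportion to the given weights.
std::vector<double> splitByWeights(double totalGB, const std::vector<double>& weights)
{
    assert(!weights.empty());
    const double sum = std::accumulate(weights.begin(), weights.end(), 0.0);
    assert(sum > 0.0);

    std::vector<double> sharesGB;
    sharesGB.reserve(weights.size());
    for (double w : weights)
        sharesGB.push_back(totalGB * w / sum);
    return sharesGB;
}

// Hypothetical helper: returns the indices of cards whose share is larger
// than the usable VRAM reported for that card.
std::vector<std::size_t> overflowingCards(const std::vector<double>& sharesGB,
                                          const std::vector<double>& usableVramGB)
{
    assert(sharesGB.size() == usableVramGB.size());

    std::vector<std::size_t> overflowing;
    for (std::size_t i = 0; i < sharesGB.size(); ++i)
        if (sharesGB[i] > usableVramGB[i])
            overflowing.push_back(i);
    return overflowing;
}
```

For the 10 GB case, splitByWeights(10.0, {1.0, 1.0}) gives 5 GB per card, and overflowingCards flags the GT1030 (index 1) when the VRAM sizes are {11.0, 2.0}. With the weights {8.5, 1.5} no card is flagged. A check like this before allocation is one way the overflow cases could be detected early.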

Virtual Array = 4 GB

Card                   pci-e bridge spec   maximized bandwidth   maximized capacity    default
GTX1080ti(11GB-vram)   2.0 16x             4GB/s                 3.3GB/s               1GB/s
GT1030   (2GB-vram)    2.0 4x              1GB/s                 0.7GB/s               1GB/s (possible overflow!)
combined bandwidth                         5GB/s                 4GB/s                 2GB/s

The efficiency of symmetrical or optimized GPU setups (with the bigger card plugged into the faster bridge) may not be affected much.