Auto tuning related limitations
Hüseyin Tuğrul BÜYÜKIŞIK edited this page Feb 8, 2021 · 6 revisions
Currently, the constructor of VirtualMultiArray has only two auto-tuning related parameters: memMult and mem.
If the memMult parameter is given the output of benchmarking:
PcieBandwidthBenchmarker bench;
VirtualMultiArray<Obj> data(... bench.bestBandwidth(2),VirtualMultiArray<Obj>::MemMult::UseDefault);
it reduces the combined capacity of the virtual array. Similarly, when the distribution is optimized for maximum capacity, the overall bandwidth drops. Even the default distribution (when the user does not give the last two parameters) does not handle any overflow or slow cases.
Virtual Array = 10 GB
Card                    pci-e bridge spec    maximized bandwidth    maximized capacity    default
GTX1080ti (11GB-vram)   2.0 16x              8GB                    8.5GB                 5GB
GT1030 (2GB-vram)       2.0 4x               2GB (overflow?)        1.5GB                 5GB (error)
combined capacity       -                    error                  10GB                  error
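The capacity figures above can be reproduced with a small stand-alone sketch. It assumes the array is split across cards in proportion to integer weights, which play the role of memMult here. The names Share and splitByWeights are hypothetical helpers for illustration and are not part of VirtualMultiArray. A share equal to a card's full VRAM is also flagged, on the assumption that a card cannot give all of its memory to the array.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper type, not part of VirtualMultiArray: one card's share of the array.
struct Share {
    double gigabytes;  // portion of the virtual array assigned to this card
    bool overflows;    // true when the portion does not fit into the card's VRAM
};

// Split totalGB across cards in proportion to integer weights (assumed to act like memMult)
// and flag every card whose share reaches or exceeds its VRAM size.
std::vector<Share> splitByWeights(double totalGB,
                                  const std::vector<int>& weights,
                                  const std::vector<double>& vramGB) {
    assert(weights.size() == vramGB.size());
    const int sum = std::accumulate(weights.begin(), weights.end(), 0);
    assert(sum > 0);

    std::vector<Share> shares;
    for (std::size_t i = 0; i < weights.size(); ++i) {
        const double gb = totalGB * weights[i] / sum;
        // Assumption: a share equal to the whole VRAM already counts as overflow,
        // because some VRAM is normally unavailable to a single application.
        shares.push_back({gb, gb >= vramGB[i]});
    }
    return shares;
}
```

With totalGB = 10 and VRAM sizes {11, 2}, weights {4, 1} (16x versus 4x lanes) give 8GB and 2GB, with the GT1030 flagged. Weights {11, 2} (proportional to VRAM) give about 8.5GB and 1.5GB with no overflow. Equal weights {1, 1} give 5GB each, which overflows the GT1030.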
Virtual Array = 4 GB
Card                    pci-e bridge spec    maximized bandwidth    maximized capacity    default
GTX1080ti (11GB-vram)   2.0 16x              4GB/s                  3.3GB/s               1GB/s
GT1030 (2GB-vram)       2.0 4x               1GB/s                  0.7GB/s               1GB/s (possible overflow!)
combined bandwidth      -                    5GB/s                  4GB/s                 2GB/s
The efficiency of symmetrical GPU setups, or of optimized setups where the bigger card is plugged into the faster bridge instead, may not be affected much.