Description
The following flame graphs are based on traces of {MobileNetv2, Resnet50v2, DirectMLSuperResolution}.
Issue 1. Creating a large tensor resource takes 2x the wall-time due to a re-attempt (intel/GPGMM#180).
The first allocation attempts sub-allocation (blue), followed by a direct allocation (red-ish) that is about 2x slower. We should check the tensor size ahead of time and directly allocate it, rather than attempt to sub-allocate and then fall back.
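A minimal sketch of the idea (hypothetical names and threshold, not GPGMM's actual API): decide the allocation path up front from the resource size, so a doomed sub-allocation attempt is never made.

```cpp
#include <cstddef>

// Hypothetical threshold: resources larger than this cannot fit in a
// shared heap and should be directly allocated from the start.
constexpr size_t kMaxSubAllocatedSize = 4 * 1024 * 1024;  // 4MB

enum class AllocationMethod { kSubAllocated, kDirect };

// Choose the allocation path before attempting it, instead of trying
// sub-allocation first and falling back (paying both costs) on failure.
AllocationMethod ChooseAllocationMethod(size_t resourceSize) {
    return (resourceSize > kMaxSubAllocatedSize)
               ? AllocationMethod::kDirect
               : AllocationMethod::kSubAllocated;
}
```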
Issue 2. Memory containing resources being created on-demand (intel/GPGMM#110).
The biggest cost of tensor creation is creating the heap. We should pre-fetch the next heap so a subsequent request is fulfilled before it is made.
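A rough sketch of the pre-fetch idea, using a stand-in `Heap` type and a hypothetical pool class (not GPGMM's implementation): the pool always keeps one heap created ahead of demand, so `Acquire()` never pays heap-creation cost on the request path.

```cpp
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical stand-in for a GPU heap; creating it is the expensive step.
struct Heap {
    size_t size;
};

// Keeps the next heap pre-created so a subsequent allocation request is
// fulfilled immediately instead of creating memory on demand.
class PrefetchingHeapPool {
  public:
    explicit PrefetchingHeapPool(size_t heapSize) : mHeapSize(heapSize) {
        Prefetch();  // create the first heap before any request arrives
    }

    std::unique_ptr<Heap> Acquire() {
        std::unique_ptr<Heap> heap = std::move(mNextHeap);
        Prefetch();  // immediately begin preparing the next heap
        return heap;
    }

  private:
    void Prefetch() {
        // A real allocator would issue this on a background thread so
        // creation overlaps with other work; done inline here for brevity.
        mNextHeap = std::make_unique<Heap>(Heap{mHeapSize});
    }

    size_t mHeapSize;
    std::unique_ptr<Heap> mNextHeap;
};
```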
Issue 3. Tensors under-utilizing memory.
Tensor allocations are 1MB whereas the memory they occupy is 4MB (4x waste). My goal is to move away from fixed/default memory sizes like we had in Dawn (now called PreferredResourceHeapSize) and to grow (or increase) them dynamically instead (intel/GPGMM#182).
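One way to grow sizes dynamically, sketched with hypothetical names (this is an illustration of a geometric growth policy, not the design chosen in intel/GPGMM#182): start heaps small and double the size on each new heap up to a cap, so a workload of 1MB tensors does not strand 3MB of every 4MB heap.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical policy: instead of a fixed/default heap size, grow the
// next heap size geometrically, capped at a maximum. Small workloads
// get small heaps; larger workloads converge on larger heaps.
class GrowingHeapSizePolicy {
  public:
    GrowingHeapSizePolicy(size_t initialSize, size_t maxSize)
        : mNextSize(initialSize), mMaxSize(maxSize) {}

    // Returns the heap size to use for this request and doubles the
    // size used for the next heap, up to the cap. Requests larger than
    // the current size are satisfied exactly.
    size_t NextHeapSize(size_t requestSize) {
        size_t size = std::max(mNextSize, requestSize);
        mNextSize = std::min(size * 2, mMaxSize);
        return size;
    }

  private:
    size_t mNextSize;
    size_t mMaxSize;
};
```

With an initial size of 1MB, a stream of 1MB requests produces heaps of 1MB, 2MB, 4MB, ... so early waste is bounded while steady-state heaps remain large enough to amortize creation cost.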


