
[DML] Optimizations to improve DML memory utilization #221

@bbernhar

Description


The following flame graphs are based on traces of MobileNetV2, ResNet50V2, and DirectMLSuperResolution.

Issue 1. Creating a large tensor resource takes 2x the wall-time due to a re-attempt (intel/GPGMM#180).

[flame graph: sub-allocation attempt followed by direct allocation]

The first allocation attempts sub-allocation (blue), followed by a direct allocation (red-ish), making the whole operation about 2x slower. We should check the tensor size ahead of time and directly allocate when the tensor is too large to sub-allocate, rather than attempting to sub-allocate and then falling back.

Issue 2. Memory backing resources is created on-demand (intel/GPGMM#110).

[flame graph: heap creation dominating tensor creation time]

The biggest cost of tensor creation is creating the heap. We should pre-fetch the next heap so a subsequent request is fulfilled immediately instead of waiting on heap creation.

Issue 3. Tensors under-utilizing memory.

Tensor allocations are 1MB whereas the memory they occupy is 4MB (4x waste). My goal is to move away from fixed/default memory sizes like we had in Dawn (now called PreferredResourceHeapSize) and to grow (or increase) them dynamically instead (intel/GPGMM#182).

[trace screenshot: 1MB tensor allocations inside 4MB heaps]

FYI, @fujunwei, @huningxin @RafaelCintron
