UNCACHED memory allocation in Level Zero - Performance 

This issue is a suggestion based on our findings. 

Some of the Level Zero examples use the `UNCACHED` flag: https://spec.oneapi.io/level-zero/latest/core/api.html#_CPPv438ZE_DEVICE_MEM_ALLOC_FLAG_BIAS_UNCACHED

For example here:
https://github.com/intel/compute-runtime/blob/master/level_zero/core/test/black_box_tests/zello_timestamp.cpp#L102

In [TornadoVM](https://github.com/beehive-lab/TornadoVM), we ran many experiments comparing `CACHED` and `UNCACHED` allocations and found that the `CACHED` version is up to 4x faster than the `UNCACHED` one. As I understand it, the `UNCACHED` flag is meant for buffers that are streamed once and not reused, which leaves space in the GPU's cache for other, reusable buffers. Unfortunately, the Level Zero documentation does not warn about this performance difference. In our experience, this is very error-prone: we spent our time analyzing the number of threads and thread blocks deployed rather than how memory was allocated.

Would it make sense to add documentation to the Level Zero examples that covers this, including the situations in which developers should use the `UNCACHED` versus the `CACHED` flag?
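
As a rough illustration of the kind of guidance I have in mind, here is a minimal sketch of choosing the flag per buffer. The helper `makeDeviceAllocDesc` and its `reusedByKernels` parameter are hypothetical names of mine, and the selection rule only reflects my understanding described above, not anything the spec states. The fallback declarations exist only so the snippet compiles without the Level Zero headers installed; a real build should rely on `ze_api.h`.

```cpp
#include <cstdint>

#if __has_include(<level_zero/ze_api.h>)
#include <level_zero/ze_api.h>
#else
// Stand-in declarations (assumption: they mirror ze_api.h closely enough for
// illustration). Real code should always use the official header instead.
typedef uint32_t ze_device_mem_alloc_flags_t;
enum ze_structure_type_t { ZE_STRUCTURE_TYPE_DEVICE_MEM_ALLOC_DESC = 0x15 };
enum ze_device_mem_alloc_flag_t {
    ZE_DEVICE_MEM_ALLOC_FLAG_BIAS_CACHED = 1u << 0,
    ZE_DEVICE_MEM_ALLOC_FLAG_BIAS_UNCACHED = 1u << 1
};
struct ze_device_mem_alloc_desc_t {
    ze_structure_type_t stype;
    const void *pNext;
    ze_device_mem_alloc_flags_t flags;
    uint32_t ordinal;
};
#endif

// Hypothetical helper: builds a device allocation descriptor.
// Buffers that kernels read or write repeatedly get the CACHED bias;
// buffers that are streamed once get the UNCACHED bias.
inline ze_device_mem_alloc_desc_t makeDeviceAllocDesc(bool reusedByKernels) {
    ze_device_mem_alloc_desc_t desc = {};
    desc.stype = ZE_STRUCTURE_TYPE_DEVICE_MEM_ALLOC_DESC;
    desc.pNext = nullptr;
    desc.flags = reusedByKernels ? ZE_DEVICE_MEM_ALLOC_FLAG_BIAS_CACHED
                                 : ZE_DEVICE_MEM_ALLOC_FLAG_BIAS_UNCACHED;
    desc.ordinal = 0;
    return desc;
}
```

The resulting descriptor would then be passed as the second argument to `zeMemAllocDevice(context, &desc, size, alignment, device, &ptr)`. Whether `CACHED` is actually the better default will likely depend on the device and the access pattern, so measuring both variants is still advisable.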

BTW, I am not sure whether this is the right repo for this issue or whether I should also open one in Level Zero.




