UNCACHED memory allocation in Level Zero - Performance  #515

Closed
@jjfumero

Description

This issue is a suggestion based on our findings.

Some of the Level Zero examples use the UNCACHED flag: https://spec.oneapi.io/level-zero/latest/core/api.html#_CPPv438ZE_DEVICE_MEM_ALLOC_FLAG_BIAS_UNCACHED

For example here:
https://github.com/intel/compute-runtime/blob/master/level_zero/core/test/black_box_tests/zello_timestamp.cpp#L102

In TornadoVM, we ran many experiments comparing CACHED and UNCACHED allocations, and the CACHED version was up to 4x faster than the UNCACHED one. As I understand it, the UNCACHED flag is meant for buffers that are streamed once and not reused, which leaves space in the GPU's cache for other, reusable buffers. Unfortunately, the Level Zero documentation does not warn about the performance impact. In our experience, this makes the flag very error-prone: we spent our time analyzing the number of threads and thread blocks deployed rather than how the memory was allocated.

Does it make sense to add some documentation to the Level Zero examples covering this, including the situations in which developers should use the UNCACHED flag rather than the CACHED one?

BTW, I am not sure whether this is the right repo for this issue, or whether I should also open one in the Level Zero repo.
