I/O Tensors are dynamically allocated #39

@Victor-Jung

Description

I/O tensors are allocated in the InitNetwork function and never deallocated, so they effectively have an infinite lifetime. Since I/O tensor dimensions are known at compile time, we should be able to plan their allocation and deallocation statically.

As a first step, we should display them in the memory allocation visualization and raise an error if they push usage above the memory capacity limit. After that, we should perform static memory allocation for them.

This is especially an issue for Llama, where the KV cache is treated as an input and then as an output.
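
To make the first step concrete, here is a rough sketch of the kind of capacity check I have in mind. It is not the project's actual allocator: `Buffer`, `peak_usage`, `check_capacity`, the step indices, and all sizes are hypothetical names and numbers made up for illustration. It only shows how counting I/O tensors with an infinite lifetime inflates the peak compared to giving them a finite lifetime.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Buffer:
    """Hypothetical buffer descriptor (not the project's real data structure)."""
    name: str
    size: int       # size in bytes
    first_use: int  # first step at which the buffer must be live
    last_use: int   # last step at which the buffer must be live (inclusive)


def peak_usage(buffers: List[Buffer], num_steps: int) -> int:
    """Return the maximum number of bytes live at any single step."""
    usage = [0] * num_steps
    for buf in buffers:
        for step in range(buf.first_use, buf.last_use + 1):
            usage[step] += buf.size
    return max(usage, default=0)


def check_capacity(buffers: List[Buffer], num_steps: int, capacity: int) -> int:
    """Raise MemoryError if the peak usage exceeds the capacity, else return the peak."""
    peak = peak_usage(buffers, num_steps)
    if peak > capacity:
        raise MemoryError(f"peak usage {peak} B exceeds capacity {capacity} B")
    return peak


NUM_STEPS = 3
CAPACITY = 8192  # assumed memory capacity in bytes

# Current behaviour: I/O tensors stay live for the whole network execution.
infinite_lifetime = [
    Buffer("kv_cache_in", 4096, 0, NUM_STEPS - 1),
    Buffer("activation", 2048, 1, 2),
    Buffer("kv_cache_out", 4096, 0, NUM_STEPS - 1),
]

# Desired behaviour: I/O tensors are live only while they are actually needed.
finite_lifetime = [
    Buffer("kv_cache_in", 4096, 0, 1),
    Buffer("activation", 2048, 1, 2),
    Buffer("kv_cache_out", 4096, 2, 2),
]

if __name__ == "__main__":
    print("finite lifetimes peak:", check_capacity(finite_lifetime, NUM_STEPS, CAPACITY))
    try:
        check_capacity(infinite_lifetime, NUM_STEPS, CAPACITY)
    except MemoryError as err:
        print("infinite lifetimes:", err)
```

With these made-up numbers, the finite-lifetime schedule fits in the assumed capacity while the infinite-lifetime one does not, which is the situation the KV cache case runs into.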

Metadata

    Labels

    Bug: Something isn't working
