Large latency on Tensor allocation #313

Closed
@nebulorum

Description

Describe the documentation issue

I am not sure this is the correct forum, but I would like some guidance on how to set up sessions and manage resources.

After two weeks of trying to understand why latencies in 0.3.1 were completely uncontrollable (compared to the official 1.3.1), I ran into #208. It matches my observations.

We are trying to run predictions on models that take thousands of data points, each in a separate tensor, per prediction. Memory allocation on the threads is in the 20 MB/s range, and there seems to be synchronization between the JavaCPP allocation thread and our worker threads. In addition, allocating a tensor of size 1 seems to be very slow (in the 7 ms range).

After reading #208, it seems we are doing everything wrong, but I don't have a clear picture of how it should be done. Would EagerSession help? Could I use a Session per HTTP request? Should I allocate larger multi-dimensional tensors instead of many single-value ones? How should I configure thread pools? I understand that the API is a work in progress, but the current documentation is very light on this kind of guidance.
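To make the batching question concrete, here is a minimal sketch of what I mean by one larger tensor instead of many size-1 tensors. `FeatureBatcher` is a hypothetical helper, not part of TensorFlow Java, and it assumes the model could accept a single rank-1 float input. Whether this actually reduces the allocation latency is exactly what I am asking.

```java
import java.util.List;

/**
 * Hypothetical helper (not part of TensorFlow Java): packs many scalar
 * values into one contiguous array so that a single native allocation
 * could replace thousands of size-1 tensor allocations.
 */
final class FeatureBatcher {
    private FeatureBatcher() {}

    /** Shape of the packed batch: one dimension holding all values. */
    static long[] shapeOf(List<Float> values) {
        return new long[] { values.size() };
    }

    /** Copies all scalar values into a single flat array, preserving order. */
    static float[] pack(List<Float> values) {
        float[] flat = new float[values.size()];
        for (int i = 0; i < flat.length; i++) {
            flat[i] = values.get(i);
        }
        return flat;
    }
}
```

The flat array and its shape would then back one tensor per prediction rather than one tensor per data point. This only works if the graph's inputs can be changed to take the packed form.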

I don't think this is a bug, but I can convert it into some other sort of issue.
