Open
Description
To optimize forward-pass latency, it would be good to schedule GC to run in between model executions. This won't improve QPS, since the amortized GC cost is the same, but it would lower the per-batch latency.
```python
import gc
gc.collect()
```
We should spin up a background thread that periodically iterates over all of the interpreter threads, locks each one between executions, and runs the GC. It may also be worth explicitly disabling automatic GC on the individual interpreter threads so collections can't trigger during a forward pass.
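A minimal single-process sketch of this idea, assuming a hypothetical `run_forward_pass` standing in for the model execution: automatic GC is disabled, each forward pass holds a lock, and a background thread runs `gc.collect()` only while holding that same lock, so collection happens strictly between executions. (In a real multi-interpreter setup each interpreter would have its own GC state and its own lock; this sketch only shows the scheduling pattern.)

```python
import gc
import threading
import time

# Lock serializing forward passes against manual collections.
execution_lock = threading.Lock()
stop_event = threading.Event()

def gc_daemon(interval_s=0.1):
    """Periodically run a full collection, only between model executions."""
    while not stop_event.is_set():
        time.sleep(interval_s)
        with execution_lock:  # no forward pass can run while we collect
            gc.collect()

def run_forward_pass(batch):
    """Hypothetical stand-in for a model execution; GC cannot run inside."""
    with execution_lock:
        return [x * 2 for x in batch]  # placeholder for model(batch)

gc.disable()  # automatic collections won't trigger during forward passes
daemon = threading.Thread(target=gc_daemon, daemon=True)
daemon.start()

out = run_forward_pass([1, 2, 3])

stop_event.set()
gc.enable()
```

Note that `gc.disable()` is process-wide in CPython, not per-thread, which is part of why per-interpreter control would be needed for the scheme described above.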
Context: