Closed
Description
Stagger calls to __init__() by default when processes_per_replica > 1.
Motivation
If __init__() requires more memory than the API needs at its steady state, staggering the calls to __init__() will reduce the peak memory consumption, potentially avoiding OOM errors and/or container evictions.
Short-term workaround
The user can use a file lock (our implementation or this pip package). If using ours, it can be imported directly, or the file can be copied into the predictor's directory (since the import path could change in the future). Here is an example of importing the file lock from Cortex:
from cortex.lib.storage.concurrency import FileLock

class PythonPredictor:
    def __init__(self, config):
        with FileLock("/init.lock"):
            # init code goes here
            pass
Implementation
A file lock can be used to achieve this (see the workaround above).
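For illustration, the file-lock approach could be sketched with only the Python standard library. This is a minimal sketch, not the Cortex FileLock implementation: init_lock and the init_lock_path config key are hypothetical names introduced here, and the code assumes a Unix host where fcntl.flock is available.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def init_lock(path):
    """Hold an exclusive advisory lock on `path` while the with-block runs."""
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        # Blocks until no other process holds the lock on this file.
        fcntl.flock(fd, fcntl.LOCK_EX)
        yield
    finally:
        # Closing the descriptor also releases the lock.
        os.close(fd)


class PythonPredictor:
    def __init__(self, config):
        # "init_lock_path" is a hypothetical config key used only in this sketch.
        lock_path = config.get("init_lock_path", "/tmp/init.lock")
        with init_lock(lock_path):
            # Memory-heavy setup goes here; only one process runs it at a time.
            self.ready = True
```

Because every process on the replica opens the same lock file, each __init__() waits for the previous one to finish, so the memory peaks of initialization happen one after another instead of all at once. The lock is advisory, so it only serializes processes that take it through the same path.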