📚 The doc issue
I set the value of default_response_timeout to 4, i.e. 4 seconds. At the start of the model load, this happens after roughly 4 seconds:
org.pytorch.serve.wlm.WorkerInitializationException: Backend worker did not respond in given time
My guess is that the model takes more than 4 seconds to load, so the worker gets killed before loading finishes.
Is there a way to set a larger initial delay, i.e. to differentiate these two scenarios:
- account for the initial model load with a timeout different from default_response_timeout
- if the model doesn't respond within default_response_timeout after the initial load, then kill the worker
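
To make the request concrete, here is a rough sketch of the behaviour I have in mind. It is only an illustration: `WorkerTimeouts`, `startup_timeout`, `limit_for` and `should_kill` are hypothetical names, not existing TorchServe options or APIs, and the 60-second startup value is an arbitrary example.

```python
from dataclasses import dataclass


@dataclass
class WorkerTimeouts:
    # Hypothetical setting: seconds allowed for the initial model load.
    startup_timeout: float
    # Plays the role of default_response_timeout: seconds allowed per
    # response once the model has finished loading.
    response_timeout: float

    def limit_for(self, model_loaded: bool) -> float:
        """Return the timeout that applies in the worker's current phase."""
        return self.response_timeout if model_loaded else self.startup_timeout

    def should_kill(self, model_loaded: bool, elapsed_seconds: float) -> bool:
        """True if the worker has exceeded the timeout for its current phase."""
        return elapsed_seconds > self.limit_for(model_loaded)


# Example values: a generous load window, a strict 4-second response window.
timeouts = WorkerTimeouts(startup_timeout=60.0, response_timeout=4.0)
```

With these example values, a worker that is still loading after 10 seconds would be left alone, while a loaded worker that takes 10 seconds to respond would be killed.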
Suggest a potential alternative/fix
No response