Description
If a timeout is configured, then when a request is taken off the queue for processing and its time spent on the queue exceeds the configured timeout, the request would not be processed and an error would be returned (a 503 error is probably most appropriate).
This timeout could also be applied while the request is still waiting in the queue, or once the request is already being processed. These cases could be split into separate issue(s).
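A rough sketch of the dequeue-time check, to make the proposal concrete. This is not Cortex's implementation: `QueuedRequest`, `process_next`, and the `queue_timeout` parameter are hypothetical names, and a plain `queue.Queue` stands in for whatever queue the API server actually uses.

```python
import queue
import time
from dataclasses import dataclass, field
from typing import Any, Callable, Optional, Tuple


@dataclass
class QueuedRequest:
    """Hypothetical wrapper that records when a request entered the queue."""
    payload: Any
    enqueued_at: float = field(default_factory=time.monotonic)


def process_next(
    q: "queue.Queue[QueuedRequest]",
    handler: Callable[[Any], Any],
    queue_timeout: Optional[float],
    now: Callable[[], float] = time.monotonic,
) -> Tuple[int, Any]:
    """Take one request off the queue; skip it if it waited too long.

    Returns a (status_code, body) pair. `queue_timeout` is in seconds;
    None means no timeout is configured.
    """
    item = q.get_nowait()
    waited = now() - item.enqueued_at
    if queue_timeout is not None and waited > queue_timeout:
        # The request is dropped without being processed.
        return 503, "request timed out while waiting in the queue"
    return 200, handler(item.payload)
```

Note that this only checks the wait time at the moment the request is dequeued; a request that has already exceeded the timeout still occupies a queue slot until a worker reaches it.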
Timing out requests that are already in progress might be possible with async (e.g. via this). Question: assuming we figure out how to cap the thread pool when using async (which should be possible), would using async have other unintended consequences? Here is FastAPI's discussion.
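For the in-progress case, one possible shape is wrapping the handler in `asyncio.wait_for`. This is only a sketch under assumptions: `run_with_timeout` is a hypothetical helper, the handler is assumed to be a coroutine function, and it is not necessarily the approach the link above refers to.

```python
import asyncio
from typing import Any, Awaitable, Callable, Tuple


async def run_with_timeout(
    handler: Callable[[Any], Awaitable[Any]],
    payload: Any,
    timeout: float,
) -> Tuple[int, Any]:
    """Run an async handler, returning a 503 if it exceeds `timeout` seconds."""
    try:
        # wait_for cancels the handler's coroutine when the timeout expires.
        result = await asyncio.wait_for(handler(payload), timeout=timeout)
    except asyncio.TimeoutError:
        return 503, "request timed out during processing"
    return 200, result
```

One caveat relevant to the thread pool question: cancellation only takes effect at an `await` point. If the handler offloads blocking work to a thread (for example via `loop.run_in_executor`), the caller gets the 503 but the thread keeps running until the work finishes, so it continues to occupy a pool slot.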
Relevant question on gitter: https://gitter.im/cortexlabs/cortex?at=5fe1eea0dbb17f28c59329b0
May be related to #1453