Make `Endpoint.predict` method async #1998
Labels
api: vertex-ai
Issues related to the googleapis/python-aiplatform API.
type: feature request
‘Nice-to-have’ improvement, new feature or different behavior or design.
Problem
I want to request predictions on my image classifier endpoint. Since there is a limit of 1.5 MB per request, if I want to get predictions for several images I have to do the following:
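The code block here did not survive extraction. A minimal sketch of the sequential pattern being described, using a local stand-in for `google.cloud.aiplatform.Endpoint` (the class name `StubEndpoint`, the batching helper, and the sample data are all hypothetical, added so the example runs without GCP credentials):

```python
import json

class StubEndpoint:
    """Local stand-in for google.cloud.aiplatform.Endpoint (illustration only)."""
    def predict(self, instances):
        # The real method sends one synchronous HTTP request per call.
        return [len(inst["pixels"]) for inst in instances]

MAX_REQUEST_BYTES = 1_500_000  # Vertex AI online-prediction payload limit (~1.5 MB)

def batched(instances, max_bytes=MAX_REQUEST_BYTES):
    """Yield lists of instances whose serialized size stays under the limit."""
    batch, size = [], 0
    for inst in instances:
        inst_size = len(json.dumps(inst))
        if batch and size + inst_size > max_bytes:
            yield batch
            batch, size = [], 0
        batch.append(inst)
        size += inst_size
    if batch:
        yield batch

endpoint = StubEndpoint()
images = [{"pixels": "x" * 600_000} for _ in range(5)]  # ~600 KB each

predictions = []
for batch in batched(images):  # each batch fits in one request
    # Blocking call: the next batch is not sent until this one returns.
    predictions.extend(endpoint.predict(instances=batch))
print(len(predictions))
```

Because each `predict` call blocks until its response arrives, the batches are processed strictly one after another.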
But obviously, this way I cannot benefit from having multiple replicas, for example a deployed model with `min_replica_count=2`. So I change to this:

Workaround
I can solve this by changing the `endpoint.predict` line to:

But I think there should be an `async_predict` method, or maybe there should be a parameter `sync: bool`
that would make the call blocking or non-blocking depending on the parameter.
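The workaround snippet referenced above ("changing the `endpoint.predict` line to:") is also missing from this copy. Based on the description, a plausible sketch is to push each blocking `predict` call onto a thread via `asyncio`, so several requests are in flight at once and can be spread across replicas. `StubEndpoint` and the sample batches are hypothetical stand-ins so the example runs anywhere:

```python
import asyncio
import time

class StubEndpoint:
    """Local stand-in for google.cloud.aiplatform.Endpoint (illustration only)."""
    def predict(self, instances):
        time.sleep(0.1)  # simulate one blocking HTTP round trip
        return [f"label-{i}" for i, _ in enumerate(instances)]

async def predict_batch(endpoint, batch):
    loop = asyncio.get_running_loop()
    # endpoint.predict is synchronous; run it in the default thread-pool
    # executor so the event loop is not blocked and multiple requests
    # overlap in time.
    return await loop.run_in_executor(
        None, lambda: endpoint.predict(instances=batch)
    )

async def main():
    endpoint = StubEndpoint()
    batches = [["img"] * 2 for _ in range(4)]
    results = await asyncio.gather(
        *(predict_batch(endpoint, b) for b in batches)
    )
    # Flatten per-batch results into one prediction list.
    return [pred for batch_result in results for pred in batch_result]

predictions = asyncio.run(main())
print(len(predictions))
```

With a native `async_predict` (or a `sync=False` parameter), the `run_in_executor` indirection would be unnecessary and the coroutine could simply await the call.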