Sherif akoush/quickfix/get names from metadata endpoint #393
@@ -46,3 +46,6 @@ mlruns

 # Sphinx documentation
 docs/_build/
+
+# alibi .data
+runtimes/alibi-explain/tests/.data/**
@@ -1 +1 @@
-__version__ = "0.6.0.dev3"
+__version__ = "0.6.0.dev4"
@@ -1 +1 @@
-__version__ = "0.6.0.dev3"
+__version__ = "0.6.0.dev4"
@@ -1,6 +1,7 @@
 import asyncio
 import contextvars
 import functools
+import re
 from asyncio import AbstractEventLoop
 from importlib import import_module
 from typing import Any, Optional, Type, Callable, Awaitable, Union, List

@@ -16,8 +17,13 @@
     InferenceResponse,
     InferenceRequest,
     Parameters,
+    MetadataModelResponse,
 )

+_DEFAULT_ID_NAME = "explain_inference"
+
+_DEFAULT_INPUT_NAME = "predict"
+
 EXPLAINER_TYPE_TAG = "explainer_type"

 _MAX_RETRY_ATTEMPT = 3

@@ -41,7 +47,7 @@ def convert_from_bytes(output: ResponseOutput, ty: Optional[Type]) -> Any:
     return literal_eval(py_str)


-# TODO: add retry
+# TODO: add retry and better exceptions handling
 def remote_predict(
     v2_payload: InferenceRequest, predictor_url: str
 ) -> InferenceResponse:

@@ -51,6 +57,19 @@ def remote_predict(
     return InferenceResponse.parse_raw(response_raw.text)


+def remote_metadata(url: str) -> MetadataModelResponse:
+    """Get metadata from v2 endpoint"""
+    response_raw = requests.get(url)
+    if response_raw.status_code != 200:
+        raise RemoteInferenceError(response_raw.status_code, response_raw.reason)
+    return MetadataModelResponse.parse_raw(response_raw.text)
+
+
+def construct_metadata_url(infer_url: str) -> str:
+    """Construct v2 metadata endpoint from v2 infer endpoint"""
+    return re.sub(r"/infer$", "", infer_url)
+
+
 # TODO: this is very similar to `asyncio.to_thread` (python 3.9+),
 # so lets use it at some point.
 def execute_async(
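As a rough illustration of how the two new helpers compose (the model URL below is made up for illustration, and the import path is an assumption about where these helpers live):

```python
# Hypothetical usage sketch; URL and import path are assumptions.
# from mlserver_alibi_explain.common import construct_metadata_url, remote_metadata

infer_url = "http://localhost:8080/v2/models/income-classifier/infer"

# Strips the trailing "/infer" to get the v2 model metadata endpoint.
metadata_url = construct_metadata_url(infer_url)
# -> "http://localhost:8080/v2/models/income-classifier"

# Blocking GET; raises RemoteInferenceError on any non-200 response.
metadata = remote_metadata(metadata_url)
print(metadata.name, [inp.name for inp in metadata.inputs or []])
```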
@@ -82,17 +101,49 @@ def import_and_get_class(class_path: str) -> type:
     return klass


-def to_v2_inference_request(input_data: Union[np.ndarray, List[str]]):
+def to_v2_inference_request(
+    input_data: Union[np.ndarray, List[str]],
+    metadata: Optional[MetadataModelResponse],
+) -> InferenceRequest:
+    """
+    Encode numpy payload to v2 protocol.
+
+    Note: We only fetch the first-input name and the list of outputs from the metadata
+    endpoint currently. We should consider wider reconciliation with data types etc.
+
+    Parameters
+    ----------
+    input_data
+        Numpy ndarray to encode
+    metadata
+        Extra metadata that can help encode the payload.
+    """
+
+    # MLServer does not really care about a correct input name!
+    input_name = _DEFAULT_INPUT_NAME
+    id_name = _DEFAULT_ID_NAME
+    outputs = []
+
+    if metadata is not None:
+        if metadata.inputs:
+            # we only support a big single input numpy
+            input_name = metadata.inputs[0].name
Review comment: Do we also need to pass on the datatype, shape and parameters from the metadata?

Reply: The type and shape are inferred from the payload. I think we can add a warning if they don't match? I didn't want to get into converting the payload according to the metadata as part of this ticket.
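A minimal sketch of the warning suggested in the reply above; the helper name and the treatment of -1 as a wildcard dimension are assumptions, not part of this PR:

```python
import logging

import numpy as np

logger = logging.getLogger(__name__)


def _warn_on_metadata_mismatch(payload: np.ndarray, declared_shape) -> None:
    # Hypothetical helper: warn (rather than convert) when the payload shape
    # disagrees with what the metadata declares; -1 is treated as a wildcard.
    if declared_shape is None:
        return
    mismatch = len(declared_shape) != payload.ndim or any(
        dim != -1 and dim != actual
        for dim, actual in zip(declared_shape, payload.shape)
    )
    if mismatch:
        logger.warning(
            "Payload shape %s does not match metadata shape %s",
            payload.shape,
            list(declared_shape),
        )
```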
+
+        if metadata.outputs:
+            outputs = metadata.outputs
+
     # For List[str] (e.g. AnchorText), we use StringCodec for input
     input_payload_codec = StringCodec if type(input_data) == list else NumpyCodec
     v2_request = InferenceRequest(
+        id=id_name,
Review comment: I'm not 100% sure, but I think the executor will set this to a unique value. Therefore, should we leave the `id` unset?

Reply: I am not sure about SC; maybe this is the case. However, locally it will not work with a None id (for the Triton case).
         parameters=Parameters(content_type=input_payload_codec.ContentType),
+        # TODO: we probably need to tell alibi about the expected types to use
+        # or even whether it is a probability of classes or targets etc
         inputs=[
             input_payload_codec.encode_request_input(  # type: ignore
-                name="predict", payload=input_data
+                name=input_name, payload=input_data
             )
         ],
-        # set outputs as empty list, this will return everything?
Review comment: I don't think that the protocol itself defines what should happen, so I guess that the behaviour will depend on the model and model runtime.

Reply: Actually, this comment is not valid anymore, as we now get the outputs from the metadata.
-        outputs=[],
+        outputs=outputs,
     )
     return v2_request
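Putting the pieces together, the new call path could look roughly like this; the fallback to `None` metadata on a failed metadata call is a sketch of one option, not behaviour defined in this diff:

```python
import numpy as np

infer_url = "http://localhost:8080/v2/models/income-classifier/infer"  # made-up URL

try:
    metadata = remote_metadata(construct_metadata_url(infer_url))
except RemoteInferenceError:
    # Without metadata, to_v2_inference_request falls back to the
    # default "predict" input name and an empty outputs list.
    metadata = None

v2_request = to_v2_inference_request(np.array([[1.0, 2.0, 3.0]]), metadata)
v2_response = remote_predict(v2_request, infer_url)
```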
@@ -1 +1 @@
-__version__ = "0.6.0.dev3"
+__version__ = "0.6.0.dev4"
Review comment: Not that it matters much, but should we make this call asynchronous?

Reply: That's a good point. Looking at the caller `load`, it also has `load_explainer`, which we would need to make async so that the entire function (i.e. `load`) is truly async. I am tempted, however, to leave it as a separate ticket.
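For reference, a sketch of the asynchronous variant being discussed, using `asyncio.to_thread` (Python 3.9+), which the TODO next to `execute_async` already points at; this is one possible shape for that separate ticket, not code from this PR:

```python
import asyncio

# `remote_metadata` and `MetadataModelResponse` are the names from the
# diff above; this wrapper is hypothetical.


async def remote_metadata_async(url: str) -> MetadataModelResponse:
    # Run the blocking requests-based helper on a worker thread so the
    # caller (e.g. `load`) does not block the event loop.
    return await asyncio.to_thread(remote_metadata, url)
```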