Sherif akoush/mlsv 42/add alibi explain runtime #320
Conversation
`mlserver/codecs/numpy.py` (outdated)

```diff
@@ -93,11 +93,11 @@ def encode(cls, name: str, payload: np.ndarray) -> ResponseOutput:
     )

     @classmethod
-    def decode(cls, request_input: RequestInput) -> np.ndarray:
-        model_data = _to_ndarray(request_input)
+    def decode(cls, v2_data: Union[RequestInput, ResponseOutput]) -> np.ndarray:
```
**Reviewer:** Interesting! Do you think we should make this the new interface for the `InputCodec` class? I was thinking that we could also include the opposite conversion, although that would probably require splitting the existing `encode` method into `encode_response` and `encode_request`.
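For illustration, a rough sketch of how that split interface might look. This is hedged: `encode_request_input` does appear in this PR's tests further down, while `encode_response_output` is a hypothetical name here.

```python
from typing import Union

import numpy as np

from mlserver.types import RequestInput, ResponseOutput


class NumpyCodec:
    """Sketch of the split encode/decode interface discussed above."""

    @classmethod
    def encode_response_output(cls, name: str, payload: np.ndarray) -> ResponseOutput:
        # Build a v2 response output from a numpy payload.
        raise NotImplementedError

    @classmethod
    def encode_request_input(cls, name: str, payload: np.ndarray) -> RequestInput:
        # The opposite conversion: build a v2 request input instead.
        raise NotImplementedError

    @classmethod
    def decode(cls, v2_data: Union[RequestInput, ResponseOutput]) -> np.ndarray:
        # Both payload types carry the same tensor fields (name, shape,
        # datatype, data), so a single decode can handle either.
        raise NotImplementedError
```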
```python
return np_codec.decode(v2_response.outputs[0])  # type: ignore  # TODO: fix mypy and first output


@abc.abstractmethod
def _explain_impl(self, input_data: Any, settings: BaseSettings) -> Explanation:
```
**Reviewer:** Would there be any scenario where we wouldn't want to execute it asynchronously?
**Author:** No, but I thought the implementation should not have to care about that: we take the synchronous logic and execute it asynchronously in the base class.
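A minimal sketch of that pattern, assuming an asyncio-based base class (the method and class names besides `_explain_impl` are illustrative, not the PR's actual API):

```python
import asyncio
from functools import partial
from typing import Any


class ExplainerRuntimeBase:
    async def explain(self, input_data: Any, settings: Any) -> Any:
        # Run the subclass's synchronous implementation in the default
        # executor so that it does not block the event loop.
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(
            None, partial(self._explain_impl, input_data, settings)
        )

    def _explain_impl(self, input_data: Any, settings: Any) -> Any:
        # Subclasses provide the blocking explain logic here.
        raise NotImplementedError
```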
```python
# TODO: is this inference response?
def create_v2_from_any(data: Any, name: str) -> Dict:
```
**Reviewer:** Should we wrap this into a Codec? Conscious that it's something kind of temporary, but it could maybe be useful in other cases?
**Reviewer:** I agree; these are the type of components that we'll also end up needing in Tempo if we want to easily create clients that can convert to/from explainers.
**Author:** There is `StringCodec.encode`, which achieves the same behaviour, so I am using it instead now. Decoding the output (e.g. in the case of Tempo, and in the case of the alibi runtime for the underlying remote inference request) is going to be a separate ticket, as discussed.
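For reference, the replacement amounts to something like the following sketch. The `StringCodec.encode(payload=..., name=...)` signature is taken from this PR's own call sites; the function name mirrors the `create_v2_from_any` helper it replaces.

```python
import json
from typing import Any

from mlserver.codecs import StringCodec
from mlserver.types import ResponseOutput


def create_v2_from_any(data: Any, name: str) -> ResponseOutput:
    # Serialise the object to JSON and wrap the single string in a
    # v2 output, mirroring the StringCodec.encode(...) usage in this PR.
    return StringCodec.encode(payload=[json.dumps(data)], name=name)
```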
Force-pushed from 8668c45 to ae6aff1 (add factory capability in alibi).
**Reviewer:** Nice one @sakoush! I've added a few comments here and there, although they are mainly around code conventions. Functionally, all looks good 👍 I haven't managed to look at the tests yet, but will have a look as soon as I can.
`mlserver/codecs/string.py` (outdated)

```python
name=data.name,
datatype=data.datatype,
shape=data.shape,
data=data.data.__root__.decode("ascii"),  # to allow json serialisation
```
**Reviewer:** Wouldn't this break serialisation into protobufs? Although we should probably leave that point outside of this PR TBH, as it's not clear how to make something compatible with both. Besides that, should we decode it as `utf8` instead of `ascii`?
**Author:** This is now removed and we just pass the `List[str]`. As discussed, this will only work with JSON / REST.
```diff
@@ -26,7 +26,7 @@ class Config:
     env_prefix = ENV_PREFIX_ALIBI_DETECT_SETTINGS

     init_detector: bool = False
-    detector_type: PyObject = ""
+    detector_type: PyObject = ""  # type: ignore
```
**Reviewer:** If we set it to `None`, would we also need the `type: ignore` comment?
**Author:** No, because we would also need to set it as `Optional`; this pattern has been used elsewhere as well.
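For context, the two variants being compared look roughly like this (a sketch using pydantic v1's `PyObject` and `BaseSettings`):

```python
from pydantic import BaseSettings, PyObject


class AlibiDetectSettings(BaseSettings):
    # Variant used in the PR: the empty-string sentinel does not match
    # the declared type, hence the `type: ignore`.
    detector_type: PyObject = ""  # type: ignore

    # Alternative raised above: defaulting to None avoids the ignore,
    # but the annotation then has to become Optional[PyObject].
    # detector_type: Optional[PyObject] = None
```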
```diff
@@ -47,7 +47,8 @@ class AlibiDetectRuntime(MLModel):

     def __init__(self, settings: ModelSettings):

-        self.alibi_detect_settings = AlibiDetectSettings(**settings.parameters.extra)
+        extra = settings.parameters.extra  # type: ignore
```
**Reviewer:** Why is that `type: ignore` required? Is it because `parameters` may be `None`? If that's the case, we could check here whether it's set, and if not just default to an empty `AlibiDetectSettings()` object (which could still fetch values from the environment).
**Author:** Good point.
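A sketch of the guard that suggestion implies, reusing the names from the snippet above (the `super().__init__` call and import paths are assumed; `AlibiDetectSettings` is this PR's own settings class):

```python
from mlserver import MLModel
from mlserver.settings import ModelSettings


class AlibiDetectRuntime(MLModel):
    def __init__(self, settings: ModelSettings):
        extra = settings.parameters.extra if settings.parameters else None
        # Fall back to an empty AlibiDetectSettings(), which can still
        # pick up values from environment variables.
        self.alibi_detect_settings = (
            AlibiDetectSettings(**extra) if extra else AlibiDetectSettings()
        )
        super().__init__(settings)
```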
```python
integrated_gradients = _INTEGRATED_GRADIENTS_TAG


def convert_from_bytes(output: ResponseOutput, ty: Optional[Type]) -> Any:
```
**Reviewer:** Is the `convert_from_bytes` method still needed here now that we've got the extra methods on the `StringCodec`?
**Author:** It is used in tests to recover the data from explain calls. This utility has not yet been moved to `StringCodec`; I thought I would leave it as a TODO given that it is not strictly required.
**Reviewer:** Got it! Could you add a `# TODO` comment so that it's clear why we decided to leave it here?
```python
model_parameters: Optional[ModelParameters] = self.settings.parameters
assert model_parameters is not None
uri = model_parameters.uri  # type: ignore
assert uri is not None, "uri has to be set"
```
**Reviewer:** Do we have a `uri` in this case?
```python
    explain_parameters=explain_parameters,
)
# TODO: Convert alibi-explain output to v2 protocol, for now we use to_json
return StringCodec.encode(payload=[explanation.to_json()], name="explain")
```
**Reviewer:** Nit: should we call the output head `explanation` instead of `explain`?
```python
# TODO: we probably want to validate the enum more sanely here
# we do not want to construct a specific alibi settings here because
# it might be dependent on type
# although at the moment we only have one `AlibiExplainSettings`
```
**Reviewer:** We could have a separate class to manage the current enum and tuple dict. This could help with abstracting some of those details there.
**Author:** Moved them to a different module for now.
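A rough sketch of what such a module could hold (hypothetical names throughout, apart from the `integrated_gradients` tag and the `alibi_dependency_reference.py` module name seen elsewhere in this PR):

```python
from dataclasses import dataclass
from enum import Enum


class ExplainerEnum(str, Enum):
    integrated_gradients = "integrated_gradients"


@dataclass
class AlibiDependency:
    """Couples an explainer tag with its alibi class import path."""

    alibi_class: str
    requires_predict_fn: bool  # blackbox explainers wrap a remote model


_TAG_TO_DEPENDENCY = {
    ExplainerEnum.integrated_gradients: AlibiDependency(
        alibi_class="alibi.explainers.IntegratedGradients",
        requires_predict_fn=False,  # whitebox: needs the model in-process
    ),
}


def get_dependency(tag: ExplainerEnum) -> AlibiDependency:
    return _TAG_TO_DEPENDENCY[tag]
```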
```diff
@@ -0,0 +1,4 @@
+tensorflow
+numpy
+nest_asyncio
```
**Reviewer:** Nice! Are you using `nest_asyncio`? Although, is it only for tests?
**Author:** Only in tests for now.
`runtimes/alibi-explain/mlserver_alibi_explain/alibi_dependency_reference.py` (outdated; conversation resolved)
```python
TESTS_PATH = Path(os.path.dirname(__file__))


# TODO: how to make this in utils?
```
**Reviewer:** Yeah, it's annoying to remember adding this every time. Not sure whether it would be possible to have this in a pytest plugin or something similar?
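One possible shape of that idea: a shared fixture in a top-level `conftest.py` (a sketch; whether this covers every use site of `TESTS_PATH` is untested):

```python
# conftest.py
from pathlib import Path

import pytest


@pytest.fixture
def tests_path(request) -> Path:
    # Directory of the test module currently running, so individual
    # test packages no longer need their own TESTS_PATH constant.
    return Path(request.fspath).parent
```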
"parallel_workers": 0, | ||
} | ||
|
||
model_settings_path.write_text(json.dumps(model_settings_dict, indent=4)) |
**Reviewer:** Nice one! I didn't know `write_text` existed.
```python
request_input_result = codec.encode_request_input(name="foo", payload=decoded)
assert response_output.datatype == request_input_result.datatype
assert response_output.shape == request_input_result.shape
assert response_output.data == request_input_result.data
assert request_input_result.parameters.content_type == codec.ContentType
```
**Reviewer:** Should we move these `assert`s to a separate test case to ensure each test only tests one thing?
**Author:** Perhaps, although it feels fine in this case as we are checking the content of the encoded object. This pattern is used elsewhere as well.
**Reviewer:** I'm usually more keen on having tests that only test a single thing. This is generally a good pattern within pytest (and unit testing in general), as it also makes tests smaller and easier to maintain. I don't want to hold off the PR just for this nitpick, though, but I think we should split them at some point down the line.
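For what it's worth, the split version might look something like this (a sketch assuming `codec`, `decoded` and `response_output` are provided as fixtures):

```python
def test_encode_request_input_preserves_tensor_fields(codec, decoded, response_output):
    request_input = codec.encode_request_input(name="foo", payload=decoded)

    # One test for the round-tripped tensor fields...
    assert request_input.datatype == response_output.datatype
    assert request_input.shape == response_output.shape
    assert request_input.data == response_output.data


def test_encode_request_input_sets_content_type(codec, decoded):
    request_input = codec.encode_request_input(name="foo", payload=decoded)

    # ...and a separate one for the content-type parameter.
    assert request_input.parameters.content_type == codec.ContentType
```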
PR description:

This PR adds the initial support to run alibi explain models using mlserver. We treat alibi explain models similarly to inference models with respect to deployment.

We support blackbox and whitebox explainers: https://github.com/SeldonIO/alibi/blob/master/doc/source/overview/white_box_black_box.ipynb

In the case of blackbox explainers, there should be an inference model that is deployed and exposed via an endpoint that implements the v2 protocol (https://github.com/kserve/kserve/tree/master/docs/predict-api/v2); for now we only support REST.

In the case of whitebox explainers, the explainer needs to load the "full" model in the same process space, as it needs access to the model internals (such as gradients in the case of `IntegratedGradients`).

We have the ability to create an explainer on the fly with the correct config settings, in addition to loading persisted explainer artifacts.

There are some edge cases that are yet to be sorted out, so this is still considered experimental at this stage.
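To make the blackbox flow concrete, a deployment could look roughly like the sketch below, written the way the tests above build settings via `json.dumps`. The `implementation` path and the field names under `extra` (`explainer_type`, `infer_uri`) are illustrative assumptions rather than the final schema; `"parallel_workers": 0` matches the test settings shown earlier.

```python
import json
from pathlib import Path

# Hypothetical settings for a blackbox explainer pointing at a remote
# v2 inference endpoint; the field names under "extra" are illustrative.
model_settings_dict = {
    "name": "income-explainer",
    "implementation": "mlserver_alibi_explain.AlibiExplainRuntime",  # assumed class name
    "parallel_workers": 0,
    "parameters": {
        "extra": {
            "explainer_type": "anchor_tabular",
            "infer_uri": "http://localhost:8080/v2/models/income/infer",
        }
    },
}

Path("model-settings.json").write_text(json.dumps(model_settings_dict, indent=4))
```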