fix mlserver infer with BYTES #1213
Conversation
```python
for request_input in inference_request.inputs or []:
    new_input = httpclient.InferInput(
        request_input.name, request_input.shape, request_input.datatype
    )
    request_input_np = NumpyCodec.decode_input(request_input)

    # Change datatype if BYTES to satisfy Tritonclient checks
```
I have the feeling this should be handled in the NumpyCodec, but it may be too tritonclient-specific...
Good point - do you know what's returned by NumpyCodec.decode_input at the moment?
Tritonclient expects

```python
elif np_dtype == np.object_ or np_dtype.type == np.bytes_:
    return "BYTES"
return None
```

as otherwise the result is None, which was causing the issue.
As we do not set anything there explicitly (and I also saw this by adding extra logs earlier during debugging), we just get '<U1':

```python
In [2]: import numpy as np
In [3]: x = np.array(["x"])
In [4]: x.dtype == np.bytes_
Out[4]: False
In [5]: x.dtype
Out[5]: dtype('<U1')
```
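To make the mismatch concrete, here is a minimal, self-contained sketch. The `to_triton_dtype` helper below is hypothetical - it just mirrors the tritonclient check quoted above - and shows that a plain string array gets a `<U1` dtype that maps to None, while an explicit bytes array maps to "BYTES":

```python
import numpy as np


def to_triton_dtype(np_dtype):
    # Hypothetical helper mirroring the tritonclient check quoted above:
    # only object_ or bytes_ dtypes map to "BYTES"; anything else -> None.
    if np_dtype == np.object_ or np_dtype.type == np.bytes_:
        return "BYTES"
    return None


str_array = np.array(["x"])     # dtype('<U1') - a unicode string array
bytes_array = np.array([b"x"])  # dtype('S1') - a bytes array

print(str_array.dtype)                     # <U1
print(to_triton_dtype(str_array.dtype))    # None -> tritonclient rejects it
print(to_triton_dtype(bytes_array.dtype))  # BYTES -> accepted
```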
I think what's going on is that the NumpyCodec encodes that list as a tensor of strings (i.e. using np.dtype(str)). However, Triton doesn't know how to handle that case.

MLServer/mlserver/codecs/numpy.py, lines 41 to 43 in d283fc0:

```python
if is_list_of(data, str):
    # Handle special case of strings being treated as Numpy arrays
    return np.dtype(str)
```
I can't remember why we had to add that np.dtype(str) special case, but I would be keen to keep it as is to avoid any potential side effects. Having said that, it's unclear whether changing the dtype here may cause any issues downstream (although I'd expect that strings in np arrays are quite an edge case).
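One way to reconcile the two sides without touching the NumpyCodec is to re-encode unicode arrays just before handing them to tritonclient. This is only a sketch of the idea discussed above (the `ensure_bytes` helper is hypothetical, not MLServer's actual fix):

```python
import numpy as np


def ensure_bytes(arr: np.ndarray) -> np.ndarray:
    # Hypothetical adapter: re-encode numpy unicode string arrays as bytes
    # so tritonclient's dtype check maps them to "BYTES" instead of None.
    if arr.dtype.kind == "U":  # numpy unicode dtype, e.g. '<U1'
        return np.char.encode(arr, "utf-8")
    return arr


x = ensure_bytes(np.array(["hello", "world"]))
print(x.dtype)  # S5 - a bytes dtype, recognised as BYTES
```

Keeping the conversion at the tritonclient boundary avoids changing NumpyCodec behaviour for other consumers.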
Nice one @RafalSkolasinski! Added a small comment around tests - I think we can just reuse one of the existing fixtures, but everything else looks great. 👍
Thanks for tweaking the fixture @RafalSkolasinski! This should now be ready to go once the linter stops complaining (seems to be a black formatting issue - you can fix it with make fmt).
@adriangonz done - should be fine now
fix mlserver infer with BYTES
Closes #1210