-
Notifications
You must be signed in to change notification settings - Fork 181
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix mlserver_huggingface settings device type #1486
Conversation
@adriangonz, please take a look when you have a chance. It's a small change |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great @geodavic! 🚀
Would it make sense adding a small test to make sure this behaviour doesn't change in the future? This could be as simple as loading a model with device
set to empty, -1
, or cpu
? And just asserting that the model loads correctly.
@adriangonz Thanks! I added a test that passes |
@adriangonz :) |
@sakoush if you don't mind taking a look, much appreciated! |
is there anything holding this back? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lgtm. Many thanks for your contribution.
If using a
model-settings.json
of the following form:an error occurs when spinning up the server:
because the
load_in_8bit
model kwarg makes forces the model to be loaded with Accelerate, andHuggingFaceSettings.device
default value of -1 is passed. I tried simply passing"device": null
to the model settings json to prevent this default value, but this wasn't accepted because the schema is typed toint
.The solution is to expand the typing of
HuggingFaceSettings.device
to accept null (and string, while we're at it). Therefore I have updated the typing toOptional[Union[int,str]]
, which is very close to the type hint used in the underlying transformers pipeline.I have tested the above model settings json with this change and the server starts as expected.