feat(huggingface): Load local artefacts in HuggingFace runtime #1319
Conversation
Nice one! Changes look good from my side - this should be good to go once tests are green. 🚀
A few suggestions to improve clarity and UX, but nothing major that I can see
if not model:
if settings.parameters is not None:
❓ What happens if neither the `pretrained_model` nor the `parameters` are specified? Previously, the `pretrained_model` was mandatory and, presumably, the code would blow up if it wasn't there. Now we're saying both are optional, but I can't see clear error handling here to inform the user that something needs to be provided.
As a UX consideration, might we also want to make the decision being made here explicit? That is, rather than using a default and only checking something else when it isn't provided, we could check whether both are provided and explain that one overrides the other, or that one hasn't been provided so we're using the other instead.
That's slightly more code, but it shouldn't add appreciable time to loading models, and it should make it easier for users to understand why things happen the way they do.
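A minimal sketch of that explicit decision. All names and log messages here are illustrative assumptions, not MLServer's actual code:

```python
import logging

logger = logging.getLogger(__name__)


def resolve_model_source(pretrained_model, uri):
    """Pick the model location and explain the choice to the user.

    Hypothetical helper: the precedence rule (pretrained_model over uri)
    mirrors the behaviour discussed in this thread.
    """
    if pretrained_model and uri:
        logger.info(
            "Both 'pretrained_model' and 'uri' are set; "
            "'pretrained_model' (%s) takes precedence over 'uri' (%s).",
            pretrained_model, uri,
        )
        return pretrained_model
    if pretrained_model:
        return pretrained_model
    if uri:
        logger.info(
            "'pretrained_model' not set; falling back to 'uri' (%s).", uri
        )
        return uri
    raise ValueError("Either 'pretrained_model' or 'uri' must be provided.")
```

The extra branch costs one comparison per model load, so the logging is effectively free.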
In the context of the HuggingFace runtime, `settings.parameters` is never empty; otherwise the `get_huggingface_settings` function will raise an exception. Also, `settings.parameters.uri` is always present (if it's not set in `model-settings.json`, it will be set to the default model path by the registry's load method).
Let me know if that makes sense to you; if not, I'm open to changing it 👍
Thanks for clarifying. Makes sense that we need certain parameters and validate this first. The links are very helpful, thanks!
One follow-up question: if `settings.parameters` is always provided, there should be no need to check `if settings.parameters is not None`, right?
Yes, correct. I can remove it to avoid confusion 👍
And to further support the claim that `settings.parameters` is always provided: in case no `model-settings.json` files are found on the current path, there is a fallback that creates the model settings from the environment, where the model parameters are also initialised and therefore not empty.
@@ -1,6 +1,8 @@
from unittest.mock import patch, MagicMock
❓ Is there a particular order defined for imports? I can't see grouping by standard library vs. third-party, so I'm not sure if there is an expectation around this.
Fixed!
I meant for blocks of imports, rather than imports within a single line. It's okay if not, but wanted to clarify what I meant
Right, so we are using a tool called `black` that formats the files and controls whether an import is on a single line or in a multi-line block, depending on the configured line length. If an import is too long, it gets formatted into a multi-line block. In this case, after running `black` locally, I can confirm it is correct as a single line. In CI, the code is also checked by `black` to verify the formatting, and it passed 👍
Also, `black` settings are configured in `pyproject.toml` in the root folder, under the `[tool.black]` section. In this case, the line length is not set, so it uses the default of 88 characters.
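For reference, such a section might look like this (an illustrative fragment; as noted above, the repo's actual section leaves the line length unset, so black's default of 88 applies):

```toml
[tool.black]
# If line-length is omitted, black uses its default of 88 characters.
line-length = 88
```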
Nice one @vtaskow ! Thanks for making that change - looks great from my side.
@@ -55,6 +55,17 @@ MLSERVER_MODEL_HUGGINGFACE_OPTIMUM_MODEL=true

### Loading models
👍 This is a clear, concise section - I like it!
],
)
@patch("mlserver_huggingface.common._get_pipeline_class")
def test_pipeline_was_initialised_when_pretrained_model_is_not_supplied(
😕 I'm confused by these tests for two reasons. While the actual logic makes sense, the test names are verbose (what does "was initialised" mean in the context of the tests?) and, as far as I can see, the test logic is identical between the two tests bar the value of `pretrained_model`.
Would this not be clearer as a single, tabulated test?
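A sketch of what a single table-driven test could look like. The `choose_model` helper is a stand-in for the runtime's selection logic under test, and all names here are illustrative rather than the PR's actual test code:

```python
# Table of cases: (pretrained_model, uri, expected model passed to pipeline)
CASES = [
    (None, "/mnt/models", "/mnt/models"),       # falls back to uri
    ("hub-model", "/mnt/models", "hub-model"),  # pretrained_model wins
]


def choose_model(pretrained_model, uri):
    # Stand-in for the runtime's selection logic being tested.
    return pretrained_model if pretrained_model else uri


def test_pipeline_model_selection():
    for pretrained_model, uri, expected in CASES:
        assert choose_model(pretrained_model, uri) == expected
```

With `pytest`, the table could instead drive `@pytest.mark.parametrize`, which reports each row as a separate test case.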
Good point about the test logic; I've refactored it to be more concise. About the test name `pipeline_is_initialised`: I mean that when the `pipeline(name, model, ...)` constructor is called, the correct model parameter is passed.
Still a couple of questions about structure/format but, as before, functionally this looks simple and correct
Why

Currently, it is not convenient to load local model artefacts in the HuggingFace runtime. When users would like to specify a custom model, or to avoid downloading a model while MLServer is running, there must be a convenient way to point to the desired model artefacts folder.

Furthermore, there is already a way to specify such a location, via `parameters.extra.pretrained_model` in the `model-settings.json` config. However, that field is intended for ready-made models from the HuggingFace hub.

What and How

This PR provides a new way for a user to use a local HuggingFace model, without having to download a ready-made one from the HuggingFace hub or being limited to those available there. Essentially, if `parameters.extra.pretrained_model` is not specified in `model-settings.json`, then `parameters.uri` will be used to locate and load a custom model that has already been downloaded by rclone. Furthermore, if `parameters.uri` is not set, it is automatically filled in by the HuggingFace runtime to point to MLServer's default model download location.

Testing in Seldon Core v1
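For illustration, a `model-settings.json` relying on the new fallback might look like this (field values are hypothetical; `implementation` names the HuggingFace runtime class):

```json
{
  "name": "my-local-model",
  "implementation": "mlserver_huggingface.HuggingFaceRuntime",
  "parameters": {
    "uri": "/mnt/models",
    "extra": {
      "task": "text-generation"
    }
  }
}
```

Because `parameters.extra.pretrained_model` is absent, the runtime would load the artefacts already present at `parameters.uri`.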
Background

PREDICTIVE_UNIT_PARAMETERS is an environment variable containing a list of parameters describing a HuggingFace model. It is the predecessor of `model-settings.json`. In SCv1, if this variable is not empty, it is used instead of `model-settings.json` (even if one is supplied).

PREDICTIVE_UNIT_PARAMETERS is empty AND `model-settings.json` is missing; no UI params set
Result: raises a MissingHuggingFaceSettings exception because there are no predictive params or `model-settings.json` set, EVEN THOUGH MLSERVER_MODEL_URI is auto-set to /mnt/models.

PREDICTIVE_UNIT_PARAMETERS is not empty AND `model-settings.json` is missing
PREDICTIVE_UNIT_PARAMETERS contained: task = text-generation; MLSERVER_MODEL_URI is auto-set to /mnt/models.
Result: able to initialise HF settings; params has task = text-generation; in HF settings, pretrained_model is not set but falls back to MLSERVER_MODEL_URI (this is the change I am making in this PR). Able to predict ✔️

PREDICTIVE_UNIT_PARAMETERS contained: task = text-generation, pretrained_model = /mnt/models (the same as MLSERVER_MODEL_URI, which is auto-set to that value).
Result: able to initialise HF settings; params has task = text-generation and pretrained_model = /mnt/models, which is used as the location of the custom model. Able to predict ✔️

PREDICTIVE_UNIT_PARAMETERS is empty and using `model-settings.json`
`model-settings.json` contained:
Result: able to initialise HF settings; pretrained_model is not used because it's empty, and it correctly falls back to using MLSERVER_MODEL_URI instead, which points to /mnt/models where rclone has downloaded the model artefacts folder. ✔️
Important to note: no models were downloaded on demand from the HuggingFace hub.

`model-settings.json` contained:
Result: able to initialise HF settings; pretrained_model IS used and correctly points to /mnt/models where rclone has downloaded the model artefacts folder. ✔️
Important to note: no models were downloaded on demand from the HuggingFace hub.

`model-settings.json` contained:
Result: able to initialise HF settings; uri IS used and correctly points to /mnt/models where rclone has downloaded the model artefacts folder. ✔️
Important to note: no models were downloaded on demand from the HuggingFace hub.
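The resolution order exercised by these scenarios can be sketched as follows. This is a hypothetical simplification: `get_huggingface_settings` here is a stand-in taking a plain dict, not the runtime's real signature:

```python
class MissingHuggingFaceSettings(Exception):
    pass


def get_huggingface_settings(parameters):
    # No HuggingFace params at all -> error, even though
    # MLSERVER_MODEL_URI / parameters["uri"] may be set.
    if not parameters or not parameters.get("extra"):
        raise MissingHuggingFaceSettings(
            "Missing HuggingFace Runtime settings."
        )
    extra = parameters["extra"]
    # pretrained_model takes precedence; otherwise fall back to uri
    # (which is auto-set to /mnt/models in the SCv1 scenarios above).
    model_source = extra.get("pretrained_model") or parameters.get("uri")
    return {"task": extra.get("task"), "model": model_source}
```

The same sketch covers both the error case and both happy paths from the table above.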
Testing in Seldon Core v2
`model-settings.json` is missing
Message: ModelFailed
Reason: Can't create model-settings from requirements [huggingface]
Result is expected ✔️

`model-settings.json` doesn't have `parameters.extra`
mlserver_huggingface.errors.MissingHuggingFaceSettings: Missing HuggingFace Runtime settings.
Result is expected ✔️

`model-settings.json` doesn't have `parameters.extra` but has `parameters.uri`
mlserver_huggingface.errors.MissingHuggingFaceSettings: Missing HuggingFace Runtime settings.
Result is expected ✔️

`model-settings.json` has only the `task` field in `parameters.extra`
Result: it correctly uses the downloaded model in /mnt/agent/models because, even though neither `parameters.uri` nor `parameters.extra.pretrained_model` is set, the runtime model registry fills in `parameters.uri` by default to point to the `model-settings.json` path. Able to predict correctly ✔️
This is possible because of the fallback to `parameters.uri` when `parameters.extra.pretrained_model` is not set.

`model-settings.json` has `task` and `pretrained_model` fields in `parameters.extra`, but `pretrained_model` points to a HuggingFace hub model
Result: the specified pretrained model is correctly downloaded from the HuggingFace hub. Able to predict correctly ✔️
Further testing with a model artefacts folder created from `transformers.pipeline.save_pipeline()`:
- `model-settings.json`
- `model-settings.json` but with PREDICTIVE_UNIT_PARAMS
Resolves #1076