-
Notifications
You must be signed in to change notification settings - Fork 183
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow MLflow URI with scheme #839
Allow MLflow URI with scheme #839
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @regen100 ,
Thanks a lot for taking a first stab at this one. This is certainly something that other MLflow users have asked in the past.
In general, we try to shy away from downloading artefacts in MLServer, as we see it as more of an infrastructure concern (i.e. something that is better handled by orchestrators like Seldon Core or KServe). However, it's true that it could make sense to allow downloads from the MLflow model registry in certain cases.
Having said that, I've added a couple questions and suggestions below. It would be great to hear your thoughts on those!
runtimes/mlflow/tests/conftest.py
Outdated
return ModelSettings( | ||
name="mlflow-model", | ||
implementation=MLflowRuntime, | ||
parameters=ModelParameters(uri="file:" + model_uri), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than adding a new fixture, should we parametrise one of the tests to run with file://
or no-schema. That way we add less boilerplate code to maintain later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I refactored the tests by pytest parameterization.
model_uri = await get_model_uri( | ||
self._settings, | ||
allowed_schemes=mlflow.store.artifact.artifact_repository_registry._artifact_repository_registry._registry, | ||
) | ||
self._model = mlflow.pyfunc.load_model(model_uri) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have much experience downloading things from dbfs://
/ s3://
using MLflow, but would load_model()
just handle everything else? How would we configure the DBFS / S3 host, credentials, etc?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the case of S3 URLs, MLflow retrieves the artifact through boto3
, which retrieves the credentials from an environment variables AWS_ACCESS_KEY_ID
or the default credential store ~/.aws
.
Your policy is reasonable. Downloading artifacts introduces complexity such as cache control. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes look good! Nice one defaulting the scheme on the urlparse()
method 👍
Once the tests are green, this should be good to go.
MLflow can load many schemes. However,
get_model_uri
accepts only local file paths.This PR enables
MLflowRuntime
to load models by URIs such ass3://...
andfile:...
.