Note: this content has not been updated since MLflow 1.28.0 (Aug 11, 2022).
Unfortunately, the only documentation is the Javadoc. There are no examples in the MLflow examples GitHub repository.
You can find some Scala examples at mlflow-examples - scala-spark.
MLflow has a Java client that can be accessed from Scala.
Sample Scala code using the Java client: github.com/amesar/mlflow-examples/tree/master/scala/sparkml.
- Since much of MLflow functionality is client-based and is written in Python, there is a feature gap for other languages.
- Standard MLflow features such as MLflow projects, models and flavors are not supported for Java/Scala.
- This is principally due to less demand for JVM-based ML training than for Python-based training.
- You can save your native model as a raw artifact but cannot log it as a managed MLflow model.
- See item below.
No. With the Java client you have to save your models as unmanaged artifacts using logArtifact. There is no equivalent of MLflow Python’s log_model (e.g. mlflow.sklearn.log_model), which implies flavors. See the example in TrainWine.scala.
See MLflowUtils.getOrCreateExperimentId.
// Return the ID of an experiment, creating it if it doesn't exist
def getOrCreateExperimentId(client: MlflowClient, experimentName: String): String = {
  try {
    // createExperiment returns the ID of the newly created experiment
    client.createExperiment(experimentName)
  } catch {
    // statusCode 400, e.g. the experiment already exists
    case e: org.mlflow.tracking.MlflowHttpException =>
      client.getExperimentByName(experimentName).get.getExperimentId
  }
}
Works only for SparkML (MLlib) models.
Read the model artifact with the downloadArtifacts method.
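import org.apache.spark.ml.PipelineModel
import org.mlflow.tracking.MlflowClient

// runId and data (a Spark DataFrame) are assumed to be defined elsewhere
val client = new MlflowClient()
// Download the model artifact and get its local path
val modelPath = client.downloadArtifacts(runId, "spark-model/sparkml").getAbsolutePath
// On Databricks, turn the /dbfs local path into a dbfs: URI that Spark can read
val model = PipelineModel.load(modelPath.replace("/dbfs","dbfs:"))
val predictions = model.transform(data)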
Works only for SparkML (MLlib) models.
Do the same as above using the Python MlflowClient.download_artifacts method.
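A rough Python counterpart to the Scala snippet might look like the sketch below. It assumes the same artifact path (spark-model/sparkml) and that the caller supplies run_id and a Spark DataFrame. The names to_dbfs_uri and load_and_score are illustrative helpers, not part of MLflow. Unlike the Scala replace call, the helper rewrites only a leading /dbfs prefix.

```python
def to_dbfs_uri(local_path: str) -> str:
    """Convert a Databricks /dbfs local path to a dbfs: URI; leave other paths unchanged."""
    if local_path.startswith("/dbfs/"):
        return "dbfs:" + local_path[len("/dbfs"):]
    return local_path


def load_and_score(run_id, data, artifact_path="spark-model/sparkml"):
    """Download the Spark ML model stored under run_id and apply it to a Spark DataFrame."""
    # Imported here so to_dbfs_uri stays usable without MLflow or Spark installed.
    from mlflow.tracking import MlflowClient
    from pyspark.ml import PipelineModel

    client = MlflowClient()
    # download_artifacts returns the local path of the downloaded artifact
    model_path = client.download_artifacts(run_id, artifact_path)
    model = PipelineModel.load(to_dbfs_uri(model_path))
    return model.transform(data)
```

Outside Databricks the downloaded path has no /dbfs prefix, so it is passed to PipelineModel.load unchanged.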
MLflow allows you to search for a subset of MLflow objects. The MLflow search filter is a simplified version of the SQL WHERE clause.
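To make the WHERE-clause analogy concrete, here is a small sketch that assembles a run search filter from individual comparison clauses. The helper and_filter and the metric, parameter, and tag names are made up for illustration. Clauses are joined with and, and parameter and tag values are written as quoted strings.

```python
def and_filter(*conditions: str) -> str:
    """Join individual comparison clauses into one search filter string."""
    return " and ".join(conditions)


# Hypothetical metric, parameter, and tag names
run_filter = and_filter(
    "metrics.rmse < 0.8",
    "params.max_depth = '5'",
    "tags.team = 'forecasting'",
)
print(run_filter)
# prints: metrics.rmse < 0.8 and params.max_depth = '5' and tags.team = 'forecasting'
```

The resulting string is what you would pass as the filter_string argument of a run search call.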
You can search for the following MLflow objects:
- Runs
- Registered Models
- Versions of a Registered Model
General
- Search — MLflow 1.26.1 documentation - Detailed description of the filter syntax for run search.
- mlflow.search_runs - returns a Pandas DataFrame of runs.
- Note: the documentation neither describes nor links to the filter syntax. It only gives some examples.
- MlflowClient.search_runs - returns a paged list of Run objects.
- Note: the documentation neither describes nor links to the filter syntax. It only gives some examples.
- search_registered_models - returns a paged list of RegisteredModel objects.
- filter_string – Filter query string, defaults to searching all registered models. Currently, it supports only a single filter condition as the name of the model, for example, name = 'model_name' or a search expression to match a pattern in the registered model name. For example, name LIKE 'Boston%' (case sensitive) or name ILIKE '%boston%'.
- search_model_versions - returns a paged list of ModelVersion objects.
- filter_string – A filter string expression. Currently, it supports a single filter condition: either a model name, as in name = 'model_name', or a run ID, as in run_id = '...'.
Available now in MLflow 1.28.0.