MlflowModelSaverDataSet and MlflowArtifactDataset save to the wrong directory after pulling from repo. #456
When I pulled a project from a repo and tried to run it, it threw "[WinError 5] Access is denied 'path/to/project/on/different/pc'" while saving the MlflowModelSaverDataSet and MlflowArtifactDataset entries. Other files saved by kedro earlier in the pipeline were generated properly with no errors. Before running, I recreated the conda environment from file and ran kedro mlflow init. How does kedro-mlflow infer the project directory, and can I change it somehow?

Traceback:
The above exception was the direct cause of the following exception:
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮

The catalog.yml entry for the data that fails to save is as follows:
classificator:

When I remove these lines kedro runs with no issues, but the model is obviously not saved, so the problem has to be in how the model is being saved.

Package info:
Replies: 1 comment 2 replies
Hi @ciaciura, sorry I didn't see this notification. It's better to reach me in the issues rather than in the GitHub discussions. This bug is extremely weird because I don't see how it can "remember" the previous path; it must be stored somewhere. The error comes from mlflow internals. It seems that mlflow does not manage to copy your model to a local "temp folder" before uploading it, or something like that. Could you try this catalog entry?

classificator:
  type: kedro_mlflow.io.models.MlflowModelSaverDataSet
  flavor: mlflow.sklearn
  filepath: data/06_models/classificator

This should at least store the model locally.
Found it! I guess you are referring to this repo: https://github.com/ciaciura/penguins. You have pushed your local mlruns folder, so mlflow now reads the artifacts from the "permanent" location where you stored them. This can be seen here: https://github.com/ciaciura/penguins/blob/main/mlruns/0/meta.yaml. I guess this is the path on your old computer. You should not push your mlruns folder, which is only meant for local persistence (add it to your .gitignore file). If you want to share your mlflow runs between different computers, you must set up a remote mlflow server.
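
If you want to check whether a cloned project still carries paths from another machine, a small script along these lines may help. It is only a sketch: the function name `find_artifact_locations` is hypothetical, and it assumes the local file store keeps the stored path in `meta.yaml` under the keys `artifact_location` (experiments) or `artifact_uri` (runs). It uses plain line matching rather than a YAML parser, so treat the output as a hint only.

```python
from pathlib import Path

# Keys assumed to hold the stored artifact path in mlflow's local file store.
ARTIFACT_KEYS = ("artifact_location", "artifact_uri")


def find_artifact_locations(mlruns_dir):
    """Return (meta_file, key, value) for each artifact path found under mlruns_dir."""
    root = Path(mlruns_dir)
    if not root.is_dir():
        return []
    found = []
    for meta_file in sorted(root.rglob("meta.yaml")):
        for line in meta_file.read_text(encoding="utf-8").splitlines():
            # Split on the first colon only, so "file:///C:/..." stays intact.
            key, sep, value = line.partition(":")
            if sep and key.strip() in ARTIFACT_KEYS:
                found.append((meta_file, key.strip(), value.strip().strip("'\"")))
    return found


if __name__ == "__main__":
    # Run from the project root; prints nothing if there is no mlruns folder.
    for meta_file, key, value in find_artifact_locations("mlruns"):
        print(f"{meta_file}: {key} = {value}")
```

Any printed path that points to a directory that does not exist on the current computer is a sign that the mlruns folder was committed from elsewhere.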