Describe integration with MLflow #3856
Closes #3541. Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>
Note to reviewers: The idea of this page is to complement
The idea is for this page to serve as a brief collection of MLOps use cases, and to use this as a template for future integrations. I'm paging @stichbury as well because I was somewhat careless with the prose in certain parts.
I went back and forth several times on how to structure this page. I like how I ended up making 1 section per use case, but I agree there might be other ways. Awaiting @stichbury's take on this.
I found the sectioning useful but agree with @noklam that we usually break into simple and advanced usage. Can I suggest we do the same here and have the structure as follows, but will leave you to decide where the simple/advanced sections fall? Maybe something like this?

* Header
* Prerequisites
* Simple use cases
  * Tracking Kedro pipeline runs in MLflow using Hooks
  * Artifact tracking in MLflow using hooks
* Advanced use cases
  * Complete tracking of Kedro runs in MLflow using

However, I'm not that attached to this and if you want to stick with what you have, I'd say that's fine, but omit the basic second level "Use cases" header and promote all the following (currently 3rd level) to 2nd level.
My biggest gripe with this is that it seems wrong to declare custom hooks as "basic" and using I can totally see how someone starts with the custom hook ("basic"), then they start making it more complex because they need more functionality, and in the end it becomes way more difficult than just
I can't work on your branch, so I've forked and made a PR to commit back to it: #3862. Please take a look, merge what you want, and I can review again when you have the entire page in your preferred final state (see comment about sectioning above).
We agreed to do this 👍🏼 Will make the change today.
Thank you, @astrojuanlu, for the excellent manual. Everything is working well. It's a great starting point for exploring Kedro + MLflow. I've left a few minor comments.
Thanks for the review @Galileo-Galilei 🙏🏼 Will address your comments ASAP.
Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>
I significantly reworked the order of the sections, but the content is largely the same. I think the flow is much nicer now; I wouldn't have reached this stage without @Galileo-Galilei's insightful comments. Please do have a look again.
Looks really great, well done 🌟 I made just 2 very minor suggestions.
Fantastic work, @astrojuanlu! 🚀🚀🚀 I really like the new description flow, starting from easy use cases and progressing to more complex ones like Hooks and Session management. I left a few minor comments.
Co-authored-by: Jo Stichbury <jo_stichbury@mckinsey.com> Signed-off-by: Juan Luis Cano Rodríguez <hello@juanlu.space>
Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>
$ kedro run --to-outputs=X_test,y_test
...
$ kedro run --from-nodes=evaluate_model_node --params mlflow_run_id=4cba84...
This was somewhat clunky; I dumped some thoughts in #3922.
I agree, but it's great that you added that section! It's an interesting functionality.
```
(.venv) $ pip install mlflow
(.venv) $ mlflow ui --backend-store-uri ./mlflow_runs
```
Why not leave the default location? Well, because it's called `mlruns`, which could be conflated with https://github.com/mlrun/mlrun 😬 This requires us to write a `mlflow.yml` in the first step, but I think it's not the end of the world.
It's actually better to show the `mlflow.yml` to the users :)
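For context, here is a minimal sketch of the kind of `mlflow.yml` being discussed. It assumes the kedro-mlflow plugin and its `server.mlflow_tracking_uri` setting. The file path and the `mlflow_runs` directory name are assumptions inferred from the `mlflow ui --backend-store-uri ./mlflow_runs` command quoted above, so treat this as an illustration rather than the page's actual configuration:

```yaml
# conf/local/mlflow.yml (assumed location; adjust to your project layout)
server:
  # Assumption: store runs under ./mlflow_runs instead of MLflow's default ./mlruns
  mlflow_tracking_uri: mlflow_runs
```

With a setting along these lines, the directory passed to `mlflow ui --backend-store-uri` would need to match the configured tracking location.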
@Galileo-Galilei in the interest of putting this in front of users already, and given that I addressed your initial comments, I'm going ahead with merging this. If you spot any glaring mistakes or further areas for improvement in a post-merge review, do leave a comment and I'll send another PR with the amendments 🙏🏼
Sorry Juan, the readthedocs preview did not render well on my phone for some unknown reason, so it was hard to review. It does work on the latest branch though, so I had a chance to read it. It's much easier to read now, thanks for the work! Two minor comments I'll address in a further PR:
Thanks!
Oh, I remembered #3765, and that we didn't follow suit with the starters... About the customisation, I was hesitating about it myself, see #3856 (comment). I'd rather want to avoid
I really think it is a bad idea to create our own standard, for many reasons:
I don't like the name either, but the point of having a standard is having everyone stick to it, and I think we should keep this well-established one (and I don't think having mlrun in the docs is a big deal, mlflow users usually know about the `mlruns` folder).
I see your points, fair enough. @Galileo-Galilei do you want to send the PR yourself?
* Describe integration with MLflow
  Closes kedro-org#3541.
  Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>
* [Docs] Update MLflow docs page in kedro-org#3856 (kedro-org#3862)
  * Some proposed edits
    Signed-off-by: Jo Stichbury <jo_stichbury@mckinsey.com>
  * Fix some Vale warnings
    Signed-off-by: Jo Stichbury <jo_stichbury@mckinsey.com>
  ---------
  Signed-off-by: Jo Stichbury <jo_stichbury@mckinsey.com>
* Simplify MLflow launch
  Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>
* Add label to runtime params section
  Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>
* Restructure MLflow document
  Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>
* Apply suggestions from code review
  Co-authored-by: Jo Stichbury <jo_stichbury@mckinsey.com>
  Signed-off-by: Juan Luis Cano Rodríguez <hello@juanlu.space>
* Apply suggestions from code review
  Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>
---------
Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>
Signed-off-by: Jo Stichbury <jo_stichbury@mckinsey.com>
Signed-off-by: Juan Luis Cano Rodríguez <hello@juanlu.space>
Co-authored-by: Jo Stichbury <jo_stichbury@mckinsey.com>
Co-authored-by: Nok Lam Chan <nok.lam.chan@quantumblack.com>
Signed-off-by: bpmeek <bpmeek.developer@gmail.com>
See #3541.
Description
Development notes
Developer Certificate of Origin
We need all contributions to comply with the Developer Certificate of Origin (DCO). All commits must be signed off by including a `Signed-off-by` line in the commit message. See our wiki for guidance.

If your PR is blocked due to unsigned commits, then you must follow the instructions under "Rebase the branch" on the GitHub Checks page for your PR. This will retroactively add the sign-off to all unsigned commits and allow the DCO check to pass.
Checklist
`RELEASE.md` file