-
Notifications
You must be signed in to change notification settings - Fork 54
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DOC california housing example #308
DOC california housing example #308
Conversation
To quote: The goal of this exercise is to go through a semi-realistic data science and machine learning task and develop a practical solution for it. We will learn about the following topics: - Perform *exploratory data analysis* - Do some non-trivial *feature engineering* - Explain how the feature engineering informs the *choice of machine learning model* and vice versa - Show how to make use of a couple of *advanced scikit-learn* features and explain why we use them - Create a *model card* that provides useful information about the model - Share the model by uploading it to the *Hugging Face Hub*
@skops-dev/maintainers ready for review It was unfortunately quite some work to convert this from ipynb to this percent-py format. Hopefully, I didn't miss anything. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Happy to ship this, it's pretty nice!
examples/plot_model_card.py
Outdated
@@ -158,7 +158,7 @@ | |||
model_card.add_permutation_importances( | |||
importances, | |||
X_test.columns, | |||
plot_file=Path(local_repo) / "importance.png", | |||
plot_file=str(Path(local_repo) / "importance.png"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this change needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add_permutation_importances
expects plot_file
to be str.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't we be fixing that instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I replied to your comments, please review again.
examples/plot_model_card.py
Outdated
@@ -158,7 +158,7 @@ | |||
model_card.add_permutation_importances( | |||
importances, | |||
X_test.columns, | |||
plot_file=Path(local_repo) / "importance.png", | |||
plot_file=str(Path(local_repo) / "importance.png"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add_permutation_importances
expects plot_file
to be str.
Previously, it was annotated as only taking str.
RTD failing? |
For posterity, a quick summary of what caused the bug: When sphinx-gallery runs the examples, it seems to use a single process for all. Because of that, sklearnex's
When running the new example in isolation, the bug does not occur because patching is not happening. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the debugging @BenjaminBossan , this was nasty to deal with.
cc @ahuber21, @napetrov, your example is causing very hard to debug issues. We're removing relevant pieces from the example.
examples/plot_intelex.py
Outdated
# in faster inference times, since loading a persisted model will always load | ||
# the objects exactly as they were saved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this comment section should also be removed, I think we should remove all mentions of patch_sklearn()
since it should never been used.
As an intermediate check, you could also call |
@BenjaminBossan - thanks for reporting - fixing this here - intel/scikit-learn-intelex#1224 |
To quote:
The goal of this exercise is to go through a semi-realistic data science
and machine learning task and develop a practical solution for it. We
will learn about the following topics:
learning model and vice versa
and explain why we use them - Create a model card that provides
useful information about the model