
Fix notebook tests #550

Merged: 19 commits from fix-notebook-tests into develop on Apr 5, 2024

Conversation

@jakobkruse1 (Collaborator) commented Mar 28, 2024

Description

Currently, many notebook tests fail due to timeouts or other bugs.
This branch aims to solve this by doing the following:

  • Increase the speed of the notebooks or increase the timeout
  • Make successful notebook tests mandatory
  • Ensure that the entire job does not run too long (same runtime as the regular test jobs), so as not to annoy developers

Pytest's timeout is set to 300 seconds; running all 10 notebooks within that limit would be nice. The other option is to split the notebook tests, as is done for the regular tests.
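For reference, a per-test timeout like this is usually supplied by the pytest-timeout plugin. A sketch of how it might be configured (the file name and values here are illustrative, not taken from this repository):

```ini
# pytest.ini — sketch; assumes the pytest-timeout plugin is installed
[pytest]
timeout = 300
```

Individual slow notebooks could then be given a larger limit with the plugin's `@pytest.mark.timeout(...)` marker instead of raising the global value.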

Fixed notebook errors

  • data_oob: Timeout
  • influence_imagenet: Timeout (only locally, not in CI)
  • influence_sentiment_analysis: RuntimeError and too slow in CI
  • least_core_basic: ValueError when solver does not converge (NaN)
  • least_core_basic: Too slow
  • msr_banzhaf_spotify and shapley_basic_spotify: KeyError 'Billie Eilish' happens sometimes (randomness)
  • shapley_utility_learning: Timeout

Checklist

  • If notebooks were added/changed, added boilerplate cells are tagged with "tags": ["hide"] or "tags": ["hide-input"]

@jakobkruse1 jakobkruse1 added bug Something isn't working CI GH actions for testing and packaging labels Mar 28, 2024
@jakobkruse1 jakobkruse1 self-assigned this Mar 28, 2024
@mdbenito (Collaborator)

@jakobkruse1 It's awesome that notebooks run again! Thanks for this.

I guess you'll have noticed, but just in case:

  • Notebooks should always run with fewer resources in CI. We check with os.getenv('CI'), e.g. to retrieve just a fraction of the data, but we might also want to change some parameters, such as the number of iterations, to a bare minimum. If a notebook fails to do this, it must be fixed.
  • Results from automatically run notebooks don't really matter, because we must manually run them with the whole data (which could take hours for one notebook) and commit them rendered (which reminds me that we must move them to LFS once and for all, something we should've done from the beginning 😒)

So, maybe one can instead check for CI or PYTEST_RUNNING (which we set with a flag when calling pytest) or something like that. Of course this should be a function in the support module for the notebooks, etc.
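A support-module helper along these lines might look as follows. This is a minimal sketch assuming the environment variables mentioned above; the function names (`running_in_ci`, `subset_size`) and the PYTEST_RUNNING flag are illustrative, not taken from the actual codebase:

```python
import os


def running_in_ci() -> bool:
    """Return True under CI or pytest.

    Hypothetical helper for the notebooks' support module: checks the CI
    variable set by GitHub Actions and a PYTEST_RUNNING variable that
    would be set via a flag when invoking pytest.
    """
    return bool(os.getenv("CI")) or bool(os.getenv("PYTEST_RUNNING"))


def subset_size(full_size: int, fraction: float = 0.1) -> int:
    """Shrink the dataset to a fraction when resources are limited."""
    if running_in_ci():
        return max(1, int(full_size * fraction))
    return full_size
```

A notebook would then call `subset_size(len(data))` once near the top instead of sprinkling `os.getenv("CI")` checks through its cells.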

@jakobkruse1 jakobkruse1 marked this pull request as ready for review April 2, 2024 13:23
@mdbenito (Collaborator) left a comment

Thanks for this. Note that we use pytest-split for the other tests. Is it also active for the notebook tests? And if yes, shouldn't we update the .test_durations file?

notebooks/least_core_basic.ipynb (review thread, resolved)
notebooks/influence_sentiment_analysis.ipynb (review thread, resolved)
@jakobkruse1 (Collaborator, Author)

@mdbenito I am not using pytest-split for the notebooks. I am using os.getenv("CI"), which keeps the runtime of the notebooks acceptable, so splitting is not required for now. It may become necessary in the future to keep runtimes acceptable.

@mdbenito (Collaborator) commented Apr 3, 2024

What is "acceptable"? Wouldn't it make sense to add pytest-split while you are at it? We will have more notebooks, probably many of them.

@jakobkruse1 (Collaborator, Author)

This PR is ready to merge from my side now @mdbenito @AnesBenmerzoug. Could somebody review it again? Also, I can't merge this myself; somebody else would have to do that :)

@AnesBenmerzoug (Collaborator)

@jakobkruse1 I just had a quick look at the CI run times for the notebook tests and they seem to vary wildly. You should run the notebook tests using tox -e notebook-tests -- --store-durations in order to record the duration of each notebook run so that they're split more evenly (See this section in the contributing docs for more information).

[Screenshots: CI run times for the notebook test jobs]

Other than that, the only thing I would look into is the rendering of the notebooks in the documentation (you could check that by building the docs locally with mkdocs serve and navigating to the Examples section).
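For context on the suggestion above: pytest-split's --store-durations option writes a .test_durations file, a JSON mapping from test IDs to their last measured runtimes in seconds, which the plugin then uses to balance the test groups. An illustrative sketch (the test IDs and numbers below are made up, though the notebook names appear in this PR):

```json
{
  "notebooks/test_notebooks.py::test_notebook[data_oob]": 145.2,
  "notebooks/test_notebooks.py::test_notebook[shapley_basic_spotify]": 88.7
}
```

Committing this file after re-recording durations is what keeps the CI split groups from varying wildly.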

@jakobkruse1 (Collaborator, Author)

> @jakobkruse1 I just had a quick look at the CI run times for the notebook tests and they seem to vary wildly. You should run the notebook tests using tox -e notebook-tests -- --store-durations in order to record the duration of each notebook run so that they're split more evenly (See this section in the contributing docs for more information).
>
> Other than that, the only thing I would look into would be the rendering of the notebooks in the documentation (You could check that by building the docs locally using mkdocs serve and navigate to the Examples section)

Done @AnesBenmerzoug

@AnesBenmerzoug (Collaborator) left a comment

Looks good to me! I will merge it later this afternoon if Miguel doesn't leave any other comment.

@AnesBenmerzoug AnesBenmerzoug merged commit a2d3eae into develop Apr 5, 2024
35 checks passed
@mdbenito mdbenito deleted the fix-notebook-tests branch April 23, 2024 13:24