Skip to content

Reuse npse_trained_model fixure for test_npse_map #1461

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 55 commits into
base: main
Choose a base branch
from

Conversation

vagechirkov
Copy link
Collaborator

@vagechirkov vagechirkov requested a review from janfb March 17, 2025 16:13
@vagechirkov vagechirkov self-assigned this Mar 17, 2025
Copy link

codecov bot commented Mar 17, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 78.70%. Comparing base (a1d7555) to head (7276ad4).
Report is 16 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff             @@
##             main    #1461       +/-   ##
===========================================
- Coverage   89.60%   78.70%   -10.90%     
===========================================
  Files         121      127        +6     
  Lines        9357    10025      +668     
===========================================
- Hits         8384     7890      -494     
- Misses        973     2135     +1162     
Flag Coverage Δ
unittests 78.70% <ø> (-10.90%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

see 58 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Contributor

@janfb janfb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thanks!
See question in the comments.

What is the speed up you notice through this change?

@janfb
Copy link
Contributor

janfb commented Mar 18, 2025

CI is failing due to testmon error. this will be fixed in a separate PR

@vagechirkov
Copy link
Collaborator Author

What is the speed up you notice through this change?

Basically, we save the time of training a model specifically for `test_npse_map'. But now it would be much cheaper to add more tests that can reuse specific trained models.

@janfb
Copy link
Contributor

janfb commented Mar 18, 2025

One concern I have @vagechirkov is that now all the tests here are marked as slow, right?

Is there a way to have one of the c2st tests "not slow"? e.g., the one with the default settings we use in the class?

@vagechirkov
Copy link
Collaborator Author

Hey @janfb, I think this is more or less how the final test structure should look like, so ready for review 🙃

@vagechirkov vagechirkov requested a review from janfb March 20, 2025 10:18
@janfb
Copy link
Contributor

janfb commented Mar 20, 2025

Hey @janfb, I think this is more or less how the final test structure should look like, so ready for review 🙃

Cool, I will review it now.

Copy link
Contributor

@janfb janfb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks a lot @schroedk and @vagechirkov ! 👏

it all looks very clean and efficient now. Added a couple of questions for my understanding and some suggestions

@vagechirkov
Copy link
Collaborator Author

An issue with snapshot tests: syrupy-project/syrupy#438

@schroedk
Copy link
Collaborator

@vagechirkov @janfb my suggestion for this PR would be:

  • revert to the previous version of the test file linearGaussian_npse_test.py
  • add pytest.mark.slow to the tests
  • add the regression test (rename from test_snapshot -> linearGaussian_npse_test_regression.py)
  • remove the ambr file (and also clean the history, as I forgot to put it into lfs, my bad:(...)

This would not show the refactoring of the test (which we aimed for at the beginning), but at least it shows the usage of pytest-regression. What do you think?

@vagechirkov
Copy link
Collaborator Author

vagechirkov commented Mar 21, 2025

Alright, I think I’ve finally tracked down the issue 🎉🫠 (was seriously starting to question my sanity). Turns out it’s all about the combination of parameters — nothing to do with how we wrote the tests 🤪 I used the pytest-timeout plugin with @pytest.mark.timeout(60) to catch it. These are the tests that take longer than 60 seconds to complete:

  • uniform_prior-dim_3-ve-fnpe-sde-trials_8
  • uniform_prior-dim_3-ve-auto_gauss-sde-trials_8
  • uniform_prior-dim_3-ve-jac_gauss-sde-trials_8
  • uniform_prior-dim_3-vp-fnpe-sde-trials_8
  • uniform_prior-dim_3-subvp-fnpe-sde-trials_8

This means, in turn, that this seems to be a separate issue. What do you think, @janfb?

P.S.
The reason we didn’t hit this issue in test_spnapshot.py is because there’s actually a bug there — we’re not really using the number of trials 🙈 @schroedk

@vagechirkov
Copy link
Collaborator Author

Here is the longer list of slow regression tests:

image

@vagechirkov
Copy link
Collaborator Author

Very nice refactoring, thanks @schroedk! Regression tests run locally for me in 4min without the problem reported here: #1473

Comment on lines +159 to +161
# ToDO check if there is a bug for prior_type='uniform' and num_trial>1
# ToDO validate bug for combination sde_type='ode' and num_trial>1
# ToDO investigate non-determinism for prior_type=None, dim>= 2, ve, auto/jac_gauss
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wondering if we should mark these tests (non-deterministic and those that take too long) as xfail and set the timeout to, say, 20 seconds. This way, we would cover all possible combinations with the regression tests, and failing tests could be resolved later in the separate PRs.

@vagechirkov vagechirkov requested a review from janfb March 24, 2025 09:20
* extract map test
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants