-
Notifications
You must be signed in to change notification settings - Fork 90
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Speed up interpolation in ensemble_copula_coupling.ConvertProbabilitiesToPercentiles #1578
Conversation
Codecov Report
@@ Coverage Diff @@
## master #1578 +/- ##
=======================================
Coverage 98.07% 98.07%
=======================================
Files 110 110
Lines 10010 10023 +13
=======================================
+ Hits 9817 9830 +13
Misses 193 193
Continue to review full report at Codecov.
|
break | ||
max_ind = xp.shape[1] | ||
min_val = fp[0] | ||
max_val = fp[max_ind - 1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if the shapes of xp
and fp
are indeed compatible then you should just be able to refer to the final index.
max_val = fp[max_ind - 1]
-> max_val = fp[-1]
As it stands, fp
size can be longer than xp.shape[1]
and your function will silently continue without trouble.
See above comment on sanity checking input arrays.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
numba_installed = False | ||
|
||
|
||
class TestInterpolateMultipleRows(IrisTest): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps this would be better named Test_interpolate_multiple_rows_same_y
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be worth adding a test which mocks both fast_interp_same_y
and slow_interp_same_y
to check which implementation is actually used when numba is available and which one when not (i.e. when called via interpolate_multiple_rows_same_y
). Happy to provide some pointers if required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have made an attempt at this, but I'm not very familiar with mocking, so suggestions for improvement are welcome.
@btrotta-bom I took your timing example from the ResamplePercentiles PR and used it to test this PR. I found the numba method to be around 25 times faster than the slow method, which is a significant speed up, making this very worthwhile. |
max_ind = xp.shape[1] | ||
min_val = fp[0] | ||
max_val = fp[-1] | ||
result = np.empty((xp.shape[0], len(x))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has lost the casting to float32. Otherwise I'm happy with these changes. Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @btrotta-bom, thank you for the changes you made.
I was going make suggested changes in this PR but I thought it easier for us both if I make a PR with those changes targeted towards your branch instead.
My PR:
- Creates a public function
interpolate_multiple_rows_same_y
, documenting the conditions of change of behaviour between numba vs no numba availability. - mock is part of unittest in recent Python, not an independent library. So corrected this.
- Added an alternative (I think simpler) approach to testing what implementation is used when numba is available or not (in the testing).
Proposed review changes for metoppv#1578
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your for the contribution @btrotta-bom and indeed your patience! :)
Cheers
interpolate_multiple_rows_same_x
is already merged with master so you needn't worry about this now, butit might be worth adding test_slow_interp_same_x_called
and test_fast_interp_same_x_called
test methods similar to what was done to the testing of interpolate_multiple_rows_same_y
(ensuring which implementation is actually used with and without numba).
Thanks @cpelley and @bayliffe !
I will add an issue for this |
* master: Remove __repr__ methods from all neighbourhood plugins (#1648) ENH: Avoiding lazy loading in select command calls (#1617) MOBT-180: Weather symbol speed up (#1638) IM-1621: Make ECC error and warning tests more rigorous (#1641) Make flake8 report that it is okay when running improver-tests. (#1645) Update checksums after updating the title of files in apply-emos-coefficients/sites. (#1640) Fixes bug in spot-extraction for multi-time inputs (#1633) Updates checksums for threshold landmask fix (#1636) Update interpret-metadata (#1632) Weather code tree update (#1635) Fix noise in precip accumulation thresholds (#1627) Expanding on triangle time blending doc strings. (#1630) Better handling and documentation of dependencies (#1589) Add tests (#1626) Enhancements on new regridding code (#1560) Speed up interpolation in ensemble_copula_coupling.ConvertProbabilitiesToPercentiles (#1578) Speed up interpolation in ensemble_copula_coupling.ResamplePercentiles (#1548) Spot-extraction additional coordinates ordering fix (#1610)
* upstream/master: Remove __repr__ methods from all neighbourhood plugins (#1648) ENH: Avoiding lazy loading in select command calls (#1617) MOBT-180: Weather symbol speed up (#1638) IM-1621: Make ECC error and warning tests more rigorous (#1641) Make flake8 report that it is okay when running improver-tests. (#1645) Update checksums after updating the title of files in apply-emos-coefficients/sites. (#1640) Fixes bug in spot-extraction for multi-time inputs (#1633) Updates checksums for threshold landmask fix (#1636) Update interpret-metadata (#1632) Weather code tree update (#1635) Fix noise in precip accumulation thresholds (#1627) Expanding on triangle time blending doc strings. (#1630) Better handling and documentation of dependencies (#1589) Add tests (#1626) Enhancements on new regridding code (#1560) Speed up interpolation in ensemble_copula_coupling.ConvertProbabilitiesToPercentiles (#1578) Speed up interpolation in ensemble_copula_coupling.ResamplePercentiles (#1548) Spot-extraction additional coordinates ordering fix (#1610)
…esToPercentiles (metoppv#1578) * Add fast interpolation method * Style * Add missing file * Isort * Style * Style * Add licence and docstring * Mock numba in sphinx autodoc * Tell codecov to ignore numba_utilities.py * Add more tests * Tell black to ignore import in test * isort * Add noqa for spurious black result * Remove unused code * Fix bug * Remove warning test * Remove unused code * Remove unused import * Fix imports * Simplify code * Add a test * Change method names to distinguish from functionality for other open PR * Update type hints * Add test * Style * Calculate in 64-bit, output in 32-bit * Sort x * Add simple tests with known result * Style * Restore old code for handling unordered x * Add input checking * Test correct version is used depending on numba * Style * Fix imports * Style * Fix mocking * Cast result to float32 * MAINT: Proposed review suggestions * Fix merge * Fix interpolate_multiple_rows_same_x to use same approach as interpolate_multiple_rows_same_y * Style * Isort * Fix comment Co-authored-by: Belinda Trotta <btrotta-bom@users.noreply.github.com> Co-authored-by: cpelley <carwyn.pelley@metoffice.gov.uk>
* upstream/master: Remove __repr__ methods from all neighbourhood plugins (metoppv#1648) ENH: Avoiding lazy loading in select command calls (metoppv#1617) MOBT-180: Weather symbol speed up (metoppv#1638) IM-1621: Make ECC error and warning tests more rigorous (metoppv#1641) Make flake8 report that it is okay when running improver-tests. (metoppv#1645) Update checksums after updating the title of files in apply-emos-coefficients/sites. (metoppv#1640) Fixes bug in spot-extraction for multi-time inputs (metoppv#1633) Updates checksums for threshold landmask fix (metoppv#1636) Update interpret-metadata (metoppv#1632) Weather code tree update (metoppv#1635) Fix noise in precip accumulation thresholds (metoppv#1627) Expanding on triangle time blending doc strings. (metoppv#1630) Better handling and documentation of dependencies (metoppv#1589) Add tests (metoppv#1626) Enhancements on new regridding code (metoppv#1560) Speed up interpolation in ensemble_copula_coupling.ConvertProbabilitiesToPercentiles (metoppv#1578) Speed up interpolation in ensemble_copula_coupling.ResamplePercentiles (metoppv#1548) Spot-extraction additional coordinates ordering fix (metoppv#1610)
Speeds up
ConvertProbabilitiesToPercentiles
by around 1.7x when tested on real data. Also can use multithreading for further speedups.Testing: