Speed up interpolation in ensemble_copula_coupling.ConvertProbabilitiesToPercentiles #1578

btrotta-bom · 2021-10-07T06:08:17Z

Speeds up ConvertProbabilitiesToPercentiles by around 1.7x when tested on real data. It can also use multithreading for further speedups.

Testing:

[x] Ran tests and they passed OK
[x] Added new tests for the new feature(s)

codecov · 2021-10-07T06:14:15Z

Codecov Report

Merging #1578 (2e56664) into master (f05cc28) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master    #1578   +/-   ##
=======================================
  Coverage   98.07%   98.07%           
=======================================
  Files         110      110           
  Lines       10010    10023   +13     
=======================================
+ Hits         9817     9830   +13     
  Misses        193      193

Impacted Files	Coverage Δ
...semble_copula_coupling/ensemble_copula_coupling.py	`99.04% <100.00%> (-0.01%)`	⬇️
improver/ensemble_copula_coupling/utilities.py	`100.00% <100.00%> (ø)`

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
cpelley · 2021-11-09T14:51:36Z

improver/ensemble_copula_coupling/numba_utilities.py

+            break
+    max_ind = xp.shape[1]
+    min_val = fp[0]
+    max_val = fp[max_ind - 1]


If the shapes of xp and fp are indeed compatible, then you should be able to refer to the final index directly:
max_val = fp[max_ind - 1] -> max_val = fp[-1]

As it stands, fp can be longer than xp.shape[1] and your function will silently continue without raising an error.
See the above comment on sanity-checking input arrays.
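
A minimal sketch of the kind of input check meant here, assuming x and fp are 1-D and xp is 2-D with one row per interpolation problem. The helper name check_interp_inputs is hypothetical and is not taken from the PR:

```python
import numpy as np


def check_interp_inputs(x: np.ndarray, xp: np.ndarray, fp: np.ndarray) -> None:
    """Raise ValueError if the array shapes are incompatible (hypothetical helper)."""
    if x.ndim != 1 or fp.ndim != 1:
        raise ValueError("x and fp must be 1-dimensional")
    if xp.ndim != 2:
        raise ValueError("xp must be 2-dimensional")
    if xp.shape[1] != fp.shape[0]:
        raise ValueError(
            f"xp has {xp.shape[1]} columns but fp has {fp.shape[0]} values"
        )


xp = np.array([[0.0, 1.0, 2.0], [0.0, 2.0, 4.0]])
fp = np.array([10.0, 20.0, 30.0])
check_interp_inputs(np.array([0.5]), xp, fp)  # compatible shapes: no error

# Once the shapes are known to match, both spellings pick the same element.
assert fp[-1] == fp[xp.shape[1] - 1]
```

With a check like this run first, a longer fp is rejected up front instead of being silently truncated by the max_ind - 1 indexing.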

cpelley · 2021-11-09T15:08:39Z

improver_tests/ensemble_copula_coupling/test_utilities.py

+    numba_installed = False
+
+
+class TestInterpolateMultipleRows(IrisTest):


Perhaps this would be better named Test_interpolate_multiple_rows_same_y?

It might be worth adding a test which mocks both fast_interp_same_y and slow_interp_same_y to check which implementation is actually used when numba is available and which one when not (i.e. when called via interpolate_multiple_rows_same_y). Happy to provide some pointers if required.

I have made an attempt at this, but I'm not very familiar with mocking, so suggestions for improvement are welcome.
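
For illustration, one way such a test could look using unittest.mock. This is only a sketch: the `utils` namespace is a hypothetical stand-in for the real module, and the fast function here simply reuses the numpy fallback so that the example runs without numba.

```python
import unittest
from types import SimpleNamespace
from unittest import mock

import numpy as np


def slow_interp_same_y(x, xp, fp):
    """Pure-numpy fallback: interpolate each row of xp against the shared fp."""
    return np.array([np.interp(x, row, fp) for row in xp], dtype=np.float32)


def fast_interp_same_y(x, xp, fp):
    """Stand-in for the numba implementation (assumption: same signature)."""
    return slow_interp_same_y(x, xp, fp)


# Hypothetical stand-in for the module under test, so patch.object has a target.
utils = SimpleNamespace(
    numba_installed=False,
    slow_interp_same_y=slow_interp_same_y,
    fast_interp_same_y=fast_interp_same_y,
)


def interpolate_multiple_rows_same_y(x, xp, fp):
    """Dispatch to the fast version when numba is flagged as installed."""
    if utils.numba_installed:
        return utils.fast_interp_same_y(x, xp, fp)
    return utils.slow_interp_same_y(x, xp, fp)


class TestDispatch(unittest.TestCase):
    """Check which implementation is called for each value of the flag."""

    args = (np.array([0.5]), np.array([[0.0, 1.0]]), np.array([0.0, 10.0]))

    def test_slow_interp_same_y_called(self):
        with mock.patch.object(utils, "numba_installed", False), \
                mock.patch.object(utils, "slow_interp_same_y") as slow, \
                mock.patch.object(utils, "fast_interp_same_y") as fast:
            interpolate_multiple_rows_same_y(*self.args)
        slow.assert_called_once()
        fast.assert_not_called()

    def test_fast_interp_same_y_called(self):
        with mock.patch.object(utils, "numba_installed", True), \
                mock.patch.object(utils, "slow_interp_same_y") as slow, \
                mock.patch.object(utils, "fast_interp_same_y") as fast:
            interpolate_multiple_rows_same_y(*self.args)
        fast.assert_called_once()
        slow.assert_not_called()
```

Saved as a test module, this can be run with `python -m unittest`. Patching the flag together with both functions means the test does not depend on whether numba is really installed on the machine running it.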

bayliffe · 2021-11-09T15:55:38Z

@btrotta-bom I took your timing example from the ResamplePercentiles PR and used it to test this PR. I found the numba method to be around 25 times faster than the slow method, which is a significant speed-up and makes this very worthwhile.

bayliffe · 2021-11-11T11:13:55Z

improver/ensemble_copula_coupling/numba_utilities.py

+    max_ind = xp.shape[1]
+    min_val = fp[0]
+    max_val = fp[-1]
+    result = np.empty((xp.shape[0], len(x)))


This has lost the casting to float32. Otherwise I'm happy with these changes. Thanks.
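
A short illustration of the point, with made-up array sizes: np.empty allocates float64 unless a dtype is passed, so the output type has to be requested explicitly or restored with a cast at the end.

```python
import numpy as np

xp = np.zeros((3, 4), dtype=np.float32)
x = np.linspace(0.0, 1.0, 5, dtype=np.float32)

# np.empty defaults to float64 when no dtype is given.
result_default = np.empty((xp.shape[0], len(x)))

# Passing the dtype explicitly keeps the output in float32.
result_f32 = np.empty((xp.shape[0], len(x)), dtype=np.float32)

# Alternatively, calculate in float64 and cast once at the end.
result_cast = result_default.astype(np.float32)

print(result_default.dtype, result_f32.dtype, result_cast.dtype)
```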

cpelley

Hi @btrotta-bom, thank you for the changes you made.
I was going to make suggested changes in this PR, but I thought it easier for us both if I made a PR with those changes targeting your branch instead.

btrotta-bom#1

My PR:

Creates a public function, interpolate_multiple_rows_same_y, and documents how its behaviour differs depending on whether numba is available.
mock is part of unittest in recent Python versions, not an independent library, so I corrected this.
Adds an alternative (I think simpler) approach to testing which implementation is used when numba is or is not available.
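
As a rough sketch of the pattern described in the first point (not the code from the PR): a public wrapper whose docstring states when each implementation is used. The function bodies below are assumptions for illustration; both simply call np.interp row by row, calculating in 64-bit and returning 32-bit.

```python
import numpy as np

try:
    from numba import njit

    numba_installed = True
except ImportError:
    numba_installed = False


def slow_interp_same_y(x, xp, fp):
    """Fallback: interpolate each row of xp against the shared fp with numpy."""
    result = np.empty((xp.shape[0], len(x)))
    for i in range(xp.shape[0]):
        result[i, :] = np.interp(x, xp[i, :], fp)
    return result.astype(np.float32)


if numba_installed:

    @njit
    def fast_interp_same_y(x, xp, fp):
        """Compiled variant of the same loop (illustrative, not the PR's kernel)."""
        result = np.empty((xp.shape[0], len(x)))
        for i in range(xp.shape[0]):
            result[i, :] = np.interp(x, xp[i, :], fp)
        return result.astype(np.float32)


def interpolate_multiple_rows_same_y(x, xp, fp):
    """Interpolate x against every row of xp, sharing one set of y values fp.

    Uses the numba-compiled version when numba can be imported, otherwise
    falls back to the pure-numpy loop. The result is float32 in both cases.
    """
    if numba_installed:
        return fast_interp_same_y(x, xp, fp)
    return slow_interp_same_y(x, xp, fp)


x = np.array([0.5, 1.5])
xp = np.array([[0.0, 1.0, 2.0], [0.0, 2.0, 4.0]])
fp = np.array([0.0, 10.0, 20.0])
print(interpolate_multiple_rows_same_y(x, xp, fp))
```

Callers only ever import the public function, so the presence or absence of numba changes speed but not the call signature or output dtype.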

Proposed review changes for metoppv#1578

cpelley

Thanks for the contribution @btrotta-bom, and indeed for your patience! :)

Cheers

interpolate_multiple_rows_same_x is already merged with master, so you needn't worry about this now, but it might be worth adding test_slow_interp_same_x_called and test_fast_interp_same_x_called test methods similar to those added for interpolate_multiple_rows_same_y (checking which implementation is actually used with and without numba).

btrotta-bom · 2021-11-16T22:20:22Z

Thanks @cpelley and @bayliffe !

it might be worth adding test_slow_interp_same_x_called and test_fast_interp_same_x_called test methods similar to what was done to the testing of interpolate_multiple_rows_same_y (ensuring which implementation is actually used with and without numba).

I will add an issue for this

* master:
  Remove __repr__ methods from all neighbourhood plugins (#1648)
  ENH: Avoiding lazy loading in select command calls (#1617)
  MOBT-180: Weather symbol speed up (#1638)
  IM-1621: Make ECC error and warning tests more rigorous (#1641)
  Make flake8 report that it is okay when running improver-tests. (#1645)
  Update checksums after updating the title of files in apply-emos-coefficients/sites. (#1640)
  Fixes bug in spot-extraction for multi-time inputs (#1633)
  Updates checksums for threshold landmask fix (#1636)
  Update interpret-metadata (#1632)
  Weather code tree update (#1635)
  Fix noise in precip accumulation thresholds (#1627)
  Expanding on triangle time blending doc strings. (#1630)
  Better handling and documentation of dependencies (#1589)
  Add tests (#1626)
  Enhancements on new regridding code (#1560)
  Speed up interpolation in ensemble_copula_coupling.ConvertProbabilitiesToPercentiles (#1578)
  Speed up interpolation in ensemble_copula_coupling.ResamplePercentiles (#1548)
  Spot-extraction additional coordinates ordering fix (#1610)

…esToPercentiles (metoppv#1578)
* Add fast interpolation method
* Style
* Add missing file
* Isort
* Style
* Style
* Add licence and docstring
* Mock numba in sphinx autodoc
* Tell codecov to ignore numba_utilities.py
* Add more tests
* Tell black to ignore import in test
* isort
* Add noqa for spurious black result
* Remove unused code
* Fix bug
* Remove warning test
* Remove unused code
* Remove unused import
* Fix imports
* Simplify code
* Add a test
* Change method names to distinguish from functionality for other open PR
* Update type hints
* Add test
* Style
* Calculate in 64-bit, output in 32-bit
* Sort x
* Add simple tests with known result
* Style
* Restore old code for handling unordered x
* Add input checking
* Test correct version is used depending on numba
* Style
* Fix imports
* Style
* Fix mocking
* Cast result to float32
* MAINT: Proposed review suggestions
* Fix merge
* Fix interpolate_multiple_rows_same_x to use same approach as interpolate_multiple_rows_same_y
* Style
* Isort
* Fix comment

Co-authored-by: Belinda Trotta <btrotta-bom@users.noreply.github.com>
Co-authored-by: cpelley <carwyn.pelley@metoffice.gov.uk>

btrotta-bom added 8 commits October 7, 2021 16:29

Add fast interpolation method

e98c6fc

Style

a890d16

Add missing file

568bb8e

Isort

cdb587e

Style

b629240

Style

fec6a17

Add licence and docstring

adfdcdd

Merge remote-tracking branch 'upstream/master' into fast-interpolate2

514a1a5

btrotta-bom added 12 commits October 7, 2021 17:14

Mock numba in sphinx autodoc

df9f62d

Tell codecov to ignore numba_utilities.py

6f6e387

Add more tests

24cf35c

Tell black to ignore import in test

9bdd45b

isort

43af274

Add noqa for spurious black result

f8ba3ec

Remove unused code

7e68c9d

Fix bug

bc202ea

Remove warning test

f6a9d59

Remove unused code

c34f6a9

Remove unused import

7436dcb

Fix imports

9713b78

btrotta-bom changed the title ~~[WIP] Speed up interpolation in ConvertProbabilitiesToPercentiles~~ Speed up interpolation in ConvertProbabilitiesToPercentiles Oct 7, 2021

btrotta-bom added the MO review required PRs opened by non-Met Office developers that require a Met Office review label Oct 7, 2021

zfan001 self-requested a review October 12, 2021 04:11

btrotta-bom added 2 commits October 13, 2021 10:53

Simplify code

b1c8add

Add a test

97a0f40

btrotta-bom changed the title ~~Speed up interpolation in ConvertProbabilitiesToPercentiles~~ Speed up interpolation in ensemble_copula_coupling.ConvertProbabilitiesToPercentiles Oct 13, 2021

btrotta-bom added 3 commits October 13, 2021 11:46

Change method names to distinguish from functionality for other open PR

f2eaaff

Update type hints

20784f0

Add test

867f37b

cpelley requested changes Nov 9, 2021

btrotta-bom added 6 commits November 10, 2021 09:53

Add input checking

75f88a0

Test correct version is used depending on numba

945a21b

Style

b535a9b

Fix imports

91e9b9f

Style

ba624ad

Fix mocking

7e24763

bayliffe reviewed Nov 11, 2021

Cast result to float32

ad6c563

bayliffe previously approved these changes Nov 12, 2021

MAINT: Proposed review suggestions

62551b1

cpelley requested changes Nov 12, 2021

Merge pull request #1 from cpelley/1578_SUGGESTED_REVIEW_CHANGES

f803c1c

Proposed review changes for metoppv#1578

btrotta-bom dismissed bayliffe’s stale review via f803c1c November 15, 2021 22:50

btrotta-bom added 6 commits November 16, 2021 11:41

Merge

111f0a4

Fix merge

34d0ba8

Fix interpolate_multiple_rows_same_x to use same approach as interpol…

25c41cd

…ate_multiple_rows_same_y

Style

7a30eb5

Isort

c380752

Fix comment

2e56664

cpelley self-requested a review November 16, 2021 11:05

cpelley approved these changes Nov 16, 2021

btrotta-bom merged commit f6844af into metoppv:master Nov 16, 2021

btrotta-bom mentioned this pull request Nov 16, 2021

Add tests for interpolate_multiple_rows_same_x #1623

Closed
