Replace daily lists in HydrologyModel with array implementation #517

Open
davidorme opened this issue Jul 12, 2024 · 0 comments

Is your feature request related to a problem? Please describe.

At the moment, the hydrology model tracks daily values for variables within a time step by appending daily arrays to a list and then using np.stack to combine them at the end to get statistics across the time step.

# Create lists for output variables to store daily data
daily_lists: dict = {name: [] for name in self.vars_updated}
for day in np.arange(days):
    # Interception of water in canopy, [mm]
    interception = above_ground.calculate_interception(
        leaf_area_index=hydro_input["leaf_area_index_sum"],
        precipitation=hydro_input["current_precipitation"][:, day],
        intercept_parameters=self.model_constants.intercept_parameters,
        veg_density_param=self.model_constants.veg_density_param,
    )
    # TODO add canopy evaporation
    # Precipitation that reaches the surface per day, [mm]
    precipitation_surface = (
        hydro_input["current_precipitation"][:, day] - interception
    )
    daily_lists["precipitation_surface"].append(precipitation_surface)

Since the number of days is known (and will always be known?), this can be done by declaring a days x grid cell numpy array and inserting daily rows. If the code below is a fair comparison of the two approaches, then this is faster and cleaner. I have to admit I was expecting a bigger speed-up, but matrix insertion is faster in every case tested, and the advantage is greatest at the largest number of cells.

In [1]: import timeit
   ...: import numpy as np
   ...: 
   ...: 
   ...: def stacking_arrays(n_cells=10, n_days=30):
   ...: 
   ...:     val = []
   ...:     for day in np.arange(n_days):
   ...:         val.append(np.arange(n_cells))
   ...: 
   ...:     return np.sum(np.stack(val, axis=1), axis=1)
   ...: 

In [2]: def matrix_insertion(n_cells=10, n_days=30):
   ...: 
   ...:     val = np.empty((n_days, n_cells))
   ...:     for day in np.arange(n_days):
   ...:         val[day, :] = np.arange(n_cells)
   ...: 
   ...:     return np.sum(val, axis=0)
   ...: 

In [3]: 
   ...: assert np.allclose(stacking_arrays(),  matrix_insertion())
   ...: 
   ...: number = 1000

In [4]: np.min(timeit.repeat("stacking_arrays()", globals=globals(), number=number))
Out[4]: np.float64(0.02308262512087822)

In [5]: np.min(timeit.repeat("matrix_insertion()", globals=globals(), number=number))
Out[5]: np.float64(0.0157989589497447)

In [6]: np.min(timeit.repeat("stacking_arrays(n_cells=1000)", globals=globals(), number=number))
Out[6]: np.float64(0.05126216681674123)

In [7]: np.min(timeit.repeat("matrix_insertion(n_cells=1000)", globals=globals(), number=number))
Out[7]: np.float64(0.03652237495407462)

In [8]: np.min(timeit.repeat("stacking_arrays(n_cells=5000)", globals=globals(), number=number))
Out[8]: np.float64(0.2828623750247061)

In [9]: np.min(timeit.repeat("matrix_insertion(n_cells=5000)", globals=globals(), number=number))
Out[9]: np.float64(0.13297262508422136)
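For reference, the ratios implied by the timings above can be worked out directly. This is just arithmetic on the numbers reported in the session (the `timings` and `speedup` names are only for illustration), and it comes out at roughly 1.5x for 10 cells, 1.4x for 1000 cells and 2.1x for 5000 cells:

```python
# Minimum times (seconds per 1000 calls) copied from the session above,
# stored as n_cells -> (stacking_arrays, matrix_insertion).
timings = {
    10: (0.02308262512087822, 0.0157989589497447),
    1000: (0.05126216681674123, 0.03652237495407462),
    5000: (0.2828623750247061, 0.13297262508422136),
}

# Speed-up of matrix insertion relative to list stacking for each grid size.
speedup = {
    n_cells: stacking / insertion
    for n_cells, (stacking, insertion) in timings.items()
}

for n_cells, ratio in speedup.items():
    print(f"n_cells={n_cells}: {ratio:.2f}x")
```

These are single-machine timings, so the exact ratios will likely vary elsewhere.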

Describe the solution you'd like

Switch to matrix insertion, I think.
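As a rough sketch of what that could look like for the `daily_lists` dictionary, the lists could become preallocated `(n_days, n_cells)` arrays that are filled one row per day. Everything here is a simplified assumption rather than the actual model code: `run_daily_loop` is a hypothetical function, and `interception_fraction` is a stand-in for the real `calculate_interception` call.

```python
import numpy as np


def run_daily_loop(precipitation, interception_fraction, vars_updated):
    """Accumulate daily values in preallocated arrays (illustrative only).

    precipitation has shape (n_cells, n_days), matching the
    hydro_input["current_precipitation"][:, day] indexing in the issue.
    """
    n_cells, n_days = precipitation.shape

    # One (n_days, n_cells) array per output variable instead of a list.
    daily = {name: np.empty((n_days, n_cells)) for name in vars_updated}

    for day in range(n_days):
        # Hypothetical stand-in for above_ground.calculate_interception().
        interception = interception_fraction * precipitation[:, day]

        # Insert the daily row rather than appending to a list.
        daily["precipitation_surface"][day, :] = (
            precipitation[:, day] - interception
        )

    # Statistics across the time step are reductions along the day axis.
    return {name: values.sum(axis=0) for name, values in daily.items()}


result = run_daily_loop(
    precipitation=np.ones((3, 4)),
    interception_fraction=0.25,
    vars_updated=["precipitation_surface"],
)
```

One thing to watch with `np.empty` is that every variable in `vars_updated` must be written on every day, otherwise the untouched rows contain arbitrary values; `np.full(..., np.nan)` would make a missed write easier to spot.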

@davidorme davidorme added this to the Hydrology milestone tasks milestone Sep 24, 2024