Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stabilize / smooth / weights #133

Merged
merged 33 commits into from
Feb 18, 2021

Conversation

amoodie
Copy link
Member

@amoodie amoodie commented Feb 2, 2021

summary

This PR changes internal model behavior. In particular, I have unwound several changes in the model (either by us or inherited) with respect to the original Matlab code.

In particular, this PR:

  1. modifies (and separates) the water and sediment routing, to restore the weighting choices
  2. changes the order of topographic diffusion to occur after sand routing, before mud routing
  3. changes the behavior of _check_for_loops to now relocate looped parcels (same as before), but only disqualify the parcel from contributing to the free surface if that parcel looped at a stage > H_SL.

There are also a number of refactoring changes, which hopefully make the water routing more intuitive and less likely to break on user changes.

changes

weighting

  • Water routing should not be allowed to remain at the same cell on each step, but in the previous model it could.
  • Sediment routing should not depend on the stage, but in the previous model code it did.
  • Both water and sediment routing did not adequately handle all edge cases. I think I have fixed these issues in this PR.

smoothing

  • Water surface smoothing previously mishandled arrays in memory, changing values of the array while smoothing, leading to a less than smooth outcome. The difference in the new vs old method applied to the same eta is subtle, but apparent.
    image More extreme, is the difference in the stage accumulated over the duration of a run between two runs in the old (left) vs new code (right).
    image
  • topographic smoothing was previously after all sediment routing. It is unclear why this change was made (see Topo diffusion before or after mud routing? #30), but I have reversed the change, so the smoothing now occurs immediately following sand routing.
  • topographic smoothing implemented by kernel convolution produced a weaker, albeit very similar effect, to the original smoothing implementation. This was wrong. I may have made a mistake when I was comparing the original implementation with the kernel implementation. It seems the results are nearly identical (0.001 m difference). I'm going to roll back this change. I.e., the topo diffusion code is the same, but the order is changed (as above).

qualitative checks

visual

This PR is a breaking change to the consistent checks. Therefore, I have conducted a large number of longer simulations to highlight the improved qualities of the model, as a qualitative check.

clearly broken case

This was a condition Jay did some runs under, which produced a garbage result.

old code new code
image eta_00285
out_dir: '/scratch/300yrs_nobedrock_new'
Length: 10000.0
Width: 18000.0
# should be about 300 years simulated
timesteps: 10000
# 50m per cell
dx: 50.0
# fix seed
seed: 0
# number of water loops
itermax: 2
# land buffer is 3 cells deep 
L0_meters: 150.0
# characteristic slope, little steeper than default of 0.0002
S0: 0.0003
# number of parcels
Np_water: 3000
Np_sed: 3000
# input velocity 1 m/s
u0: 1.0
# channel width of 5 cells
N0_meters: 250.0
# initial channel depth of 5
h0: 5.0
# 5 mm/yr SLR
SLR: 6e-9
# 50/50 input sediment fraction
f_bedload: 0.5
# do less smoothing than default of 5
Nsmooth: 3
# save approx every 10 bankfull days (1 yrs worth)
save_dt: 864000
# save strata
save_strata: True
# save eta figures
save_eta_figs: True
# save grids that will enable a 'soft' checkpoint
# aka output can be used as initial condition for another run
save_eta_grids: True
save_velocity_grids: True
save_depth_grids: True
save_stage_grids: True

long high slr rate

In these runs, the sea level rise rate is ~6 mm/yr. Below is a set of four runs with identical boundary conditions. Each column is just an ensemble with different seeds.

old code new code
image eta_00454
image eta_00454
image eta_00454
image eta_00454
out_dir: '/scratch/diff_test'
Length: 10000.
Width: 20000.
dx: 100
verbose: 1
N0_meters: 400
Np_water: 2000
Np_sed: 2000
maxiter: 2
C0_percent: 0.05
SLR: 2e-9
h0: 3
u0: 1.0
save_eta_figs: True
save_eta_grids: True
save_depth_grids: True
save_velocity_grids: True
save_strata: True
save_dt: 864000
f_bedload: 0.3

ensemble: 4

two cases, muddy/sandy

These are the same parameters I used for simulations in my AGU presentation, so it is good to verify the condition set still produces good results.

muddy sandy
f_bedload: 0.4 f_bedload: 0.75
https://user-images.githubusercontent.com/8801322/107657337-b41b4d80-6c4a-11eb-9aab-fc77660ddbdd.mp4 https://user-images.githubusercontent.com/8801322/107657771-16744e00-6c4b-11eb-9399-2ccbdd0a0e00.mp4
out_dir: '/scratch/sandy_muddy_stack'
Length: 5000.
Width: 10000.
dx: 50
verbose: 1
N0_meters: 500
Np_water: 2000
Np_sed: 2000
C0_percent: 0.05
SLR: 1.5e-8
h0: 2
u0: 1.1
coeff_U_dep_mud: 0.5
coeff_U_ero_mud: 1.5
coeff_U_ero_sand: 1.05
save_eta_figs: True
save_eta_grids: True
save_stage_grids: True
save_depth_grids: True
save_discharge_grids: True
save_velocity_grids: True
save_sedflux_grids: True
save_strata: True
save_dt: 432000
save_checkpoint: True
seed: 4070530382

matrix:
  f_bedload:
    - 0.4
    - 0.75

docs bug

Additionally, this PR resolves the issues with the documentation (I think). As a result of the changes in sediment routing, the bug encountered in #107 #75 is no longer an issue.

Anecdotally, I have run the documentation building process locally about a dozen times and not encountered the issue at all.
image

Note, that this appears different than before, because default parameters have changed.

submerge: water refactoring

In a separate branch, I refactored water routing on top of these changes. This has been added to this PR, for simplicity, but below I document that the refactor is an identical refactor.

new without refactored water new with refactored water
image image

closes #30 #107 #75 #9 #48

Andrew Moodie added 17 commits February 5, 2021 15:16
…ography smoothing, all to try and resolve stability issues and broadcasting bugs.
…from original Matlab approach. Still, the order of operations will remain changed.
…o any non-wall cell, this is consistent with the Matlab implementation (mostly). Matlab implementation tries 5 times to find a wet cell to route into, if it fails, just uses the fifth try as next step (so a dry cell). Here, we look for any wet, if they exist, use on of those, otherwise choose one of the dry ones randomly. I believe the effect of this is similar, though the implementations differ.
…reformat some code, update the smooth surface function to use any non-edge cell in conditional, rather than any non-land cell.
…ys (looped and free surface flag) to be updated. Free surf flag now begins iteration with all parcels being valid, and invalidates them once they have been found to be looping. Parcels that reach the boundary impart an exit flux on the final cell.
@amoodie amoodie force-pushed the stabilize_smooth_weights branch from 3411b5b to 6a716d0 Compare February 8, 2021 16:35
Andrew Moodie added 7 commits February 8, 2021 14:46
…nal matlab implementation, but not an exact copy. This implementation is also considerably slower than the previous implementation (but needed to be fixed). Maybe there is some way to improve the speed here. This commit also includes changes to 32bit arrays for most of the properties of the delta. This might help speed up computation a bit, but will make it simpler to track for jitted objects. Other significant code style clean up.
@amoodie amoodie added the enhancement New feature or request label Feb 11, 2021
@amoodie amoodie marked this pull request as ready for review February 11, 2021 17:46
@amoodie
Copy link
Member Author

amoodie commented Feb 11, 2021

@ericbarefoot @elbeejay I requested your reviews on this. I don't expect you to check every detail, but maybe you can check out the changes in the water_tools and the sed_tools files, to see that you generally approve of the changes, and you think they make sense.

This PR is a significant change in the model, including some reversions to the original Matlab approach. It will hopefully fix a lot of our issues / bugs.

If we want to, we could do a release just before merging this, to tag that point in time, in case we need to revert. We could then make another release immediately following the merge. Let me know your thoughts, thanks!

@codecov-io
Copy link

codecov-io commented Feb 11, 2021

Codecov Report

Merging #133 (1e75e28) into develop (fdd3b76) will decrease coverage by 3.40%.
The diff coverage is 40.25%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop     #133      +/-   ##
===========================================
- Coverage    81.73%   78.32%   -3.41%     
===========================================
  Files           11       11              
  Lines         2152     2275     +123     
===========================================
+ Hits          1759     1782      +23     
- Misses         393      493     +100     
Impacted Files Coverage Δ
pyDeltaRCM/preprocessor.py 91.07% <0.00%> (-1.72%) ⬇️
pyDeltaRCM/sed_tools.py 33.91% <16.98%> (-2.81%) ⬇️
pyDeltaRCM/water_tools.py 52.38% <31.48%> (-10.64%) ⬇️
pyDeltaRCM/shared_tools.py 51.72% <33.33%> (+1.72%) ⬆️
pyDeltaRCM/debug_tools.py 85.85% <72.22%> (-7.99%) ⬇️
pyDeltaRCM/init_tools.py 99.45% <100.00%> (-0.28%) ⬇️
pyDeltaRCM/iteration_tools.py 96.22% <100.00%> (+0.01%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fdd3b76...1e75e28. Read the comment docs.

@amoodie amoodie mentioned this pull request Feb 14, 2021
Copy link
Member

@elbeejay elbeejay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall I think this PR is quite comprehensive, and the code structure/comments made it fairly straightforward to follow. I've split my more general questions/comments up by category (water, sed, shared). Core model code looks good to me (I didn't check over the unit tests, but am assuming those are okay). Pretty amazing how what look like relatively small "incremental" changes, compounded to produce unrealistic behavior.

water routing

  1. changes the behavior of _check_for_loops to now relocate looped parcels (same as before), but only disqualify the parcel from contributing to the free surface if that parcel looped at a stage > H_SL.

To clarify/confirm my understanding of this change, this is done to prevent the surface from being raised too high above sea level? Presumably due to an artificially deep cell where the topography has been lowered, and not counting looped parcels would leave the free surface too high (by retaining the old value and not updating it, or averaging other non-looped parcel paths)? Otherwise in the finalize_free_surface, all water surfaces below sea level are set to sea level anyway.

Above the _get_water_weight_array() function there's commented out code specifying the expected input types in the @njit decorator. Would explicitly adding the types further speed up the jitted functions? If so I think we should open up an issue to remind us to implement that for all jitted functions sometime in the future.

In _accumulate_free_surface_walks(), where the dH values are set within the delta I think it could be good to either add a comment, or open an issue to indicate that the if-else could be further split up to incorporate water surface assignment based on a backwater profile. This is something I've never played with, but was present in the Matlab implementation, @ericbarefoot have you?

sed routing

Just a comment, but I agree with your switch from 64-bit floats to 32-bit floats. This model isn't mass-conservative, so I don't think we lose anything by moving to the smaller float.

Rest of the sediment routing looks good to me, the gamma catch was a good one. I think I had one minor comment inline for this section.

shared

In the function get_start_indices() within shared_tools.py the function takes inlet_weights as an input argument. The conventional (as implemented) pyDeltaRCM model always passes in an array of 1s here, so is it worth keeping that input argument? I can see the case for it being more "flexible" for custom applications...

Comment on lines +719 to +723
has_repeat_ind = False
for _, iind in enumerate(relv_walk):
if iind == new_ind:
has_repeat_ind = True
break
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to speed-check this, but a less-loopy, numba-compatible check for this could be:

# determine if has a repeat ind
_dup = np.where(new_ind == relv_walk)
if np.shape(_dup)[1] > 0:
     has_repeat_ind = True
else:
     has_repeat_ind = False

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the following gives the same result (passes consistency tests), but is slightly slower.

_dup = np.where(new_ind == relv_walk)[0]
if _dup.size > 0:
    has_repeat_ind = True
else:
    has_repeat_ind = False
meth ncalls tot percall
current implementation 6533 9.109 0.001
above implementation 6533 9.449 0.001

I think the current implementation is slightly more readable and is marginally faster. Generally speaking, numba can do really well to optimize loops.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense, thanks for doing the comparison. Just figured it was worth considering

weight = weight / np.sum(weight)

elif np.sum(weight) == 0:
# if all weights are zeros/nan
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the event of a nan value, the np.sum(weight) will evaluate to nan I think, so would be good to revise this comment and update the error message for the else case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I will revise the comment and add a check for nans in this set of if-thens. FWIW I removed the "safety" of the np.nansum() because 1) it made it hard to read the logic with 0s and np.nans, and 2) the nansum is way slower than the sum.

"""Limit change in volume to 1/4 of a cell volume.

Function is used by multiple pathways in `mud_dep_ero` and `sand_dep_ero`
but with different inputs for the sediment volume (:obj:`Vp`).

dep_ero indicates which pathway we are in, dep==0, ero==1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe clarify in this doc-string that the volume returned is always going to be positive (even in the case of erosion)?

Or, change the function to check dep_ero and return -1*np.minimum(Vp, fourth) in the erosion case. This might clean up future calls in erosion scenarios where this function call is always followed by Vp_change = Vp_change * -1 - although a clean solution for this that doesn't add another if-else check is not obvious to me...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good idea, I've update this docstring and another to make clear that the values are positive, and making them negative should be handled in the Router-specific implementation of _deposit_or_erode.

@amoodie amoodie closed this Feb 15, 2021
@amoodie amoodie reopened this Feb 15, 2021
@amoodie
Copy link
Member Author

amoodie commented Feb 15, 2021

Pretty amazing how what look like relatively small "incremental" changes, compounded to produce unrealistic behavior.

Yeah, I completely agree. The impact of a small change can compound and lead to drastic change in dynamics and planform.

To clarify/confirm my understanding of this change, this is done to prevent the surface from being raised too high above sea level? Presumably due to an artificially deep cell where the topography has been lowered, and not counting looped parcels would leave the free surface too high (by retaining the old value and not updating it, or averaging other non-looped parcel paths)? Otherwise in the finalize_free_surface, all water surfaces below sea level are set to sea level anyway.

Not exactly. There seems to be (still is, even with this PR merged) a set of issues that arise in the model when very few water parcels contribute to the free surface. If a parcel 1) loops before reaching the boundary, or 2) reaches max number of iterations before reaching the boundary, it is discounted from contributing to the free surface. This is because the downstream boundary condition is the distal sea level, so a parcel that does not reach sea level would not meet the condition for the boundary condition.

Sometimes, very few water parcels reach the boundary when there is a large depression (i.e., lake) on the delta. It seems intuitive that in the real world, the flow would move into this depression and fill it with sediment (i.e., some kind of avulsion). But, if the lake is inboard and does not have a clear and wide out to the model boundary, many if not all of the parcels that go into this lake end up looping and are discounted from the free surface. Water parcels that loop still contribute to the flow discharge field, and so sediment still reaches the depression. But, the free surface is not updated to reflect the large amount of water pouring into the lake, and so water keeps pouring into the lake, and few water parcels go anywhere else in the domain. As a result, the delta kind of "shuts off" while the lake is slowly filled with sediment and the flow eventually goes somewhere else.

My fix here is just to say that if the lake is at sea level, then allow the parcel to still count towards the free surface. This works, because the downstream boundary condition of assumed sea level is valid for these parcels. Parcels that loop, while they are at some elevation above sea level are still discounted from contributing to the free surface.

This doesn't completely fix the root issue though, which is that sometimes very few parcels reach the boundary and the flow field kind of dries up. This seems to happen in very large domains with very low sea-level rise (i.e., low accomondation). I think a better/additional fix to consider may be to 1) continue routing water parcels until Np_water parcels have actually reached the boundary (or ~75% of them anyway), or 2) somehow renormalize the parcels that did make it to the boundary to represent as though all Np_water parcels did reach the boundary. Things to consider...

Above the _get_water_weight_array() function there's commented out code specifying the expected input types in the @njit decorator. Would explicitly adding the types further speed up the jitted functions? If so I think we should open up an issue to remind us to implement that for all jitted functions sometime in the future.

As far as I understand it, adding the types does not speed up the actual computations, but it does speed up the actual just-in-time compilation. So instead of the first timestep taking 30 seconds, and then the next ones taking 1 second, the first could be reduced to ~5 seconds if the types are specified. This is nice, but it is definitely a confusing and intimidating block of code for someone new to python/numba who may want to implement their own methods/functions or modify our own. I think it's something to consider, and with the appropriate documentation, may be worthwhile. I'll open an issue.

In _accumulate_free_surface_walks(), where the dH values are set within the delta I think it could be good to either add a comment, or open an issue to indicate that the if-else could be further split up to incorporate water surface assignment based on a backwater profile. This is something I've never played with, but was present in the Matlab implementation, @ericbarefoot have you?

I've never experimented with this either, but I'll open an issue to consider it.

In the function get_start_indices() within shared_tools.py the function takes inlet_weights as an input argument. The conventional (as implemented) pyDeltaRCM model always passes in an array of 1s here, so is it worth keeping that input argument? I can see the case for it being more "flexible" for custom applications...

Yeah, I guess I'm inclined to just leave it for now. It's kind of an "undocumented feature"...

@elbeejay
Copy link
Member

My fix here is just to say that if the lake is at sea level, then allow the parcel to still count towards the free surface. This works, because the downstream boundary condition of assumed sea level is valid for these parcels. Parcels that loop, while they are at some elevation above sea level are still discounted from contributing to the free surface.

This doesn't completely fix the root issue though, which is that sometimes very few parcels reach the boundary and the flow field kind of dries up. This seems to happen in very large domains with very low sea-level rise (i.e., low accomondation). I think a better/additional fix to consider may be to 1) continue routing water parcels until Np_water parcels have actually reached the boundary (or ~75% of them anyway), or 2) somehow renormalize the parcels that did make it to the boundary to represent as though all Np_water parcels did reach the boundary. Things to consider...

Thanks, this explanation makes sense. I agree that we need to think through whatever the alteration/fix is going to be, both of those ideas are interesting, I don't know which is more promising.

Copy link
Collaborator

@ericbarefoot ericbarefoot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am very happy with the new weighting methods and the new loop-checking procedure. Thanks for doing this careful detective work!

'Please report error.')

# correct the weights for random choice
if np.nansum(weight) == 0:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So just to check, if there's a check to no longer have NaNs in the weighting arrays, should we just use regular np.sum()? I think it's wildly faster actually.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, good call. I made this change in the water weighting, but I guess I forgot to switch it for the sediments!

@@ -330,14 +383,49 @@ def _compute_Vp_ero(self, Vp_sed, U_loc, U_ero, beta):
return (Vp_sed * (U_loc**beta - U_ero**beta) /
U_ero**beta)

def _limit_Vp_change(self, Vp, stage, eta, dx):
def _limit_Vp_change(self, Vp, stage, eta, dx, dep_ero):
"""Limit change in volume to 1/4 of a cell volume.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I seem to have forgotten why this was implemented. It makes sense, but can you quickly re-explain why 1/4th was chosen? Sorryyyyy

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure there's an explanation for why 0.25 specifically, but it does come from the original Matlab codes. The limiting is implemented to help numerical stability, so that large bed elevation changes cannot happen all at once. I guess 0.25 was just a reasonable choice.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. thank you thank you. I didn't remember it even in the original matlab either, but it seems reasonable, like I said.

@@ -93,6 +86,32 @@ def custom_ravel(tup, shape):
return x + y


@njit
def custom_pad(arr):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an excellent addition.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 it's fast and easy!

@amoodie
Copy link
Member Author

amoodie commented Feb 18, 2021

alright cool, well thanks for the review Jay and Eric. Merging away

@amoodie amoodie merged commit 586148b into DeltaRCM:develop Feb 18, 2021
@ericbarefoot
Copy link
Collaborator

@elbeejay to your question, I haven't messed around with the gradually varied flow portion of the Matlab. I have no idea how it works.

amoodie added a commit to amoodie/pyDeltaRCM that referenced this pull request Feb 25, 2021
@amoodie amoodie deleted the stabilize_smooth_weights branch April 28, 2022 22:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working enhancement New feature or request
Projects
None yet
4 participants