-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Stabilize / smooth / weights #133
Conversation
…ography smoothing, all to try and resolve stability issues and broadcasting bugs.
…from original Matlab approach. Still, the order of operations will remain changed.
…rather than passing padded arrays
…o any non-wall cell, this is consistent with the Matlab implementation (mostly). Matlab implementation tries 5 times to find a wet cell to route into, if it fails, just uses the fifth try as next step (so a dry cell). Here, we look for any wet, if they exist, use on of those, otherwise choose one of the dry ones randomly. I believe the effect of this is similar, though the implementations differ.
…reformat some code, update the smooth surface function to use any non-edge cell in conditional, rather than any non-land cell.
…ys (looped and free surface flag) to be updated. Free surf flag now begins iteration with all parcels being valid, and invalidates them once they have been found to be looping. Parcels that reach the boundary impart an exit flux on the final cell.
…y check to the sediment router.
3411b5b
to
6a716d0
Compare
…e. Also, change dry depth equality.
…nal matlab implementation, but not an exact copy. This implementation is also considerably slower than the previous implementation (but needed to be fixed). Maybe there is some way to improve the speed here. This commit also includes changes to 32bit arrays for most of the properties of the delta. This might help speed up computation a bit, but will make it simpler to track for jitted objects. Other significant code style clean up.
@ericbarefoot @elbeejay I requested your reviews on this. I don't expect you to check every detail, but maybe you can check out the changes in the This PR is a significant change in the model, including some reversions to the original Matlab approach. It will hopefully fix a lot of our issues / bugs. If we want to, we could do a release just before merging this, to tag that point in time, in case we need to revert. We could then make another release immediately following the merge. Let me know your thoughts, thanks! |
Codecov Report
@@ Coverage Diff @@
## develop #133 +/- ##
===========================================
- Coverage 81.73% 78.32% -3.41%
===========================================
Files 11 11
Lines 2152 2275 +123
===========================================
+ Hits 1759 1782 +23
- Misses 393 493 +100
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall I think this PR is quite comprehensive, and the code structure/comments made it fairly straightforward to follow. I've split my more general questions/comments up by category (water, sed, shared). Core model code looks good to me (I didn't check over the unit tests, but am assuming those are okay). Pretty amazing how what look like relatively small "incremental" changes, compounded to produce unrealistic behavior.
water routing
- changes the behavior of _check_for_loops to now relocate looped parcels (same as before), but only disqualify the parcel from contributing to the free surface if that parcel looped at a stage > H_SL.
To clarify/confirm my understanding of this change, this is done to prevent the surface from being raised too high above sea level? Presumably due to an artificially deep cell where the topography has been lowered, and not counting looped parcels would leave the free surface too high (by retaining the old value and not updating it, or averaging other non-looped parcel paths)? Otherwise in the finalize_free_surface
, all water surfaces below sea level are set to sea level anyway.
Above the _get_water_weight_array()
function there's commented out code specifying the expected input types in the @njit
decorator. Would explicitly adding the types further speed up the jitted functions? If so I think we should open up an issue to remind us to implement that for all jitted functions sometime in the future.
In _accumulate_free_surface_walks()
, where the dH
values are set within the delta I think it could be good to either add a comment, or open an issue to indicate that the if-else
could be further split up to incorporate water surface assignment based on a backwater profile. This is something I've never played with, but was present in the Matlab implementation, @ericbarefoot have you?
sed routing
Just a comment, but I agree with your switch from 64-bit floats to 32-bit floats. This model isn't mass-conservative, so I don't think we lose anything by moving to the smaller float.
Rest of the sediment routing looks good to me, the gamma
catch was a good one. I think I had one minor comment inline for this section.
shared
In the function get_start_indices()
within shared_tools.py
the function takes inlet_weights
as an input argument. The conventional (as implemented) pyDeltaRCM model always passes in an array of 1s here, so is it worth keeping that input argument? I can see the case for it being more "flexible" for custom applications...
has_repeat_ind = False | ||
for _, iind in enumerate(relv_walk): | ||
if iind == new_ind: | ||
has_repeat_ind = True | ||
break |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need to speed-check this, but a less-loopy, numba-compatible check for this could be:
# determine if has a repeat ind
_dup = np.where(new_ind == relv_walk)
if np.shape(_dup)[1] > 0:
has_repeat_ind = True
else:
has_repeat_ind = False
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the following gives the same result (passes consistency tests), but is slightly slower.
_dup = np.where(new_ind == relv_walk)[0]
if _dup.size > 0:
has_repeat_ind = True
else:
has_repeat_ind = False
meth | ncalls | tot | percall |
---|---|---|---|
current implementation | 6533 | 9.109 | 0.001 |
above implementation | 6533 | 9.449 | 0.001 |
I think the current implementation is slightly more readable and is marginally faster. Generally speaking, numba can do really well to optimize loops.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense, thanks for doing the comparison. Just figured it was worth considering
weight = weight / np.sum(weight) | ||
|
||
elif np.sum(weight) == 0: | ||
# if all weights are zeros/nan |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the event of a nan value, the np.sum(weight)
will evaluate to nan
I think, so would be good to revise this comment and update the error message for the else
case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I will revise the comment and add a check for nans in this set of if-then
s. FWIW I removed the "safety" of the np.nansum()
because 1) it made it hard to read the logic with 0
s and np.nan
s, and 2) the nansum
is way slower than the sum
.
"""Limit change in volume to 1/4 of a cell volume. | ||
|
||
Function is used by multiple pathways in `mud_dep_ero` and `sand_dep_ero` | ||
but with different inputs for the sediment volume (:obj:`Vp`). | ||
|
||
dep_ero indicates which pathway we are in, dep==0, ero==1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe clarify in this doc-string that the volume returned is always going to be positive (even in the case of erosion)?
Or, change the function to check dep_ero
and return -1*np.minimum(Vp, fourth)
in the erosion case. This might clean up future calls in erosion scenarios where this function call is always followed by Vp_change = Vp_change * -1
- although a clean solution for this that doesn't add another if-else
check is not obvious to me...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good idea, I've update this docstring and another to make clear that the values are positive, and making them negative should be handled in the Router-specific implementation of _deposit_or_erode
.
Yeah, I completely agree. The impact of a small change can compound and lead to drastic change in dynamics and planform.
Not exactly. There seems to be (still is, even with this PR merged) a set of issues that arise in the model when very few water parcels contribute to the free surface. If a parcel 1) loops before reaching the boundary, or 2) reaches max number of iterations before reaching the boundary, it is discounted from contributing to the free surface. This is because the downstream boundary condition is the distal sea level, so a parcel that does not reach sea level would not meet the condition for the boundary condition. Sometimes, very few water parcels reach the boundary when there is a large depression (i.e., lake) on the delta. It seems intuitive that in the real world, the flow would move into this depression and fill it with sediment (i.e., some kind of avulsion). But, if the lake is inboard and does not have a clear and wide out to the model boundary, many if not all of the parcels that go into this lake end up looping and are discounted from the free surface. Water parcels that loop still contribute to the flow discharge field, and so sediment still reaches the depression. But, the free surface is not updated to reflect the large amount of water pouring into the lake, and so water keeps pouring into the lake, and few water parcels go anywhere else in the domain. As a result, the delta kind of "shuts off" while the lake is slowly filled with sediment and the flow eventually goes somewhere else. My fix here is just to say that if the lake is at sea level, then allow the parcel to still count towards the free surface. This works, because the downstream boundary condition of assumed sea level is valid for these parcels. Parcels that loop, while they are at some elevation above sea level are still discounted from contributing to the free surface. This doesn't completely fix the root issue though, which is that sometimes very few parcels reach the boundary and the flow field kind of dries up. This seems to happen in very large domains with very low sea-level rise (i.e., low accomondation). I think a better/additional fix to consider may be to 1) continue routing water parcels until
As far as I understand it, adding the types does not speed up the actual computations, but it does speed up the actual just-in-time compilation. So instead of the first timestep taking 30 seconds, and then the next ones taking 1 second, the first could be reduced to ~5 seconds if the types are specified. This is nice, but it is definitely a confusing and intimidating block of code for someone new to python/numba who may want to implement their own methods/functions or modify our own. I think it's something to consider, and with the appropriate documentation, may be worthwhile. I'll open an issue.
I've never experimented with this either, but I'll open an issue to consider it.
Yeah, I guess I'm inclined to just leave it for now. It's kind of an "undocumented feature"... |
Thanks, this explanation makes sense. I agree that we need to think through whatever the alteration/fix is going to be, both of those ideas are interesting, I don't know which is more promising. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am very happy with the new weighting methods and the new loop-checking procedure. Thanks for doing this careful detective work!
pyDeltaRCM/sed_tools.py
Outdated
'Please report error.') | ||
|
||
# correct the weights for random choice | ||
if np.nansum(weight) == 0: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So just to check, if there's a check to no longer have NaNs in the weighting arrays, should we just use regular np.sum()
? I think it's wildly faster actually.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, good call. I made this change in the water weighting, but I guess I forgot to switch it for the sediments!
@@ -330,14 +383,49 @@ def _compute_Vp_ero(self, Vp_sed, U_loc, U_ero, beta): | |||
return (Vp_sed * (U_loc**beta - U_ero**beta) / | |||
U_ero**beta) | |||
|
|||
def _limit_Vp_change(self, Vp, stage, eta, dx): | |||
def _limit_Vp_change(self, Vp, stage, eta, dx, dep_ero): | |||
"""Limit change in volume to 1/4 of a cell volume. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I seem to have forgotten why this was implemented. It makes sense, but can you quickly re-explain why 1/4th was chosen? Sorryyyyy
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure there's an explanation for why 0.25 specifically, but it does come from the original Matlab codes. The limiting is implemented to help numerical stability, so that large bed elevation changes cannot happen all at once. I guess 0.25 was just a reasonable choice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting. thank you thank you. I didn't remember it even in the original matlab either, but it seems reasonable, like I said.
@@ -93,6 +86,32 @@ def custom_ravel(tup, shape): | |||
return x + y | |||
|
|||
|
|||
@njit | |||
def custom_pad(arr): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an excellent addition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 it's fast and easy!
ff4de5c
to
5dd9fce
Compare
alright cool, well thanks for the review Jay and Eric. Merging away |
@elbeejay to your question, I haven't messed around with the gradually varied flow portion of the Matlab. I have no idea how it works. |
Stabilize / smooth / weights
summary
This PR changes internal model behavior. In particular, I have unwound several changes in the model (either by us or inherited) with respect to the original Matlab code.
In particular, this PR:
_check_for_loops
to now relocate looped parcels (same as before), but only disqualify the parcel from contributing to the free surface if that parcel looped at astage > H_SL
.There are also a number of refactoring changes, which hopefully make the water routing more intuitive and less likely to break on user changes.
changes
weighting
smoothing
eta
is subtle, but apparent.More extreme, is the difference in the stage accumulated over the duration of a run between two runs in the old (left) vs new code (right).
topographic smoothing implemented by kernel convolution produced a weaker, albeit very similar effect, to the original smoothing implementation.This was wrong. I may have made a mistake when I was comparing the original implementation with the kernel implementation. It seems the results are nearly identical (0.001 m difference). I'm going to roll back this change. I.e., the topo diffusion code is the same, but the order is changed (as above).qualitative checks
visual
This PR is a breaking change to the consistent checks. Therefore, I have conducted a large number of longer simulations to highlight the improved qualities of the model, as a qualitative check.
clearly broken case
This was a condition Jay did some runs under, which produced a garbage result.
old code
new code
long high slr rate
In these runs, the sea level rise rate is ~6 mm/yr. Below is a set of four runs with identical boundary conditions. Each column is just an ensemble with different seeds.
old code
new code
two cases, muddy/sandy
These are the same parameters I used for simulations in my AGU presentation, so it is good to verify the condition set still produces good results.
muddy
sandy
f_bedload: 0.4
f_bedload: 0.75
docs bug
Additionally, this PR resolves the issues with the documentation (I think). As a result of the changes in sediment routing, the bug encountered in #107 #75 is no longer an issue.
Anecdotally, I have run the documentation building process locally about a dozen times and not encountered the issue at all.
Note, that this appears different than before, because default parameters have changed.
submerge: water refactoring
In a separate branch, I refactored water routing on top of these changes. This has been added to this PR, for simplicity, but below I document that the refactor is an identical refactor.
new without refactored water
new with refactored water
closes #30 #107 #75 #9 #48