Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FSD bug fixes #424

Merged
merged 11 commits into from
Mar 15, 2023
Merged

FSD bug fixes #424

merged 11 commits into from
Mar 15, 2023

Conversation

dabail10
Copy link
Contributor

@dabail10 dabail10 commented Jan 11, 2023

For detailed information about submitting Pull Requests (PRs) to the CICE-Consortium,
please refer to: https://github.com/CICE-Consortium/About-Us/wiki/Resource-Index#information-for-developers

PR checklist

  • Short (1 sentence) summary of your PR:
    These are the latest FSD bug fixes from @cmbitz
  • Developer(s):
    @cmbitz (C. Bitz)
  • Suggest PR reviewers from list in the column to the right.
  • Please copy the PR test results link or provide a summary of testing completed below
    https://github.com/CICE-Consortium/Test-Results/wiki/icepack_by_hash_forks#828bdc29d9341fd1b03861815e56b55718598d06
  • How much do the PR code changes differ from the unmodified code?
    • bit for bit
    • different at roundoff level
    • more substantial
  • Does this PR create or have dependencies on CICE or any other models?
    • Yes
    • No
  • Does this PR add any new test cases?
    • Yes
    • No
  • Is the documentation being updated? ("Documentation" includes information on the wiki or in the .rst files from doc/source/, which are used to create the online technical docs at https://readthedocs.org/projects/cice-consortium-cice/.)
    • Yes
    • No, does the documentation need to be updated at a later time?
      • Yes
      • No
  • Please provide any additional information or relevant details below:
    These are answer changing bug fixes that impact only the FSD cases (tr_fsd = .true.) There is also a CICE tag to come with this.

Copy link
Contributor

@eclare108213 eclare108213 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome! Thanks for fixing these issues.

@@ -276,6 +276,7 @@ module icedrv_flux
real (kind=dbl_kind), dimension (nx), public :: &
rside , & ! fraction of ice that melts laterally
fside , & ! lateral heat flux (W/m^2)
wlat , & ! lateral heat rate (m/s)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment should be lateral melt rate

@eclare108213
Copy link
Contributor

@cmbitz @lettie-roach
This is great. Before I approve the PR, I'd like to see some sort of documentation of exactly what this does, both in terms of the code/physics and in terms of the simulation. For the latter, do you have some before-and-after model output that could be posted here? In particular, do you have a sense of how big the differences are, e.g. are they "climate changing"? Maybe we should run a QC test in CICE.

Regarding the nature of the code/physics changes, please, would one of you put a comment here in this PR (or maybe in the template at the top) that describes what you did? It looks like an important piece is using the lateral melting rate directly rather than fside, which is somehow incorrect (how?). Are there other changes beyond that, which contribute to the non-bit-for-bit-ness? E.g. I'm not sure what's happening with the welding. No need for a book, just a few sentences to clarify what the problem is that you're fixing. Thank you!

@cmbitz
Copy link
Contributor

cmbitz commented Jan 11, 2023

I hope I've attached a figure showing the consequences to the IcePack ice thickness distribution (aicen) for a 5 category simulation, SST, and fhocn. These simulations have a slab ocean. I can't upload netcdf files (85MB) here but am happy to provide them, or can provide a log file if you like.

I will attempt to test in CICE next. It will take me a few days though.

FSD_bugfixes_vs_original

@cmbitz
Copy link
Contributor

cmbitz commented Jan 11, 2023

The changes only affect the FSD (when tr_fsd = .true.). The gist of it is that the original code attempted to estimate the fside (the lateral heat flux) in subroutine frzmlt_bottom_lateral, just like is done for the non FSD code. However, there was an error, where fside was computed as a function of the enthalpy sum across all layers and categories rather than the mean. The variable fside computed in frzmlt_bottom_lateral is unique for each thickness category, but in fact fside should be unique for each floe bin too. So even if the bug were fixed, it would be an imprecise solution. The variables fside eventually makes its way to subroutine lateral_melt where fside is used to reduce the sea ice concentration.

Because frzmlt_bottom_lateral doesn't know about the FSD, it made more sense to me to redesign this code to ONLY compute wlat (lateral melt rate) and send wlat to lateral_melt, where we can compute rside and fside uniquely for each FSD bin and thickness category in the FSD.

There is also a very minor bug fix to the welding scheme, where a factor of 1/2 was present for a very minor part of the calculation, where it should have been a 1. Finally there was a line out of order a bit later in the weld scheme that effectively left out the intended "FSD clean-up". These two tiny modifications had a pretty minimal impact on Icepack. We recommend them though since the code now matches the equations, and it naturally has total loss = total gain from the numerics, while previously this constraint had to be imposed.

@eclare108213
Copy link
Contributor

Thank you for the explanation - this makes a lot of sense. If you have trouble setting up and running the QC test, let us know. It's not obvious to me whether this is a climate-changing alteration, so I'm curious to see the QC outcome (comparing before and after the code mods, both cases with FSD turned on). The SST difference looks big!

@dabail10
Copy link
Contributor Author

I have added the icepack warnings calls to replace the print statements. Let me know if you see problems.

Copy link
Contributor

@eclare108213 eclare108213 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only one minor change. Before approving and merging, I would like to see the QC test. CC let us know if you need help with that.

Because frzmlt_bottom_lateral doesn't know about the FSD, it made more sense to me to redesign this code to ONLY compute wlat (lateral melt rate) and send wlat to lateral_melt, where we can compute rside and fside uniquely for each FSD bin and thickness category in the FSD.

Does this mean that fside no longer needs to be sent out from frzmlt_bottom_lateral (maybe for non-FSD runs?) and/or from lateral_melt (maybe for history)?

@@ -2012,7 +2015,7 @@ subroutine icepack_step_therm2 (dt, ncat, nltrcr, &
Tf , & ! freezing temperature (C)
sss , & ! sea surface salinity (ppt)
rside , & ! fraction of ice that melts laterally
fside , & ! lateral heat flux (W/m^2)
wlat , & ! lateral melt rate
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add units (m/s)

@lettie-roach
Copy link
Contributor

Does this mean that fside no longer needs to be sent out from frzmlt_bottom_lateral (maybe for non-FSD runs?) and/or from lateral_melt (maybe for history)?

I think that's right, we could remove it, unless we want to add it to the history file? It's not currently in the history file, but could be useful to have in there

@apcraig
Copy link
Contributor

apcraig commented Jan 30, 2023

Are we still waiting for a QC test? Could I help with that?

@dabail10
Copy link
Contributor Author

CC is still working on the CICE changes and then I am happy to run the QC test.

@dabail10
Copy link
Contributor Author

Why is this a conflict?

@apcraig
Copy link
Contributor

apcraig commented Jan 31, 2023

Your recent changes didn't create the conflict, they were there before. Must be associated with a recent PR to main.

@dabail10
Copy link
Contributor Author

Here are the plots and results from the QC test:

(npl) [dbailey@cheyenne3 CICE_CC]> ./cice.t-test.py /glade/scratch/dbailey/CICE_RUNS/cheyenne_intel_smoke_gx3_40x1_fsd12ww3_medium_qc.fsdmods /glade/scratch/dbailey/CICE_RUNS/cheyenne_intel_smoke_gx3_40x1_fsd12ww3_medium_qc.fsdbase
INFO:main:Running QC test on the following directories:
INFO:main: /glade/scratch/dbailey/CICE_RUNS/cheyenne_intel_smoke_gx3_40x1_fsd12ww3_medium_qc.fsdmods
INFO:main: /glade/scratch/dbailey/CICE_RUNS/cheyenne_intel_smoke_gx3_40x1_fsd12ww3_medium_qc.fsdbase
INFO:main:Number of files: 1825
INFO:main:2 Stage Test Passed
INFO:main:Quadratic Skill Test Failed for Northern Hemisphere
INFO:main:Quadratic Skill Test Passed for Southern Hemisphere
QStandardPaths: XDG_RUNTIME_DIR not set, defaulting to '/glade/scratch/dbailey/runtime-dbailey'
INFO:main:Creating map of the data (ice_thickness_cheyenne_intel_smoke_gx3_40x1_fsd12ww3_medium_qc.fsdmod.png)
INFO:main:Creating map of the data (ice_thickness_cheyenne_intel_smoke_gx3_40x1_fsd12ww3_medium_qc.fsdbase.png)
INFO:main:Creating map of the data (ice_thickness_cheyenne_intel_smoke_gx3_40x1_fsd12ww3_medium_qc.fsdmod_minus_cheyenne_intel_smoke_gx3_40x1_fsd12ww3_medium_qc.fsdbase.png)
INFO:main:
ERROR:main:Quality Control Test FAILED

ice_thickness_cheyenne_intel_smoke_gx3_40x1_fsd12ww3_medium_qc fsdbase
ice_thickness_cheyenne_intel_smoke_gx3_40x1_fsd12ww3_medium_qc fsdmod
ice_thickness_cheyenne_intel_smoke_gx3_40x1_fsd12ww3_medium_qc fsdmod_minus_cheyenne_intel_smoke_gx3_40x1_fsd12ww3_medium_qc fsdbase

@dabail10
Copy link
Contributor Author

Not sure why that was a conflict, but I have fixed it now.

@dabail10
Copy link
Contributor Author

There was some spacing changes in main I guess. I think I have resolved this?

@eclare108213
Copy link
Contributor

I think the plan here is to implement these changes (or some of them) so that they are backward compatible, since the new approach isn't completely nailed down yet. Does that make sense in this case, or is this PR a bug that needs to be fixed regardless?

@dabail10
Copy link
Contributor Author

I just updated this so that if tr_fsd is .false. then it is backwardly compatible.

@apcraig
Copy link
Contributor

apcraig commented Feb 15, 2023

@dabail10, I think more needs to be done. For instance, sss and wlat both need to become optional. And all the old code associated with sss needs to come back. And there needs to be a check if sss or wlat is passed (only one should be), etc. That should trigger the calculation for now. In general, that is NOT how we want this to work. If we want to support both sss and wlat longer term, we need a flag to choose the option. But for now, I think we can allow a check of arguments to see which calculation is done.

@eclare108213
Copy link
Contributor

Okay, I see there's some confusion here about what needs to be backwards compatible. If tr_fsd is false, then definitely it needs to be backwards compatible for that case, but if tr_fsd is true, should these changes be considered a bug fix or should we keep the previous approach around for the time being, until the new one is confirmed as working? The confusion may be all mine -- I'd like to better understand what is changing and why. I think CC said there were some changes that are BFB, others not, and bug fixes in the mix. Not sure how all that maps out in Icepack and CICE.

@dabail10
Copy link
Contributor Author

I am definitely confused as well! I can certainly make it completely backwards compatible, but I was thinking that it was going to be answer changing when tr_fsd = .true.

@apcraig
Copy link
Contributor

apcraig commented Feb 26, 2023

I think the only answer changes are when tr_fsd=T, correct?

I thought the concern was that this changed the public icepack interfaces (sss->wlat). But upon closer inspection, that doesn't seem to be the case. sss->wlat is changed in lateral_melt (internal subroutine). And wlat is ADDED to icepack_step_therm[1,2] (public subroutines) and frzmlt_bottom_lateral (internal subroutine). So I believe this is fully backwards compatible except when fsd is turned on.

This following block checks the fsd changes for the new wlat argument in icepack_step_therm[1,2], so that's great. And it's in the icepack_chkoptargflag if-block so that's perfect.

          if (tr_fsd) then
             if (.not.(present(wlat))) then
                call icepack_warnings_add(subname//' error in FSD arguments, tr_fsd=T')
                call icepack_warnings_setabort(.true.,__FILE__,__LINE__)
                return
             endif
          endif

I think this should be backwards compatible except for fsd runs. But we need to change CICE to be able to test fsd there. But users in general don't need to change their code unless they want to run with fsd.

I guess the slightly bigger question is whether fsd will need to change again. We do have other options including

  • if tr_fsd=T and wlat is not passed in, use the sss implementation. That would make fsd backwards compatible, but results may not be very good.
  • wait to implement the wlat argument until the new fsd implementation is settled. But this delays a fix we know improves fsd.

The CICE update, CICE-Consortium/CICE#813, adds wlat to the icepack_step_therm[1,2] call. And that looks fine. It's intent(out) in therm1 and intent(in) in therm2. wlat is unset when passed into therm1 the first time, I wonder if we should set wlat(:)=c0 in CICE at initialization? I'll add that comment in CICE-Consortium/CICE#813.

I think what we have here is quite reasonable and that we should merge this PR. Was testing done on the most recent version?

@apcraig
Copy link
Contributor

apcraig commented Feb 27, 2023

I ran a full test suite of this version of Icepack and CICE-Consortium/CICE#813 plus this version of Icepack. All tests pass. The fsd results change answers in both Icepack and CICE, as expected. However, the snwgrain tests also changed answer in CICE. You can actually see this in the original tests done here, see "cheyenne intel restart gx3 8x2 icdefault snwgrain snwitdrdg" in https://github.com/CICE-Consortium/Test-Results/wiki/8d8a2d7db8.cheyenne.intel.23-02-15.162343.0.

I think we need to understand why snwgrain is changing answers in CICE with these mods and document that as well.

@apcraig
Copy link
Contributor

apcraig commented Feb 27, 2023

@dabail10
Copy link
Contributor Author

wlat is initialized to c0 in frzmlt_bottom_lateral where it is intent(out). I think this covers it. I will let @cmbitz chime in terms of future potential changes here.

@dabail10
Copy link
Contributor Author

dabail10 commented Mar 3, 2023

I'm not sure I understand this. Are the answers different with tr_fsd = .true. and snwgrain = .true.? Or just when snwgrain = .true.? Is it possible that I reverted some of the snwgrain changes from the recent commit?

@apcraig
Copy link
Contributor

apcraig commented Mar 3, 2023

@dabail10, The snwgrain results change answers without fsd. I also was wondering if snwgrain changes were reverted but I don't think they were. That's the first thing I checked, but maybe we should look closer. I think we need to understand this before we merge.

@dabail10
Copy link
Contributor Author

dabail10 commented Mar 3, 2023

There was a commit with the change in the optional arguments with rsnow. Did this change answers? If so, this is not in the FSDmods version.

@dabail10
Copy link
Contributor Author

dabail10 commented Mar 3, 2023

Maybe I should fast-forward the FSDmods branch?

@apcraig
Copy link
Contributor

apcraig commented Mar 3, 2023

I don't think it's the most recent merges either. I tested before that and the original test results from 2 months ago also show the same. If you look at the failures in the original test suite, it shows the snwgrain changes answers.

@apcraig
Copy link
Contributor

apcraig commented Mar 3, 2023

Just to clarify, not 2 months ago, 2 weeks ago, these test results, https://github.com/CICE-Consortium/Test-Results/wiki/8d8a2d7db8.cheyenne.intel.23-02-15.162343.0.

@apcraig
Copy link
Contributor

apcraig commented Mar 13, 2023

I think I understand why the snow results change answers, the Consortium CICE version we're comparing to is a bit behind with respect to Icepack. I will put in a PR to update Icepack in CICE then restest these changes again against the latest CICE main.

@apcraig
Copy link
Contributor

apcraig commented Mar 14, 2023

I ran a full test suite of these changes in CICE against the current CICE version with Icepack fully updated and everything is bit-for-bit except the fsd changes. So now we are getting the results we expected.

7641 measured results of 7641 total results
7530 of 7641 tests PASSED
9 of 7641 tests PENDING
0 of 7641 tests MISSING data
102 of 7641 tests FAILED

I think everything is fine and this is ready to merge. Unless anyone disagrees, I'll merge this today or tomorrow. Then we'll want to update the CICE fsd PR to include the latest Icepack and merge that PR as well. @eclare108213 @dabail10 @cmbitz, just speak up if we should delay further. Thanks.

@dabail10
Copy link
Contributor Author

This is great news. I would like to see this go in asap.

Dave

Copy link
Contributor

@eclare108213 eclare108213 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think all of the outstanding questions for this PR have been answered, including backwards compatibility (it is for tr_fsd=F, not for tr_fsd=T) and QC when tr_fsd=T (results are significantly different). This PR includes bug fixes that do need to be merged.

@apcraig apcraig merged commit a4779cc into CICE-Consortium:main Mar 15, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants