Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bugfixes for ocean inactive top cells feature #5246

Merged
merged 5 commits into from
Nov 8, 2022

Conversation

cbegeman
Copy link
Contributor

@cbegeman cbegeman commented Oct 20, 2022

This PR fixes 2 bugs that affect the inactive top cells feature (i.e., when minLevelCell > 1). One in ocn_diagnostic_solve_ssh and the other affects the sea ice salinity flux in ocn_surface_bulk_forcing.
No changes in the solution should occur when there are no inactive top cells.

This PR also includes a cleanup of Redi. The original clunky workaround was designed to avoid non-bfb changes in BGC tracers as discussed here, and this revision introduces those potentially non-bfb changes as @maltrud determined that they are insignificant.

Fixes #5245
[BFB]

@xylar xylar added BFB PR leaves answers BFB non-BFB PR makes roundoff changes to answers. and removed BFB PR leaves answers BFB labels Oct 20, 2022
@rljacob
Copy link
Member

rljacob commented Oct 20, 2022

Can you say more about which cases this is non-BFB? Will a typical water cycle coupled case be affected?

@cbegeman
Copy link
Contributor Author

The non-BFB changes are specifically in ocean BGC tracers. So I believe a typical water cycle case with BGC tracers would be non-BFB with respect to those tracers. @maltrud would you like to weigh in here?

Copy link
Contributor

@philipwjones philipwjones left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally looks ok to me. I can't comment on the correctness of the ALE change. I generally prefer to define a (private) integer scalar kmin rather than using the minLevel array directly as an index, especially if there are multiple references. On L238,239 of the Redi routine, could that construct be: k = max(1,minLevelEdgeBot(iEdge)) ? These are optional changes though.

@mark-petersen
Copy link
Contributor

@cbegeman I tested this with gnu and intel on badger, both optimized:

  1) cmake/3.19.2   2) gcc/9.3.0   3) mvapich2/2.3   4) mkl/2020.0.4
  1) cmake/3.19.2   2) intel/19.0.4   3) intel-mpi/2019.4   4) mkl/2019.0.4

and this PR passes the nightly test suite, but fails comparisons with the master branch point on all test cases. Differences are all 1e-12 or less. This was not expected, I believe. Can you repeat that test to verify this result?

@cbegeman cbegeman force-pushed the ocn/bugfix-inactive-top-cells branch from 38431f9 to 719a98f Compare October 26, 2022 22:04
@cbegeman
Copy link
Contributor Author

@philipwjones Thanks for your suggestions for the Redi routine. I tried implementing them but I couldn't manage BFB results on all the compass PR test cases so I didn't include them here.

@cbegeman
Copy link
Contributor Author

@mark-petersen I removed the z-star ALE changes from this PR. I tried everything I could think of (after finding one needed change) and couldn't get BFB results on all the compass PR-suite test cases. I'll leave it for a separate non-BFB PR.

@cbegeman
Copy link
Contributor Author

@jonbob This PR is ready for your testing. Let me know how I can help out!

@jonbob
Copy link
Contributor

jonbob commented Oct 31, 2022

@mark-petersen and @xylar - can you please complete reviews and sign off?

Copy link
Contributor

@xylar xylar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm approving based on an inspection of the code and a run of the pr test suite on Chyrsalis with Intel and Intel-MPI. I use master as a baseline and did a test merge of this branch with master.

All pr tests passed except that ocean/baroclinic_channel/10km/decomp_test is failing for both the baseline and this branch (a known issue: #5219)

@rljacob
Copy link
Member

rljacob commented Nov 3, 2022

@golaz this might change answers for wcycle cases.

@xylar
Copy link
Contributor

xylar commented Nov 3, 2022

this might change answers for wcycle cases.

I could be wrong but I believe that's no longer true since part of this PR was moved to #5254. @jonbob, have you done E3SM testing since then? I assume not since you're waiting on @mark-petersen's review.

@jonbob
Copy link
Contributor

jonbob commented Nov 3, 2022

@xylar -- I can do some of the testing now, just to see if it is BFB with overnight testing. If it's not, we'll have to make some longer runs and compare

@jonbob
Copy link
Contributor

jonbob commented Nov 3, 2022

@cbegeman - did you intend this PR to no longer be non-BFB? Preliminary testing shows BFB results for:

  • SMS_D_Ld1.ne30pg2_EC30to60E2r2.WCYCL1850.chrysalis_intel.allactive-wcprod
  • ERS.ne11_oQU240.WCYCL1850NS.chrysalis_intel
  • SMS_D_Ld3.T62_oQU120.CMPASO-IAF.chrysalis_intel
  • SMS_Ld2.ne30pg2_r05_EC30to60E2r2.BGCEXP_CNTL_CNPECACNT_1850.chrysalis_intel.elm-bgcexp

The last one is a BGC test, which I thought was supposed to change answers?

@cbegeman
Copy link
Contributor Author

cbegeman commented Nov 3, 2022

@jonbob Thanks for doing that testing.

And apologies for the BFB confusion. In a previous version of master, this commit 15c8d70 did result in non-BFB changes in BGC tests, but I found that the standalone global ocean BGC test is BFB now. I didn't want to make any promises for E3SM tests but it's not totally surprising to me that these are also coming back BFB.

In the first batch of inactive top cell changes, there were some instances of compiler-dependence in whether it was BFB. So far, you, @xylar and I have all been using chrys with intel, so it might be worth checking a BGC test with another compiler. Does such a test exist for E3SM besides the one you already ran?

@jonbob
Copy link
Contributor

jonbob commented Nov 3, 2022

@cbegeman -- I also ran a test on compy with pgi and it was BFB as well:

  • ERS.ne30pg2_r05_EC30to60E2r2.GPMPAS-JRA.compy_pgi.mosart-rof_ocn_2way

@cbegeman
Copy link
Contributor Author

cbegeman commented Nov 3, 2022

@jonbob That's great!

@jonbob jonbob added BFB PR leaves answers BFB and removed non-BFB PR makes roundoff changes to answers. labels Nov 4, 2022
@jonbob
Copy link
Contributor

jonbob commented Nov 4, 2022

@cbegeman - I changed the description and labels to make this a BFB PR. Once @mark-petersen completes his review we'll merge it and see how it does on all the testing platforms

@cbegeman
Copy link
Contributor Author

cbegeman commented Nov 4, 2022

@jonbob That sounds great to me. Thank you!

@mark-petersen mark-petersen self-requested a review November 7, 2022 20:18
Copy link
Contributor

@mark-petersen mark-petersen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving based on testing by @cbegeman @xylar and @jonbob above. Thanks!

jonbob added a commit that referenced this pull request Nov 7, 2022
…5246)

This PR fixes 2 bugs that affect the inactive top cells feature (i.e.,
when minLevelCell > 1). One in ocn_diagnostic_solve_ssh and the other
affects the sea ice salinity flux in ocn_surface_bulk_forcing. No
changes in the solution should occur when there are no inactive top
cells.

This PR also includes a cleanup of Redi. The original clunky workaround
was designed to avoid non-bfb changes in BGC tracers, and this revision
introduces those potentially non-bfb changes as @maltrud determined that
they are insignificant.

Fixes #5245
[BFB]
@jonbob
Copy link
Contributor

jonbob commented Nov 7, 2022

passes sanity testing and previous testing indicated BFB, merged to next

@jonbob jonbob merged commit 7591f33 into E3SM-Project:master Nov 8, 2022
@jonbob
Copy link
Contributor

jonbob commented Nov 8, 2022

merged to master

@cbegeman
Copy link
Contributor Author

cbegeman commented Nov 8, 2022

Thanks for your testing @jonbob!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
BFB PR leaves answers BFB bug fix PR mpas-ocean
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Bugs in inactive top cells feature
6 participants