Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update CPP implementation #490

Merged
merged 19 commits into from
Jul 31, 2020
Merged

Update CPP implementation #490

merged 19 commits into from
Jul 31, 2020

Conversation

apcraig
Copy link
Contributor

@apcraig apcraig commented Jul 17, 2020

PR checklist

  • Short (1 sentence) summary of your PR:
    Update CPP implementation

  • Developer(s):
    apcraig

  • Suggest PR reviewers from list in the column to the right.

  • Please copy the PR test results link or provide a summary of testing completed below.
    Full test suite passed on 4 compilers on gordon.
    https://github.com/CICE-Consortium/Test-Results/wiki/cice_by_hash_forks#e96a7c89efd9b62c6354fc5ce85abd186a4387f6
    Full test suite on cheyenne,
    https://github.com/CICE-Consortium/Test-Results/wiki/cice_by_hash_forks#1f6ce894f5dd396fd1a370612ed5a45a3be01b6c
    the alt01 and alt03 tests fail with iobinary because those tests have use_bathymetry "true" which requires a netcdf file. The new netcdf checks catch this error.

  • How much do the PR code changes differ from the unmodified code?

    • bit for bit
    • different at roundoff level
    • more substantial
  • Does this PR create or have dependencies on Icepack or any other models?

    • Yes
    • No
  • Does this PR add any new test cases?

    • Yes
    • No
  • Is the documentation being updated? ("Documentation" includes information on the wiki or in the .rst files from doc/source/, which are used to create the online technical docs at https://readthedocs.org/projects/cice-consortium-cice/. A test build of the technical docs will be performed as part of the PR testing.)

    • Yes
    • No, does the documentation need to be updated at a later time?
      • Yes
      • No
  • Please provide any additional information or relevant details below:

  • Rename ncdf to USE_NETCDF (ncdf still works) and update USE_NETCDF implementation

  • Update NO_I8

  • Update NO_R16

  • Remove popcice

  • Convert ORCA_GRID CPP to orca_halogrid namelist

  • Convert a RASM_MODS CPP to bathymetry_format namelist to support reading a pop vertical grid file

  • Convert gather_scatter_barrier CPP to add_mpi_barriers namelist

  • Document

See #288

apcraig added 2 commits July 17, 2020 17:45
Rename ncdf to USE_NETCDF (ncdf still works) and update USE_NETCDF implementation
Update NO_I8
Update NO_R16
Remove popcice
Convert a RASM_MODS CPP to bathymetry_format namelist to support reading a pop vertical grid file
Convert gather_scatter_barrier CPP to add_mpi_barriers namelist
Document
@apcraig
Copy link
Contributor Author

apcraig commented Jul 17, 2020

I will test on cheyenne when it's available before merging. I want to confirm all the pio tests are working.

@apcraig apcraig marked this pull request as ready for review July 17, 2020 17:56
@apcraig apcraig mentioned this pull request Jul 17, 2020
@dabail10
Copy link
Contributor

Do we ever compile the code without MPI? If so, then the add_mpi_barriers namelist flag might create a problem.

@apcraig
Copy link
Contributor Author

apcraig commented Jul 17, 2020

The add_mpi_barriers only used in infrastructure/comm/mpi/ice_gather_scatter.F90 which assumes MPI exists. If there is no mpi, then the infrastructure/comm/serial code should be used and no mpi calls are there. For MPI, we split the code into serial or mpi and compile only what we want. I think we should continue to discuss whether we like this approach or whether it's better to merge the serial and MPI versions and then add a NO_MPI CPP. There are pluses and minuses to both approaches.

@dabail10
Copy link
Contributor

Sounds good with the barriers. The concern is that the serial code may use serial MPI or no MPI.

@apcraig
Copy link
Contributor Author

apcraig commented Jul 17, 2020

Why would a coupled model that has MPI ever choose to use the serial comm code? What is "serial MPI"? The serial code in CICE exists for cases/configurations where there is no MPI. If there is MPI, even if running on 1 pe or some other mode, users should use the mpi comm code. I would propose we get rid of all the "coupled" code in cicecore/cicedynB/infrastructure/comm/serial. Does CESM or any coupled model ever use the serial code when running with MPI?

@dabail10
Copy link
Contributor

I believe serial MPI is just a way to use the MPI interfaces, but on one processor, or perhaps with OMP threads. So a model like CESM that can run on cheyenne or on a laptop needs to be flexible in that way. Perhaps this serial MPI idea is never used. I'd like to hear from others if they use it. I can do some asking to see.

@apcraig
Copy link
Contributor Author

apcraig commented Jul 20, 2020

The cheyenne results are here,

https://github.com/CICE-Consortium/Test-Results/wiki/cice_by_hash_forks#1f6ce894f5dd396fd1a370612ed5a45a3be01b6c

2 tests are failing per compiler, that is the iobinary alt01 and alt03. This is expected now. alt01 and alt03 set use_bathymetry to true and the bathymetry file is netcdf. With updated netcdf CPP and new checks, this should fail. There are several options to move forward. We can leave this as is and treat these failures as a correct result. We can implement a binary version for reading the bathymetry file. Or we can remove the alt01 and alt03 tests with iobinary.

Otherwise, this PR is now ready to merge.

@eclare108213
Copy link
Contributor

I'd prefer that we fix the failing tests. alt01 doesn't make a lot of sense, to have bathymetric stress on while kdyn=0 (which is off, right?). Suggestion: in alt01, set

basalstress = .false.
use_bathymetry = .false.

and in alt03, where the bathymetric could be exercised, change the I/O to netcdf?

We/I do need to go through all of these tests and make sure they are sensible.

@apcraig
Copy link
Contributor Author

apcraig commented Jul 20, 2020

I'm happy to update a bunch of tests and agree they should be reviewed. These tests are run on cheyenne specifically. What we're doing is running the same suite of tests with 4 io options; binary, netcdf, pio1 and pio2. That suite is hopefully valdiating IO in many different configurations. It's also turning on ALL the possible history variables. So we already have equivalent tests with netcdf, pio1, and pio2, so that already works. It's just the binary tests are failing when use_bathymetry is false because of some of the new logic designed to catch those issues as well as truly turn off netcdf when the CPP is not set. What was happening before is that netcdf was still being compiled in some places in the code even when -Dncdf was not set.

If we change the alt01 configuration, then the results will change on all machines. I propose we leave things as they are now with regard to testing and then create a PR that specifically updates the test configuration settings after review of all tests. That would be run separately on several machines to verify answer changes and make sure all the tests run. In other words, I sort of prefer to separate this PR and changes to the test suite. Does that make sense?

@eclare108213
Copy link
Contributor

So the I/O tests are a separate set, not part of base_suite? Then your suggestion makes sense, and I'll still offer my suggestion for the future PR reviewing/changing the basic tests.

@apcraig
Copy link
Contributor Author

apcraig commented Jul 21, 2020

The I/O tests are a separate suite that is just run on cheyenne.

@eclare108213, if you want to suggest some changes to current tests or some new tests, I'm happy to implement them, test them on several platforms, and create a PR. One thing I'll request is a test that turns on modal aerosols. I think that's the next bullet on my list to improve coverage. Any by the way, the code coverage results are here, https://apcraig.github.io/ in case that drives any discussion. I would love to help more in this area, but I don't have a good feel for what options work well together (or don't). #437 is an issue where some of this could be discussed if needed. Thanks!

@apcraig
Copy link
Contributor Author

apcraig commented Jul 24, 2020

I think this is ready for merging, just waiting for some review. Once this is merged, I will create a PR for the "coupled" CPP based on discussions in #288.

@phil-blain
Copy link
Member

@apcraig OK I'll try to take a look this afternoon. Just a heads up, I'm doing the Argonne Training Program in Extreme Scale Computing next week and the week after so I'll have a lot less time.

@phil-blain
Copy link
Member

I'd really like to take a look though.

Copy link
Contributor

@eclare108213 eclare108213 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I understand the changes here, and they look okay to me - nice clean-up of some messy code. Would be good to have more eyes on it.

@TillRasmussen
Copy link
Contributor

I have mostly looked at the netcdf and they generally look fine. I still think that some of the cpp flags USE_NETCDF can be removed if a few optional parameters are added to ice_open_nc (for instance a check of dimension size).I have not tried to compile nor compared two runs but i can go through and change these in a new pull request. When this is approved.

With regards to reading the bathymetry then this is only implemented to read the depth for basal stress (please correct me if I am wrong). This is definitely not used when kdyn is 0 nor when basalstress=.false. This will also be needed if grounded icebergs are included. Therefore I think that one could add a if (kdyn>0) && basalstress && icebergs (when included) ) get_bathymetry or get_bathymetry_pop. This would avoid reading the bathymetry when it is not needed.

Copy link
Member

@phil-blain phil-blain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had time to review ice_forcing.F90 this morning. Here are my comments. I'll continue my review tomorrow (or maybe later today).

Comment on lines -970 to -972
#else
field_data = c0 ! to satisfy intent(out) attribute
#endif
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this code is deleted because it should never be reached ? is that the logic ? we should never call read_data_nc if USE_NETCDF is not active ? (I'm not familiar yet with this part of the code, sorry if I'm asking questions that seem evident).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In principle it should not reach this part, however if it for whatever reason reaches this part anyway then it should fail in the call to ice_open_nc as the netcdf file cannot be read. Here should be an abort and a cpp flag.

read_data_nc do not use the netcdf library therefore it should be able to run through without warnings.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If read_data_nc doesn't use the netcdf library at all then it shouldn't have the _nc suffix on the subroutine name. I think this routine has evolved over time.

The field_data=c0 line was necessary years ago because a compiler (I don't remember which one, most likely intel) wouldn't make it through the build without it, when the cpp was not invoked.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not need the protection of the USE_NETCDF or the #else here. The code will build fine, there are no calls directly to netcdf in this subroutine. What it does is call _nc routines which will abort if USE_NETCDF is not set. So anytime this routine is called when USE_NETCDF is set, it should work. Anytime this routine is called when USE_NETCDF is not set, it will abort in a lower level subroutine. I think that's the right behavior.

@@ -3664,7 +3649,6 @@ subroutine ocn_data_ncar_init
'T', 'S', 'hblt', 'U', 'V', &
'dhdx', 'dhdy', 'qdp' /

#ifdef ncdf
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why was this one not converted to USE_NETCDF ? this will cause warnings if we eventually clean up the code so that it compiles without warnings with -Wall (which I think we should do eventually...)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that it is ok to remove it but at the same time on line 3704 and onwards there are calls to nf90_inq_dimid and nf90_inquire_dimension. I think that these should be moved to ice_open_nc as optional parameters. Then the surrounding USE_NETCDF should also be removed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ifdef on line 3667 is not needed. The check at line 3704 takes care of it. I'm not sure I see a problem in the section of code between 3666 and 3685 that would create a problem for -Wall (and I agree we should work towards that capability). The code refactoring here was for the cpps, not netcdf. I agree there are many things that could be done to improve the netcdf implementation or even the io implementation overall (like merging the io_binary, io_netcdf, and io_pio and make them run-time settable on a per file type basis). At this point, I think a netcdf cleanup should be done separately. But I agree with the strategy that @TillRasmussen has outlined in terms of improving the code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My point was that if we remove the CPP around the variable declarations, and compile with -Wall and without USE_NETCDF, the compiler will complain that these variables are declared but unused. It's a minor issue but I thought I'd point it out.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, good point.

@@ -3741,7 +3724,10 @@ subroutine ocn_data_ncar_init
enddo ! month loop
enddo ! field loop

if (my_task == master_task) status = nf90_close(fid)
if (my_task == master_task) call ice_close_nc(fid)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The if (my_task == master_task) check seems redundant, since it's done in ice_close_nc, no?

Copy link
Contributor

@TillRasmussen TillRasmussen Jul 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes (as in I agree)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The master_task if is redundant. I did not want to change the current implementation too much so I left it as is. The redundancy is not a problem. I agree this could be cleaned up. But once you start down that path, there is a lot to do. Just look at the whole ocn_data_ncar_init routine as one example. My goal here was to change the nf90_close to a call to ice_close_nc because it was easy to do and then matches the ice_open_nc above.

@@ -3902,7 +3888,7 @@ subroutine ocn_data_ncar_init_3D
enddo ! month loop
enddo ! field loop

if (my_task == master_task) status = nf90_close(fid)
if (my_task == master_task) call ice_close_nc(fid)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same thing here.

@@ -5078,8 +5040,7 @@ subroutine ocn_data_ispol_init
enddo ! month loop
enddo ! field loop

if (my_task == master_task) status = nf90_close(fid)
#endif
if (my_task == master_task) call ice_close_nc(fid)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And here.

write (nu_diag,*) "wave spectrum file not available, using default profile"
call abort_ice (subname//'ERROR: wave_spec_file '//trim(wave_spec_file), &
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we aborting here when we weren't before? If we are aborting we should not output "using default profile", no?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned earlier I think that the abort should be in the ice_open_nc

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did add the abort because I thought it was the right thing to do. I think it's dangerous for a user to setup use of the wave spec file and then for it to relatively silently revert to the default values at runtime. What should happen is the user should modify their namelist so it uses the default values and does not try to read a file. That would give the same result but be explicit. Another option would be to add a binary file reading capability for when netcdf is not available, but that's probably not the way to go at this time. I would be interested in hearing what others think about this silent mode. We have shifted the overall implementation from being one that changes the user settings at runtime to fix conflicts to one that aborts when there are conflicts. I think this is a place where we want to follow that strategy.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that it is dangerous to revert to a default setting if the file is missing. This may happen from time to time in a operational setup due to delays of a required data flow. The abort option is better.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That indeed makes sense. This is the kind of explanations that I think we should put in commit messages :) This way we can read the commit messages and look at the changes and understand faster why these changes are made.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, we should do a better of explaining changes in either our commit messages or the PR. I'll try to do better.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a note, I would prefer we use the commit messages; this way the information is encoded in the repo. If GitHub goes under someday for whatever reason, the commit messages stay, but the PR discussions might be more difficult to retrieve. I think I read somewhere (a very early PR?) that the GitHub projects for CICE and Icepack themselves are backed up somewhere (how regularly?), but this is not documented anywhere...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just added a couple sentences to the git workflow guide to indicate we want comprehensive commit messages. I agree it's the right way to go.

@eclare108213
Copy link
Contributor

Thanks @phil-blain @TillRasmussen for looking at this code. As you've likely surmised, it's more of a "working code" than the rest of the model. Over the years I allowed individual institutions/people to add what they needed for their projects, without worrying too much about SE. The newer forcing routines were never fully supported by me (since I didn't have the configuration files) and the forcing module was always intended to just be guidance for other users. So I'll take the blame for the mess. This part of the code probably could also use a "deep cleaning" aka refactor to make it more elegant, robust, extendable, etc., if we think it's worth doing (but not in this PR).

@apcraig
Copy link
Contributor Author

apcraig commented Jul 28, 2020

Thanks @eclare108213, I was going to make similar comments. I think the IO could be cleaned up in smaller and/or bigger ways as suggested above. But I think that's a bigger task that needs some requirements and design steps. This PR is really focused on the CPPs. The main issue with the netcdf was that when USE_NETCDF (ncdf) was not set, there were netcdf calls that were still being compiled. This was generally not a problem because users generally load netcdf so it could compile, but that's not correct. My feeling is that if we are going to have a USE_NETCDF flag, it should be implemented properly. I have verified that the model compiles if I don't load netcdf and I don't set USE_NETCDF (which it wasn't doing before). I have also verified that the code aborts if USE_NETCDF is not set when it gets to a netcdf file (like bathymetry), so that is also working better now.

@@ -661,6 +677,7 @@ subroutine read_restart_field(nu,nrec,work,atype,vname,ndim3, &

character(len=*), parameter :: subname = '(read_restart_field)'

#ifdef USE_NETCDF
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this ifdef is not needed, I don't see any direct calls to NetCDF subroutines in read_restart_field.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're right. Every other subroutine in this file has the USE_NETCDF logic, so I might argue we should keep it here too for consistency and in case direct NetCDF calls are added later. I guess I propose we keep it for now.

Copy link
Member

@phil-blain phil-blain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a few comments after looking at the documentation changes.

doc/source/cice_index.rst Outdated Show resolved Hide resolved
doc/source/user_guide/ug_case_settings.rst Outdated Show resolved Hide resolved
"",""
"**General Directives**", ""
"CESM1_PIO", "Provide backwards compatible support for PIO interfaces/version released with CESM1 in about 2010"
"coupled", " "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this intentional to not describe "coupled" ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, because I don't know what to say. It does a bunch of things. I have already started working on a new PR that removes the coupled CPP flag following some discussion in #288.

doc/source/user_guide/ug_case_settings.rst Outdated Show resolved Hide resolved
doc/source/user_guide/ug_running.rst Outdated Show resolved Hide resolved
doc/source/user_guide/ug_running.rst Outdated Show resolved Hide resolved
doc/source/user_guide/ug_running.rst Outdated Show resolved Hide resolved
doc/source/user_guide/ug_running.rst Outdated Show resolved Hide resolved
apcraig and others added 8 commits July 29, 2020 09:08
Co-authored-by: Philippe Blain <levraiphilippeblain@gmail.com>
Co-authored-by: Philippe Blain <levraiphilippeblain@gmail.com>
Co-authored-by: Philippe Blain <levraiphilippeblain@gmail.com>
Co-authored-by: Philippe Blain <levraiphilippeblain@gmail.com>
Co-authored-by: Philippe Blain <levraiphilippeblain@gmail.com>
Co-authored-by: Philippe Blain <levraiphilippeblain@gmail.com>
Co-authored-by: Philippe Blain <levraiphilippeblain@gmail.com>
@apcraig
Copy link
Contributor Author

apcraig commented Jul 31, 2020

@phil-blain, do you want more time to review this PR? Trying to get a few things merged today if possible. But happy to wait if that's better.

Copy link
Member

@phil-blain phil-blain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Every thing looks good, merge away!

@apcraig apcraig merged commit 819eedd into CICE-Consortium:master Jul 31, 2020
phil-blain added a commit to phil-blain/CICE that referenced this pull request Aug 13, 2020
Bring the name of the new macro 'CICE_USE_LAPACK' more in line with the
changes to the CPP macros in 819eedd (Update CPP implementation (CICE-Consortium#490),
2020-07-31) by renaming it to 'USE_LAPACK'.
phil-blain added a commit to phil-blain/CICE that referenced this pull request Aug 25, 2020
Bring the name of the new macro 'CICE_USE_LAPACK' more in line with the
changes to the CPP macros in 819eedd (Update CPP implementation (CICE-Consortium#490),
2020-07-31) by renaming it to 'USE_LAPACK'.
apcraig pushed a commit that referenced this pull request Sep 22, 2020
* initial coding of ice_dyn_vp.F90

* code (matvec, bvec and residual) now compiles after debugging

* introduced calc_vrel_Cb and calc_visc_coeff subroutines and precond.F90 file

* after debugging...now compiles

* in process of adding precondD

* precondD is in progress

* I finished coding precondD...it compiles

* added calculation for L2norm

* modif to ice_step_mod.F90 for call to imp_solver

* now form complete vectors by alternating u(i,j) and v(i,j) and so on

* added simple routine to form a vector

* corrected bug for vectors of size ntot

* added vectors of size ntot to bvec and matvec

* step back...not so easy to form all the vectors for the blocks...

* new subroutine for assembling ntot vectors

* In the process of adding the call to fgmres

* new subroutine to convert vector to the max_blocks arrays

* ice_dyn_vp compiles...I now need to make fgmres.F90 compile...

* added BLAS routines needed by fgmres

* added the fgmres.F90 routine...does not compile yet...

* minor modif to fgmres.F90

* bug in call to arrays_to_vec instead of vec_to_arrays

* Remove fgmres.F90 (MPI version, will re-add GEM version later)

This file 'include's mpif.h and so breaks serial compilation.
I will copy the FGMRES solver from GEM later on and modify
it to call the CICE communication routines.

* added serial fgmres for testing

* modified code for serial tests

* after debugging...

* added code to put sol vector in uvel and vvel arrays at the end of fgmres

* modified tinyarea calc for VP (punyVP=2e-9)

* modified location of allocation for some variables...

* small bug corrected in allocating variables

* introduced capping of the viscous coeff following Kreysher

* added a relaxation to uvel,vvel to try to get nonlinear conv

* modif to relaxation calc...minor

* I found a bug in my 1st implementation. The dsig/dx term related to the rep pressurer should be on the RHS...in the process of correcting this

* Pr is now calc with zeta

* dPrdx is now in calc_bvec

* removed Pr part from stress_vp calc

* minor modifs to improve print of res norm in fgmres

* added linear conv criterion based on a gamma

* added NL conv criterion

* in the process of adding precond diag

* now forms the vector diagvec

* precond diag is completed and compiles

* diagvec was not allocated...now ok

* diagvec was not deallocated...now ok

* fixed bug in formDiag_step2

* small modif

* small modif...again

* modified formation of diagonal for precond

* just removed a comment

* minor modif to formDiag_step1

* added an option for precond (free drift diag or complete diag

* minor modif

* minor modif

* removed OLDformDiag_step1

* Ok I think I know what was wrong with the precond diag...I have to use icellu and not icellt

* precond 3 works but not as good as precond 2 (free drift)...removed OLD code

* ok found the problem in step2...now fixed

* added routine to calc deformations for mech redistribution...for the evp this is done for k=ndte in stress...here it is done once we have NL convergence

* chnaged names of evp_prep1 to dyn_prep1...

* in process of adding gmres as precond to fgmres...

* matvec is now done in one routine instead of in two steps before

* in the process of adding gmres precond

* finalizing implementation of gmres for precond

* pgmres seems to work...I used it as a solver and it works fine

* minor modif: introduction of maxits_pgmres

* modif to comments...very minor

* something was inconsistent for precond choice...now fixed and removed one precond option (free drift)

* removed krelax...not needed as a more complex method will be added

* in the process of adding the RRE1 acceleration method

* fixed problem in halo update (spherical grid...wraps on itself) before matvec

* cicecore: add puny_dyn variable to initialize tinyarea, remove temporary punyVP variable

* ice_dyn_vp: move solver algorithm and output parameters to namelist

Also, since 'yield_curve' and 'e_ratio' are used for both VP and EVP dynamics,
print their values for both solvers.

* {f,}gmres: use nu_diag for output

* dynamics: correct 2 bugs found using debugging flags

- eps1 was printed before being initialized in pgmres
- a loop on iblk was missing around the call to deformations in ice_dyn_vp

* ice_dyn_vp: rename VP namelist variables to be more descriptive

Also, add namelist flag to monitor nonlinear convergence
and document VP namelist variables in runtime log.

* pgmres: streamlined monitoring output

Bring the PGMRES output more in line with the FGMRES and nonlinear
solvers.

* ice_dyn_vp: add computation of nonlinear and fixed point residuals

* ice_dyn_vp: remove references to RRE acceleration method

We will not use this acceleration method,
so remove remaining references to it.

* ice_dyn_vp: remove unused 'ulin', 'vlin' variables

* ice_dyn_vp: move Picard iteration to separate subroutine

Make the code more modular by moving the Picard solver
to its own subroutine.

Also, rename 'kmax' to 'maxits_nonlin' to bring it in line with
'maxits_{f,p}gmres'.

* ice_dyn_vp: add Anderson acceleration for Picard iteration

* Remove nonlinear convergence logic from fgmresD.F90 (move to picard_solver in ice_dyn_vp.F90)
* Add ouput of residual norm to fgmresD.F90
* Remove unused variables (fgmresD.F90, pgmres.F90, ice_dyn_vp.F90)
* Fix warning for format 1008 (ice_init.F90)
* Move FGMRES solver to separate subroutine (ice_dyn_vp.F90)
* Remove unused use statements in picard_solver (ice_dyyn_vp.F90)

* options: add options files for implicit solver

- diag_imp sets all monitoring options (nonlin, {f,p}gmres) to on
- dynanderson sets the implicit solver to on and the nonlinear algorithm
  to Anderson acceleration
- dynpicard sets the implicit solver to 'on' and the nonlinear algorithm
  to Picard iteration
- run3dt runs the model for three time steps

* ice_dyn_vp: correct residual computations if max_blocks > 1

The way we compute the nonlinear residual norm does not take into
account that `L2norm` is an array of size `max_blocks`.

Fix that by summing the components, squared, and taking the square root
(we need to square the components because the subroutine 'calc_L2norm'
returns the square root).

* ice_dyn_vp: differentiate fixed point and progress residuals

For Anderson acceleration, the fixed point residual (g(x) - x) is not
the same as the 'progress residual' (u-u_old).

Make this distinction clear in the code and in the monitoring output.

* ice_dyn_vp: fix convergence issues in Picard and Anderson solvers

Both solvers currently converge, but the solution (uvel,vvel)
is not qualitatively the same as the EVP solution; velocities are
too small.

For picard_solver, it is due to the fact that the initial residual
stored and used to check for convergence is not the initial residual at
the start of the nonlinear iteration, but the last residual at the end
of the first FGMRES iteration. Since this residual is smaller,
convergence is "faster" but the solution is not the same; it is not
converged enough in the sense of the nonlinear residual. This is an
implementation error that was introduced when the nonlinear convergence
logic was removed from the fgmres solver.

For anderson_solver, it is due to using the fixed point residual norm to
check for convergence.  Similarly, this residual is smaller than the
nonlinear residual and thus leads to a "faster" convergence, but to a
different solution (not converged enough in the sense of the nonlinear
residual).

Fix both errors.

* ice_dyn_vp: add (ulin,vlin) to compute vrel based on 2 previous iterates

Initial test shows that using the mean of the 2 previous iterates to
compute vrel accelerates convergence for Picard iteration (this is
robust).

For Anderson it is sometimes better, sometimes worse, so by
default it is set to false in the namelist.

This also removes the oscillation produced by Picard iteration, which
is in line with results in doi:10.1029/2008JC005017

* machines: cesium: add LAPACK and BLAS library to Macros

* ice_dyn_vp: add 'subname' variable to each subroutine

* ice_dyn_vp: cleanup 'icepack_warnings_*' calls

Some calls to Icepack warning interfaces are not needed; remove them.

Another call is misplaced; move it to the right place.

* ice_dyn_vp, pgmres: remove unused arguments, variables and 'use' statements variables

* ice_dyn_vp: refactor 'puny' treatment for implicit solver

The way 'puny_dyn' is set in the 'cice' driver, and in 'ice_constants'
and 'ice_grid' is kind of hacky. Revert those changes.

Instead, add an 'init_vp' subroutine that is called from the drivers and
that recomputes 'tinyarea' with the new 'puny_vp' value.

Call 'init_evp' from 'init_vp', mimicking what is done ine 'init_eap'.

* ice_dyn_vp: make 'calc_bvec' use already computed 'vrel'

Pass the already computed 'vrel' as an argument instead of recomputing
it.

* ice_dyn_vp: remove references to fgmres2 subroutine

We still reference 'fgmres2', an  MPI implementation of FGMRES that was
never completely integrated with the implicit solver.

Remove references to it as well as its arguments.

* dynamics: remove 'BLAS_routines.F90'

It is a best practice to use the BLAS library natively available on a
system, or install a high performance implementation like ATLAS,
OpenBLAS, etc.

Remove the file 'BLAS_routines.F90', which contains BLAS routines copied
from the reference NETLIB implementation.

* ice_dyn_vp: synchronize 'imp_solver' with ice_dyn_evp::evp

The 'imp_solver' subroutine is the main driver of the implicit VP solver
and was copied from a very old version of the 'evp' EVP driver in
ice_dyn_evp.F90.

Bring 'imp_solver' in line with changes in 'evp', and move some
statements around to keep related statements together.

* dynamics: refactor strain rates and deformation calculations

Create subroutines for strain rates and deformations computations, add
them to ice_dyn_shared and call them in ice_dyn_evp and ice_dyn_vp.

This reduces code duplication.

* dynamics: add sol_fgmres2d from GEM as ice_krylov.F90 (does not compile)

Copy (verbatim) the 2D FGMRES solver from the GEM atmospheric model
as 'ice_krylov.F90'.

* ice_krylov: adapt FGMRES algo up to inner loop (WIP)

* ice_krylov: adapt FGMRES algo up to preconditioner call (WIP)

* ice_krylov: adapt FGMRES algo up to before CGS (WIP)

* ice_krylov: adapt FGMRES algo up to end of Arnoldi (except 1st GS loop) (WIP)

* ice_krylov: adapt FGMRES algo up to updating of solution (WIP)

* ice_krylov: adapt FGMRES algo up to residual calculation (WIP)

* ice_krylov: correct CICE kinds in function almost_zero

* comm: add 'global_sums' module procedure

Add a 'global_sums' module procedure that computes the global sum of
several scalars, held in a 1D array ('vector').

Make use of the existing 'compute_sums_dbl' subroutine.

* ice_krylov: adapt FGMRES including Gram-Schmidt (WIP)

* ice_krylov: add 'pgmres' subroutine (mostly the same as 'fgmres')

The PGMRES algorithm is the same as using FGMRES with the same
preconditioner at each iteration.

However, in order to reuse the FGMRES code the 'fgmres' subroutine would
have to be a recursive subroutine, which might have performance
implications.

For now, add it a separate subroutine.

* ice_krylov: move subroutines to ice_dyn_vp to work around circular deps

'ice_krylov' and 'ice_dyn_vp' modules each 'use' variables from each
other, which makes them impossible to compile in a clean build.

Transfer the subroutines in 'ice_krylov' to 'ice_dyn_vp'.

While at it, update the public interface of module ice_dyn_vp
to only expose what is used in other modules.

Also, make 'icellu', 'icellt' and 'indxtij' et al. module variables, to
reduce subroutine argument counts.

In the same spirit, add use statements.

* ice_dyn_vp: add 'subname' to subroutine 'almost_zero' and correct indentation

* ice_dyn_vp: add 'ice_HaloUpdate_vel' subroutine

Introduce the subroutine 'ice_HaloUpdate_vel', which can be used to do
halo updates for vector field, such as the ice velocity (uvel,vvel).

* ice_dyn_vp: make 'fld2' a module variable

Reduce subroutine argument counts by making 'fld2' a module variable, and
allocate it in 'init_vp'.

* ice_dyn_vp: move 'ice_halo' to module 'use' statement

* ice_dyn_vp: add workspace initialization to 0 and haloupdate before matvec in fgmres (WIP)

* ice_dyn_vp: add choice between classical and modified Gram-Schmidt orthogonalization

The FGMRES solver from GEM only has classical Gram-Schmidt (CGS), but
modified Gram-Schmidt (MGS) is more robust. Preliminary tests show that
the convergence of the non-linear solver is seriously degraded when CGS
is used instead of MGS inside FGMRES.  This was tested by changing MGS
to CGS in the Saad FMGRES implementation :

diff --git a/cicecore/cicedynB/dynamics/fgmresD.F90 b/cicecore/cicedynB/dynamics/fgmresD.F90
index 5a93e68..c7d49fa 100644
--- a/cicecore/cicedynB/dynamics/fgmresD.F90
+++ b/cicecore/cicedynB/dynamics/fgmresD.F90
@@ -192,12 +192,13 @@ subroutine fgmres (n,im,rhs,sol,i,vv,w,wk1, wk2, &
 !      if (icode .eq. 3) goto 11
       call  dcopy (n, wk2, 1, vv(1,i1), 1) !jfl modification
 !
-!     modified gram - schmidt...
+!     classical gram - schmidt...
 !
       do j=1, i
-         t = ddot(n, vv(1,j), 1, vv(1,i1), 1) !jfl modification
-         hh(j,i) = t
-         call daxpy(n, -t, vv(1,j), 1, vv(1,i1), 1) !jfl modification
+         hh(j,i) = ddot(n, vv(1,j), 1, vv(1,i1), 1) !jfl modification
+      enddo
+      do j=1, i
+         call daxpy(n, -hh(j,i), vv(1,j), 1, vv(1,i1), 1) !jfl modification
       enddo
       t = sqrt(ddot(n, vv(1,i1), 1, vv(1,i1), 1)) !jfl modification
       hh(i1,i) = t

Add an 'orthogonalize' subroutine that can do either CGS or MGS to
abstract away the orthogonalization procedure.  Add a namelist parameter
'ortho_type' (defaulting to 'mgs') so that the user can change at run
time if the FGMRES solver should use CGS or MGS as an orthogonalization
method.
Note: at the moment the 'pgmres' subroutine is still hard-coded to use CGS.

* ice_dyn_vp: change order of arguments for 'precondition' subroutine

The arguments with 'intent(out)' and 'intent(inout)' are usually listed
last in the argument list of subroutines and functions. This is in line
with [1].

Change 'precondition' to be in line with this standard.

[1] European Standards for Writing and Documenting Exchangeable Fortran 90 Code,
https://wiki.c2sm.ethz.ch/pub/COSMO/CosmoRules/europ_sd.pdf

* ice_dyn_vp: fgmres: correctly initialize workspace_[xy] and arnoldi_basis_[xy]

A previous commit only initialied them in pgmres.
This should be squashed with 1c1b5cf (Add workspace initialization to 0
and haloupdate before matvec in fgmres (WIP), 2019-07-24)

* ice_dyn_vp: matvec: change Au,Av from 'intent(out)' to 'intent(inout)'

According to The Fortran 2003 Handbook [1], the Fortran standard states
that arguments with 'intent(out)' are left uninitialized if the
subroutine/function does not set them. This is dangerous here since the
'Au', 'Av' arguments are set in 'matvec' only for grid cells where there
is ice, so any grid cells with no ice that were previously initialized
(eg. to zero) in the actual arguments could be left with garbage values
after the call to 'matvec' (this does not happen with the Intel compiler
but it's still better to follow the standard).

[1] J. C. Adams, W. S. Brainerd, R. A. Hendrickson, R. E. Maine, J. T. Martin,
    and B. T. Smith, The Fortran 2003 Handbook. London: Springer London, 2009.

* ice_dyn_vp: clarify decriptions of FGMRES variables

User 'Arnoldi' and 'restarts' in addition to 'inner' and 'outer'.

* ice_dyn_vp: add subroutines description and clarify existing ones

* ice_dyn_vp: fix comments in FGMRES algorithm

The next block does not only compute the new Givens rotation, it applies
it, so tweak the comment to be in line with the code.

* ice_dyn_vp: fgmres: fix misleading check for return

Using `outiter > maxouter` means that the algorithm would always perform
at most `maxouter+1` outer (restarts) iterations, since at the end of the
`maxouter`-th iteration we would have `maxouter > maxouter`, which is false,
so the algorithm would do a last outer iteration before returning.

Fix this by using ">=" so that the `maxouter` value is really the maximum number of
outer iterations permitted.

This error was in the GEM version.

* ice_dyn_vp: fgmres: fix error in the last loop computing the residual

workspace_[xy] was rewritten at each iteration instead of being
incremented.

Fix that.

* ice_dyn_vp: change the definition of F(x) from 'A.x - b' to 'b - A.x'

This makes the result of the 'residual_vec' subroutine conform with the
definition of the residual in the FGMRES algorithm (r := b - A.x).

This should have no other impact than making the new fgmres subroutine
actually work.

* ice_dyn_vp: fgmres: comment out early return (WIP)

This is different than the legacy FGMRES and thus needs to be commented
to compare both versions of the algorithm.

* ice_dyn_vp: pgmres: synchronize with fgmres (WIP)

Also remove unnecessary return statement at the end of fgmres.

* ice_dyn_vp: picard_solver: re-enable legacy solver

Rename the subroutines fgmres and pgmres in standalone files
with a '_legacy' suffix.

* cicecore: remove remaining 'SVN:$Id:' lines

* ice_dyn_vp: remove 'picard_solver' subroutine

The 'picard_solver' subroutine has been abandoned for quite a while
and the Picard algorithm can be used by using the Anderson solver
with `im_andacc = 0`, i.e. not saving any residuals.

Remove the subroutine and move the logic around the namelist variable
'algo_nonlin' from 'imp_solver' to 'anderson_solver'.

This way, the new developments made in 'anderson_solver' (using the FGMRES
solver from GEM, choosing the orthogonalization method, the addition of the
precondition subroutine) can also be used with `algo_nonlin = 1`.

* ice_dyn_vp: remove legacy FGMRES solver

The new FGMRES solver adapted from GEM was already verified to give the
same results as the legacy implementation.

Remove the legacy implementation.

* ice_dyn_vp: pgmres: use Classical Gram-Schmidt

The legay PGMRES preconditioner uses Classical Gram-Schmidt
for orthogonalization, so in order to compare the new solver
with the old one we must use CGS in PGMRES.

A following commit will add a namelist variable to control this
behavior.

* ice_in: change maxits_{f,p}gmres to be in line with new meaning

Change the reference namelist to conform with the new  meaning of
`maxits_{f,p}gmres`, which is different between the old and the new
{F,P}GMRES solvers:

- in the old solvers 'maxits_*' is the total number of inner (Arnoldi) iterations
- in the new solvers 'maxits_*' is given as the 'maxouter' argument,
  i.e. the total number of outer (restarts) iterations and so
  'maxits' from the old solvers correspond to 'maxouter*maxinner'

This means that in the namelist 'maxits_*' must be set to 1 if it was previously
set to the same thing as 'im_*' (i.e., do not do any restarts).

* ice_dyn_vp: uniformize naming of namelist parameters for solver tolerances

'gammaNL', 'gamma' and 'epsprecond' refer respectively to the relative tolerances
of the nonlinear solver, linear solver (FGMRES) and preconditioner (PGMRES).

For consistency and to use meaningful names in the code, rename them to 'reltol_nonlin',
'reltol_fgmres' and 'reltol_pgmres'.

* ice_dyn_vp: anderson_solver: adapt residual norm computations for MPI

The norm of the different residuals need to be computed using the squared
value of all components across MPI processes.

We keep the 'res' and 'fpfunc' vectors as 1D vectors for now, to minimize code
changes. This will be revised when the Anderson solver is parallelized.

* ice_dyn_vp: write solver diagnostics only for master task

Note that care must be taken not to call 'global_sum' inside an
`if (my_task == master_task)` construct, as this will create an incorrect
communication pattern and ultimately an MPI failure.

* ice_dyn_vp: remove references to EVP

Reomve leftovers references from the initial copy of ice_dyn_evp.F90

* ice_dyn_vp: rename 'puny_vp' to 'min_strain_rate'

This value really is a minimum strain rate, so name it as such.

* ice_dyn_vp: remove "signature" comments from JF

These comments do not add value to the code, remove them.

* ice_dyn_vp: write stresses at end of time step

In the EVP solver, stress are subcycled using the global
'stress{p,m,12}_[1-4]' arrays, so these arrays contain the subcycled
stresses at the end of the EVP iterations.

These global arrays are used to write the stresses to the history tape
if the stress variables are enabled in the namelist.

In the VP solver, stresses are not explicitly computed to solve the
momentum equation, so they need to be computed separately at the end of
the time step so that they can be properly written to the history tape.

Use the existing, but unused, 'stress_vp' subroutine to compute the
stresses, and call it at the end of the 'imp_solver' driver. Remove
uneeded computations from 'stress_vp'.

Move the definition of the 'zetaD' array to 'imp_solver' so it can be
passed to 'stress_vp'.

* ice_dyn_vp: cleanup 'intent's

- move 'intent(out)' and 'intent(inout)' arguments to the end of the
  argument list (ice_HaloUpdate_vel is an exception, for consistency
  with other ice_HaloUpdate subroutines).
- move 'intent(in)' before any 'intent(out)' or 'intent(inout)'
- change 'intent(inout)' to 'intent(out)' if the argument is rewritten
  inside the subroutine

* dynamics: correct the definition of 'shearing strain rate' in comments

All occurences of 'shearing strain rate' as a comment before the
computation of the quantities `shear{n,s}{e,w}` define these quantities
as

    shearing strain rate  =  e_12

However, the correct definition of these quantities is 2*e_12.

Bring the code in line with the definition of the shearing strain rate
in the documentation (see the 'Internal stress' section in
doc/source/science_guide/sg_dynamics.rst), which defines

    D_S = 2\dot{\epsilon}_{12}

Correct these.

* ice_dyn_vp: correctly use 'subname' in 'abort_ice' calls

* ice_dyn_vp: add 'calc_seabed_stress' subroutine

Add a subroutine to compute the diagnostic seabed stress ('taubx' and
'tauby'), and call it from the 'imp_solver' driver.

Note that in EVP, this is done in the 'stepu' subroutine in
ice_dyn_shared.

* dynamics: move basal stress residual velocity ('u0') to ice_dyn_shared

Make 'u0' a module variable in ice_dyn_shared, and use for both EVP and
VP without duplicating the hard-coded value in the code.

* ice_dyn_vp: use 'deformations' from ice_dyn_shared

The 'deformations' subroutine is used in ice_dyn_evp, but not in
ice_dyn_vp.

* ice_dyn_vp: simplify 'calc_vrel_Cb' subroutine

Remove the uneeded variables 'utp', 'vtp' and directly use 'uvel' and
'vvel' instead.

* ice_dyn_vp: rename 'Diag[uv]' to 'diag[xy]' for consistency

The rest of the code uses 'x' ('y') as a suffix for the x- and y-
components of "vector field"-type arrays.

Bring 'Diag[uv]' in line with this convention, and loose the
capitalization.

* ice_dyn_vp: introduce 'CICE_USE_LAPACK' preprocessor macro

The 'anderson_solver' subroutine uses calls from the LAPACK library,
but these are strictly needed only for the Anderson solver, and not for
the Picard solver.

Introduce a preprocessor macro, 'CICE_USE_LAPACK', that can be used to
"preprocess out" any code that uses LAPACK calls.

Refactor the code so that the Picard solver works without LAPACK (this
is easy since we only have one call to 'dnrm2' to remove, and an
alternative implementation already exist, but is commented). Uncoment it
and add an 'ifdef' so that this code is used if 'CICE_USE_LAPACK' is not
defined.

Also, make the Anderson solver abort if CICE was compiled without
LAPACK.

These two changes make the model compile if LAPACK is not available.

Add an option file 'set_env.lapack' to automatically configure the model
compilation upon 'cice.setup' to use LAPACK (note that Macros and env
files for the chosen machine need to be modified to add LAPACK support;
at the moment only 'cesium' is supported).

* ice_dyn_vp: convert 'algo_nonlin' to string

Make the reference namelist easier to use by changing the type of the
'algo_nonlin' variable to a string.

This way, we can use 'picard' and 'anderson' instead of '1' and '2'.

* ice_dyn_vp: convert 'precond' to string

Make the namelist easier to use by converting the 'precond' namelist
variable to a string.

This way we can use 'ident', 'diag' or 'pgmres' instead of '1', '2' or
'3'.

* ice_dyn_vp: convert 'monitor_{f,p}gmres' to logical

Convert the namelist variables 'monitor_{f,p}gmres', which are used as
logical values, to actual logical values.

* ice_dyn_vp: remove unimplemented 'fpfunc_andacc' from the namelist

In the future, we might add a second fixed point function, g_2(x),
instead of the regular Picard fixed point function currently
implemented,

  g_1(x) = FGMRES(A(x), x0, b(x)) - x

Remove the namelist flag 'fpfunc_andacc' from the reference namelist,
and make the model abort if this flag is used, since only g_1(x) is
currently implemented.

* ice_dyn_vp: add input validation for string namelist variables

Add input validation in ice_init for 'algo_nonlin', 'precond' and
'ortho_type'.

Currently, the code that resets 'im_andacc' to zero if the Picard solver
is chosen (algo_nonlin = 1) is inside the 'anderson_solver' subroutine,
which is not clean because 'im_andacc' is used as a dimension for several
local variables. These variables would thus have whatever size
'im_andacc' is set to in the namelist, even if the Picard solver is
chosen.

Fix that by moving the code that sets 'im_andacc' to zero if the Picard
solver is chosen (algo_nonlin = 1) to ice_init.

* ice_dyn_vp: abort if Anderson solver is used in parallel

Since the Anderson solver is not yet parallelized, abort the model if
users attempt to run it on more than one proc.

* ice_dyn_vp: reimplement monitor_{f,p}gmres

The ability to monitor the residual norm of the FGMRES solver
and the PGMRES preconditioner was lost when we transitioned to the
new solver implementation.

Re-add this capability.

* ice_dyn_vp: use 'ice_HaloUpdate_vel' everywhere

By performing the halo update after creating the halo_info_mask in
imp_solver, we can use the ice_HaloUpdate_vel subroutine, increasing
code reuse.

* ice_dyn_vp: move ice_HaloUpdate_vel subroutine to ice_dyn_shared

Move the subroutine to ice_dyn_shared so it can be used in ice_dyn_evp
and ice_dyn_eap.

* ice_dyn_evp: use ice_HaloUpdate_vel

Now that the subroutine is in ice_dyn_shared, use it to increase
core reuse.

* ice_dyn_eap: use ice_HaloUpdate_vel

Now that the subroutine is in ice_dyn_shared,
use it to increase code reuse.

* options: set 'use_mean_vrel' in 'dynpicard'

Preliminary testing suggests convergence is improved when
using the mean of the two previous estimates to compute 'vrel'
when the Picard solver is used.

Set this as the default with option 'dynpicard'.

* conda: add 'libapack' to environment spec and Macros

Add the necessary conda package to build CICE with LAPACK,
so that the Anderson solver can be tested in the conda environment.

* ice_dyn_vp: initialize 'L2norm' and 'norm_squared' to zero

If we don't initialize thess arrays, using any MPI decomposition for which
nblocks /= max_blocks on some process will cause uninitialized values
to be summed when computing the local sum for each block, like in

    nlres_norm = sqrt(global_sum(sum(L2norm), distrb_info))

Initialize them to zero.

* ice_dyn_vp: initialize 'stPr' and 'zetaD' to zero

Since 'stPr' and 'zetaD' are initialized with a loop over 'icellt', but
are used on loops over 'icellu', it is possible that these latter loops
use uninitialized values since 'iceumask' and 'icetmask' can differ at
some grid points since they are not defined using the same criteria (see
ice_dyn_shared::dyn_prep1 and ice_dyn_shared::dyn_prep2).

To avoid using unitialized values, initialize the 'stPr' and 'zetaD'
arrays to zero, so that any index corresponding to a cell with no ice
is set to zero.

This mimics what is done for EVP, where the 'str' array is initialized
to zero in ice_dyn_evp::stress.

* doc: document implicit solver

* ice_dyn_vp: default 'use_mean_vrel' to 'true'

Previous testing reveals it's better to set 'use_mean_vrel' to true
(i.e., use the mean of the 2 previous nonlinear iterates to compute
`vrel`) for the Picard solver, but the picture is less clear for the
Anderson solver.

Since we are only advertising the Picard solver for now, make this flag
default to 'true'.

* ice_in: remove Anderson solver parameters

Since we are only advertising the Picard solver for now,
remove the namelist parameters for the Anderson solver from the
reference namelist.

* ice_dyn_vp: add article references

* ice_dyn_vp: remove unused subroutines

Remove subroutines 'stress_prime_vpOLD', 'matvecOLD' and 'precond_diag'.

'stress_prime_vpOLD' and 'matvecOLD' have been unused since the creation
of the 'matvec' subroutine.

'precond_diag' has been unused since we deactivated the legacy FGMRES
solver.

* ice_dyn_vp: remove unused local variables

* ice_dyn_vp: calc_bvec: remove unused arguments

This completes 56d55c7 (ice_dyn_vp: make 'calc_bvec' use already
computed 'vrel', 2019-06-10)

* ice_dyn_vp: anderson_solver: remove unused arguments

This completes 4616628 (ice_dyn_vp: remove legacy FGMRES solver,
2020-02-17).

* ice_dyn_vp: remove unused 'use'-imported variables in subroutines

* ice_dyn_vp: fix trailing whitespace errors

* ice_dyn_vp: uniformize indentation and whitespace

* ice_dyn_vp: put 'intent's on same lines as types

* ice_dyn_vp: use '==' instead of '.eq.'

* ice_dyn_vp: uniformize whitespace in subroutine arguments

* ice_dyn_vp: calc_bvec: remove unneeded arguments

* ice_dyn_vp: clean up and clarify code comments

* ice_dyn_vp: matvec: remove unneeded local variable

* ice_dyn_vp: rename 'stPrtmp' to 'stress_Pr' and 'Dstrtmp' to 'diag_rheo'

Also rename 'Dstr' to 'Drheo' in formDiag_step[12]

* ice_dyn_vp: rename 'calc_zeta_Pr' to 'calc_zeta_dPr'

The subroutine 'calc_zeta_Pr' computes the derivatives of the
replacement pressure, so rename it to 'calc_zeta_dPr' to bring its name
in line with its computation.

* ice_dyn_vp: rename 'ww[xy]' to 'orig_basis_[xy]'

The arrays 'ww_[xy]' are used to keep the values of the preconditioned
vectors, in order to update the solution at the end of the iteration.
As such, rename them to 'orig_basis_[xy]', since they form a basis of
the solution space and this name describes their purpose better.

* ice_dyn_vp: [fp]gmres: removed unused 'r0' variable and 'conv' argument

Both of these are either leftovers of the conversion from the legacy
FGMRES implementation or of the GEM implementation, but are unused in
our implementation. Remove them

Move the declaration of the BLAS functions in their own type declaration
to avoid having to deal with the trailing `, &`.

* ice_dyn_vp: make 'maxits_nonlin' default to 4

Let's use a small number of nonlinear iterations by default, since
other models with VP solvers are often used with a small number of
iterations.

Add an option file 'set_nml.nonlin5000' to set 'maxits_nonlin' to 5000,
effectively letting the VP solver iterate until 'reltol_nonlin' is
reached.

* ice_dyn_vp: rename 'im_{fgmres,pgmres,andacc}' to 'dim_*'

Since these three parameters denote dimensions (of the FGMRES, PGMRES
and Anderson acceleration solution spaces), name them as such.

* tests: add 'dynpicard' test to base_suite

Add a test using the VP implicit solver to the base_suite.
Let's use the gx3 grid because it is less computationally intensive,
and use a 4x1 decomposition to test the solver in parallel.

* dynamics: remove proprietary vector directives

* ice_init: validate implicit solver arguments only if solver is active

'algo_nonlin', 'precond' and 'ortho_type' are only active if 'kdyn ==
3', so wrap the code validating the values of these flags in an if
block.

* ice_dyn_vp: rename 'imp_solver' to 'implicit_solver'

Use a more explicit subroutine name, which additionnally does not have
any negative connotations.

* doc: add caveat for VP solver (tx1 grid, threading)

* ice_dyn_vp: rename 'CICE_USE_LAPACK' macro to 'USE_LAPACK'

Bring the name of the new macro 'CICE_USE_LAPACK' more in line with the
changes to the CPP macros in 819eedd (Update CPP implementation (#490),
2020-07-31) by renaming it to 'USE_LAPACK'.

* comm: rename 'global_sums' to 'global_allreduce_sum'

Make the purpose of the 'global_sums' interface, i.e. reducing a
distributed variable to a variable of the same shape (MPI_ALLREDUCE)
clearer by renaming it to 'global_allreduce_sum'.

This makes it clearer that in contrast to the 'global_sum' interface,
the resulting variable is of the same shape as the distributed variable.

* dynamics: remove 'ice_HaloUpdate_vel' and introduce '(un)stack_velocity_field'

The 'ice_HaloUpdate_vel' subroutine feels out of place in
'ice_dyn_shared', and moving it to 'ice_boundary' introduces other
problems, including circular dependencies.

Remove it, and introduce two simple subroutines, 'stack_velocity_field' and
'unstack_velocity_field', that are responsible to load the 'uvel' and
'vvel' arrays into the 'fld2' array used for the halo updates.

Use these new subroutines in ice_dyn_{evp,eap,vp}.

* dynamics: rename 'init_evp' to 'init_dyn'

The 'init_evp' subroutine initializes arrays that are used in the EVP,
EAP or VP dynamics. Rename it to 'init_dyn' to emphasize that it is not
specific to EVP.

* drivers: call 'init_dyn' unconditionnally

'init_dyn' is called in both 'init_eap' and 'init_vp', so it makes more
sense for this subroutine to be called unconditionnally in the drivers,
and then call 'init_eap' or 'init_vp' if the EAP or VP dynamics are
chosen.

Co-authored-by: JFLemieux73 <jean-francois.lemieux@canada.ca>
@apcraig apcraig deleted the cpp1 branch August 17, 2022 21:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants