
June 2023 operating system update on Cannon

Bob Yantosca edited this page Jan 2, 2024 · 111 revisions

Overview

What happened?

The operating system on Cannon was replaced during the annual powerdown from June 5-8, 2023. As FAS Research Computing wrote:

As part of our June 5-8, 2023 MGHPCC Downtime, FASRC will be upgrading the cluster operating system from CentOS 7 to RockyLinux 8. Details as to why the transition is taking place are provided on the downtime page.

Why did this happen?

The CentOS 7 operating system was nearing the end of its support period (its end-of-life date was June 30, 2024). If you want to know the nitty-gritty details, see this post. (TL;DR: Red Hat, which was acquired by IBM, decided to stop supporting the open-source CentOS project.)

FASRC has chosen to install RockyLinux 8.7 as the successor operating system to CentOS. RockyLinux is a community-maintained rebuild of Red Hat Enterprise Linux (RHEL) that aims to be binary-compatible with the corresponding RHEL release, much as CentOS was.

How does the OS switch affect me?

This presentation is a good overview:

The key things that you should know are:

  1. All of the software packages (e.g. compilers, netCDF versions, MPI versions, etc.) that you have used under CentOS no longer work under RockyLinux.

  2. FASRC has built several software packages that we need for GEOS-Chem. The GEOS-Chem Support Team has also created additional software packages for GEOS-Chem with Spack.

  3. If you have built one or more Conda environments for Python packages, these will probably still work on RockyLinux if you use the mamba package manager. Typing module load python/3.10.9-fasrc01 will activate mamba.

  4. FASRC has also released a Singularity container with the prior CentOS 7 environment. This may be useful for you if you need to keep backwards compatibility with code that was compiled before the switchover.

Software information for RockyLinux

Environment files

Please see our Environment files for RockyLinux section.

FASRC core software packages

Please see our FASRC core software packages for RockyLinux section.

Locally-built software packages

Please see our Locally-built software packages for RockyLinux section.

Python environments

Please see our Python environments and Installing your own Python environments sections.

Important post-switchover information

Login nodes now have restricted CPU and memory

The login nodes now restrict you to using 1 core and 4GB of memory. For more intensive applications, you can open an interactive job or submit a batch job on a computational node.

SSH or DNS key errors

Once the operating system switchover to RockyLinux 8.7 has been completed, you may encounter warning or error messages such as:

WARNING: POSSIBLE DNS SPOOFING DETECTED!

and/or

The RSA host key for login.rc.fas.harvard.edu has changed

After the nodes are updated, the SSH key fingerprint of a node may change. This will, in turn, cause an error the next time you try to log into that node, because your locally stored key will no longer match.

To clear the errors, follow these instructions on the FASRC docs site.
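
The linked FASRC instructions remain the authoritative fix. As a rough illustration of what clearing a stale host key usually involves, the sketch below wraps the standard OpenSSH command ssh-keygen -R, which removes every stored key for a given host from a known_hosts file. The helper names (build_remove_command, remove_stale_keys) and the default host list are assumptions made for this example; they do not come from the FASRC documentation.

```python
import subprocess
from pathlib import Path

# Hostname taken from the error message above; add others as needed.
DEFAULT_HOSTS = ["login.rc.fas.harvard.edu"]


def build_remove_command(host, known_hosts=None):
    """Return the ssh-keygen command that removes stored keys for `host`.

    `ssh-keygen -R <host>` deletes every key belonging to <host>;
    `-f <file>` selects a known_hosts file other than the default.
    """
    cmd = ["ssh-keygen", "-R", host]
    if known_hosts is not None:
        cmd += ["-f", str(Path(known_hosts).expanduser())]
    return cmd


def remove_stale_keys(hosts=DEFAULT_HOSTS, known_hosts=None, dry_run=True):
    """Print (and, unless dry_run is True, execute) one command per host."""
    for host in hosts:
        cmd = build_remove_command(host, known_hosts)
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run by default: shows the commands without touching known_hosts.
    remove_stale_keys()
```

Run this on your own computer (where the stale key is stored), not on Cannon. The next login will then ask you to accept the new host key, so compare its fingerprint with the one published by FASRC before accepting.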

"Version change" warning

When you source the gcclassic.rocky+gnu*.env or gchp.rocky+gnu*.env environment files, you will see this warning printed to the screen.

The following have been reloaded with a version change:
  1) netcdf-c/X.Y.Z-fasrc01 => netcdf-c/X.Y.Z-gcc-A.B.C


The following have been reloaded with a version change:
  1) zlib/1.2.11-fasrc01 => zlib/X.Y.Z-gcc-A.B.C

where X.Y.Z and A.B.C are version numbers.

You may safely ignore this. The message arises because Spack assigns its own version number to externally-loaded modules (such as the FASRC netCDF and zlib modules). In reality, netcdf-c/X.Y.Z-gcc-A.B.C points to the FASRC module netcdf-c/X.Y.Z-fasrc01. But the Lmod module manager (which is used to load the modules into your Linux environment) treats these as two different modules and thus prints the warning messages.

You will need to rebuild executables

Make sure to recompile all of your source code with one of the new compilers on Cannon. This includes:

  • GEOS-Chem Classic
  • GCHP
  • IMI
  • CHEEREIO
  • KPP

Executable files that were built before the operating system switchover will not work on RockyLinux, as they rely on CentOS 7 system libraries that were once present on Cannon, but have since been removed.
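
One way to spot a stale executable is to look for shared libraries that the dynamic linker can no longer resolve. The sketch below runs the standard ldd tool on each path given on the command line and lists any entries that ldd reports as "not found". The function names and the script name used in the usage note are illustrative assumptions.

```python
import subprocess
import sys


def missing_libraries(ldd_output):
    """Return the library names that ldd reports as "not found"."""
    missing = []
    for line in ldd_output.splitlines():
        # Unresolved entries look like: "libfoo.so.1 => not found"
        if "=> not found" in line:
            missing.append(line.split("=>")[0].strip())
    return missing


def check_executable(path):
    """Run ldd on `path` and return its unresolved shared libraries."""
    result = subprocess.run(
        ["ldd", path], capture_output=True, text=True, check=False
    )
    return missing_libraries(result.stdout)


if __name__ == "__main__":
    # Each command-line argument is treated as the path to an executable.
    for exe in sys.argv[1:]:
        libs = check_executable(exe)
        status = ", ".join(libs) if libs else "no unresolved libraries"
        print(f"{exe}: {status}")
```

For example, if the script is saved as check_libs.py (a hypothetical name), running python check_libs.py ./gcclassic prints the unresolved libraries for that executable. An empty list does not guarantee that an old executable will run correctly, so rebuilding is still the safe choice.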

Setup for using R on RockyLinux

Here is some important information for those who need to use the R programming language on RockyLinux.

Steve Wofsy wrote:

There are some challenges in using R on the upcoming Rocky 8 installation. Since there is not a consistent set of the required modules for geospatial analysis (GDAL, GEOS, PROJ, UDUNITS,...), it is basically impossible to install packages terra, raster, etc on one's own. We succeeded in getting the spack package manager to work, but there were still challenges in specifying paths to the libraries. So we have adopted the geospatial docker container that is available from git. (Anyone wanting access to the spackR library collection can ask me for instructions on how to use it and not disable the bash shell...). Thanks to Jason Wells for making this happen.

I have attached the instructions for running R from the container. The docker has transparent access to the cannon file system. Also, there is a hack included to allow you to run jobs on the cluster from the container. The relevant files are in

/n/holylfs04/LABS/wofsy_lab/Everyone

and can be accessed by anyone on the cluster. The documentation for setting up this framework and using it is developing, but this text file makes it easy:

/n/holylfs04/LABS/wofsy_lab/Everyone/rocky_docker_run_jobs_export.txt

Validation

The GEOS-Chem Support Team has run several GEOS-Chem and GCHP simulations in order to determine:

  1. Can GEOS-Chem Classic and GCHP run with the new compilers/software modules for RockyLinux?
  2. Can GCHP run across multiple nodes with the new compilers/software modules for RockyLinux?
  3. What is the magnitude of the differences between GEOS-Chem/GCHP runs done on CentOS 7 vs. RockyLinux, with the same compiler?
  4. What is the magnitude of the differences between GEOS-Chem/GCHP runs done on RockyLinux, but with different compilers?

Setup

In order to answer the above questions, the GEOS-Chem Support Team has run the following 1-month fullchem_benchmark simulations on both the RockyLinux 8.7 and CentOS 7 operating systems, with GNU 10.2.0 and GNU 12.2.0 compilers. GEOS-Chem and GCHP were checked out at the tag 14.2.0-alpha.11.

Run  OS          Model       Res.  Compiler  Cores (Nodes)  Notes
1    CentOS      GC Classic  4x5   GNU 10    48 (1)         Represents "status quo" on Cannon
2    RockyLinux  GC Classic  4x5   GNU 10    48 (1)         To check against Run 1 for numerical drift
3    RockyLinux  GC Classic  4x5   GNU 12    48 (1)         To check against Runs 1 & 2 for numerical drift
4    CentOS      GCHP        c48   GNU 10    96 (2)         Represents "status quo" on Cannon
5    RockyLinux  GCHP        c48   GNU 10    96 (2)         To check against Run 4 for numerical drift
6    RockyLinux  GCHP        c48   GNU 12    96 (2)         To check against Runs 4 & 5 for numerical drift

Results

We summarize the results of our tests below:

Run 2 vs. Run 1

  • Brief description: Compares GC Classic runs on CentOS 7 and RockyLinux 8.7 using the same compiler version.

  • Results: Very small differences in most species, as shown by the mass table and selected printouts. OH increased by 0.11%. This is likely attributable to differences in the numerical libraries used by the compilers on CentOS and RockyLinux.

    [Figures, Run 2 vs. Run 1: total atmospheric mass of species; OH metrics; surface O3, CO, and OH (o3_surface_2v1, co_surface_2v1, oh_surface_2v1)]

  • Takeaways:

Run 3 vs. Run 1

  • Brief description: Compares GC Classic runs on CentOS 7 and RockyLinux 8.7 using different compiler versions.
  • Results: Because Run 3 is identical to Run 2, the comparison of Run 3 vs. Run 1 gives identical results to Run 2 vs. Run 1. This is evident in the table of OH metrics shown below:
    ###############################################################################
    ### OH Metrics
    ###
    ### Left column                     Right column:
    ### Ref = GCC_centos_gnu10 (Run 1)  Ref = GCC_centos_gnu10 (Run 1)
    ### Dev = GCC_rocky_gnu10  (Run 2)  Dev = GCC_rocky_gnu12  (Run 3)
    ###############################################################################
    
    ------------------------------------------------------------
    Global mass-weighted OH concentration [10^5 molec cm^-3]
    ------------------------------------------------------------
    Run 1    : 13.07896632314      Run 1    : 13.07896632314
    Run 2    : 13.09341931491      Run 3    : 13.09341931491
    Abs diff :  0.01445299178      Abs diff :  0.01445299178 
     %  diff :  0.110506            %  diff :  0.110506
    
    ------------------------------------------------------------
    CH3CCl3 (aka MCF) lifetime w/r/t tropospheric OH [years]
    ------------------------------------------------------------
    Run 1    :  4.753380           Run 1    :  4.753380
    Run 2    :  4.746909           Run 3    :  4.746909
    Abs diff : -0.006471           Abs diff : -0.006471
     %  diff : -0.136133            %  diff : -0.136133
    
    ------------------------------------------------------------
    CH4 lifetime w/r/t tropospheric OH [years]
    ------------------------------------------------------------
    Run 1    :  8.019566           Run 1    :  8.019566  
    Run 2    :  8.008516           Run 3    :  8.008516 
    Abs diff : -0.011050           Abs diff : -0.011050 
     %  diff : -0.137782            %  diff : -0.137782 
  • Takeaways: Compiling GEOS-Chem Classic with the GNU 10.2.0 compilers or the GNU 12.2.0 compilers yields identical numerical results. The differences in the OH metrics between CentOS 7 and RockyLinux are small (approx. 0.1%, or roughly 0.01 in absolute terms). This may be attributed to differences in the onboard libraries used by the compilers on CentOS and RockyLinux.
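
The "Abs diff" and "% diff" rows in the OH metrics table can be rechecked with a few lines of arithmetic. The sketch below assumes that the percent difference is defined as 100 * (Dev - Ref) / Ref, which reproduces the printed values to within rounding. The function name abs_and_pct_diff is illustrative, and the input numbers are copied from the Run 1 and Run 2 rows of the table.

```python
def abs_and_pct_diff(ref, dev):
    """Return (dev - ref, 100 * (dev - ref) / ref)."""
    abs_diff = dev - ref
    return abs_diff, 100.0 * abs_diff / ref


# (Ref, Dev) pairs copied from the Run 1 / Run 2 OH metrics table.
METRICS = {
    "Global mass-weighted OH [10^5 molec cm^-3]": (13.07896632314, 13.09341931491),
    "MCF lifetime [years]": (4.753380, 4.746909),
    "CH4 lifetime [years]": (8.019566, 8.008516),
}

if __name__ == "__main__":
    for name, (ref, dev) in METRICS.items():
        abs_diff, pct = abs_and_pct_diff(ref, dev)
        print(f"{name}: abs diff = {abs_diff:+.6f}, % diff = {pct:+.4f}")
```

Because the lifetimes are printed to only six decimal places, the recomputed percentages may differ from the table in the last digit or two.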

Run 3 vs. Run 2

  • Brief description: Compares GC Classic runs using 2 different compiler versions on RockyLinux.
  • Results: 100% identical
    ################################################################################
    ### Benchmark summary table                                                  ###
    ###                                                                          ###
    ### Ref = GCC_rocky_gnu10                                                    ###
    ### Dev = GCC_rocky_gnu12                                                    ###
    ################################################################################
    
    -------------------------------------------------------------------------------
    AerosolMass: GCC_rocky_gnu12 is identical to GCC_rocky_gnu10
    
    -------------------------------------------------------------------------------
    Aerosols: GCC_rocky_gnu12 is identical to GCC_rocky_gnu10
    
    -------------------------------------------------------------------------------
    Emissions: GCC_rocky_gnu12 is identical to GCC_rocky_gnu10
    
    -------------------------------------------------------------------------------
    JValues: GCC_rocky_gnu12 is identical to GCC_rocky_gnu10
    
    -------------------------------------------------------------------------------
    Metrics: GCC_rocky_gnu12 is identical to GCC_rocky_gnu10
    
    -------------------------------------------------------------------------------
    SpeciesConc: GCC_rocky_gnu12 is identical to GCC_rocky_gnu10
    
    -------------------------------------------------------------------------------
    StateMet: GCC_rocky_gnu12 is identical to GCC_rocky_gnu10
  • Takeaways: On RockyLinux 8.7, switching GNU compiler versions does not affect simulation results. This is likely because both the GNU 10.2.0 and GNU 12.2.0 compilers use the same underlying system libraries.

Run 5 vs. Run 4

  • Brief description: Compares GCHP runs on CentOS 7 and RockyLinux 8.7 using the same compiler.

  • Results: OH metrics change by approximately 0.001% or less in magnitude.

    ###############################################################################
    ### OH Metrics
    ### Ref = GCHP_centos_gnu10
    ### Dev = GCHP_rocky_gnu10
    ###############################################################################
    
    ------------------------------------------------------------
    Global mass-weighted OH concentration [10^5 molec cm^-3]
    ------------------------------------------------------------
    Ref      : 13.16007998549
    Dev      : 13.16022032913
    Abs diff :  0.00014034365
     %  diff :  0.001066
    
    ------------------------------------------------------------
    CH3CCl3 (aka MCF) lifetime w/r/t tropospheric OH [years]
    ------------------------------------------------------------
    Ref      :  4.717519
    Dev      :  4.717485
    Abs diff : -0.000034 
     %  diff : -0.000722
    
    ------------------------------------------------------------
    CH4 lifetime w/r/t tropospheric OH [years]
    ------------------------------------------------------------
    Ref      :  7.958901
    Dev      :  7.958847
    Abs diff : -0.000054
     %  diff : -0.000684
  • The Dev/Ref ratio for O3 varies by +/- 0.01% at the surface. Other species show similar behavior.

    [Figure: ox_surface]

  • The Dev/Ref ratio for O3 varies by +/- 0.003% in zonal means. Other species show similar behavior.

    [Figure: ox_zonal]

  • Takeaways: Using the same compiler on different operating systems causes very small differences in GCHP output. This is probably attributable to differences in software libraries between CentOS 7 and RockyLinux 8.7. One note: the ESMF version used on RockyLinux (8.1.1) is newer than that used on CentOS 7. However, the differences observed do not indicate any systematic bias.

Run 6 vs. Run 4

  • Brief description: Compares GCHP runs on CentOS 7 and RockyLinux 8.7 using different compiler versions.
  • Results: Because Run 6 is identical to Run 5, the comparison of Run 6 vs. Run 4 is also identical to Run 5 vs. Run 4.
    ###############################################################################
    ### OH Metrics
    ###
    ### Left column:                      Right column:
    ### Ref = GCHP_centos_gnu10 (Run 4)   Ref = GCHP_centos_gnu10 (Run 4)
    ### Dev = GCHP_rocky_gnu10  (Run 5)   Dev = GCHP_rocky_gnu12  (Run 6)
    ###############################################################################
    
    ------------------------------------------------------------
    Global mass-weighted OH concentration [10^5 molec cm^-3]
    ------------------------------------------------------------
    Run 4    : 13.16007998549    Run 4    : 13.16007998549
    Run 5    : 13.16022032913    Run 6    : 13.16022032913
    Abs diff :  0.00014034365    Abs diff :  0.00014034365
     %  diff :  0.001066          %  diff :  0.001066
    
    ------------------------------------------------------------
    CH3CCl3 (aka MCF) lifetime w/r/t tropospheric OH [years]
    ------------------------------------------------------------
    Run 4    :  4.717519         Run 4    :  4.717519
    Run 5    :  4.717485         Run 6    :  4.717485
    Abs diff : -0.000034         Abs diff : -0.000034
     %  diff : -0.000722          %  diff : -0.000722
    
    ------------------------------------------------------------
    CH4 lifetime w/r/t tropospheric OH [years]
    ------------------------------------------------------------
    Run 4    :  7.958901         Run 4    :  7.958901
    Run 5    :  7.958847         Run 6    :  7.958847
    Abs diff : -0.000054         Abs diff : -0.000054 
     %  diff : -0.000684          %  diff : -0.000684
  • Takeaways: Runs 5 and 6 are identical. OH changes by +0.001%, which is negligible. This is a much smaller change than was seen in GEOS-Chem Classic (approx. 0.1%).

Run 6 vs. Run 5

  • Brief description: Compares GCHP runs using 2 different compiler versions on RockyLinux.
  • Results: 100% identical
    ################################################################################
    ### Benchmark summary table                                                  ###
    ###                                                                          ###
    ### Ref = GCHP_rocky_gnu10                                                   ###
    ### Dev = GCHP_rocky_gnu12                                                   ###
    ################################################################################
    
    -------------------------------------------------------------------------------
    AerosolMass: GCHP_rocky_gnu12 is identical to GCHP_rocky_gnu10
    
    -------------------------------------------------------------------------------
    Aerosols: GCHP_rocky_gnu12 is identical to GCHP_rocky_gnu10
    
    -------------------------------------------------------------------------------
    Emissions: GCHP_rocky_gnu12 is identical to GCHP_rocky_gnu10
    
    -------------------------------------------------------------------------------
    JValues: GCHP_rocky_gnu12 is identical to GCHP_rocky_gnu10
    
    -------------------------------------------------------------------------------
    Metrics: GCHP_rocky_gnu12 is identical to GCHP_rocky_gnu10
    
    -------------------------------------------------------------------------------
    SpeciesConc: GCHP_rocky_gnu12 is identical to GCHP_rocky_gnu10
    
    -------------------------------------------------------------------------------
    StateMet: GCHP_rocky_gnu12 is identical to GCHP_rocky_gnu10
  • Takeaways: On RockyLinux 8.7, switching GNU compiler versions does not affect simulation results. This is likely because both the GNU 10.2.0 and GNU 12.2.0 compilers use the same underlying system libraries.