Commit 145b67f

Initial commit of directory comparison tools (#934)
Adds a new `test/` directory to the top level. Inside are miscellaneous scripts I have used to test bitwise identicality of experiments.

Main scripts:
- `diff_ROTDIR.sh`: Compares two output directories
- `diff_UFS_rundir.sh`: Compares two UFS run directories

Other scripts and files are helpers to these two main scripts.

May eventually form the starting point of a global workflow regression test (#267)

Refs #267
1 parent e480093 commit 145b67f

7 files changed

+672
-0
lines changed


test/README.md

Lines changed: 115 additions & 0 deletions
@@ -0,0 +1,115 @@
# Global workflow comparison tools
A collection of tools to compare two different global workflow experiments for bitwise identicality.

## Disclaimer

These tools are still a work in progress. Use at your own risk. There is no guarantee every relevant file will be compared (but feel free to make a pull request adding more).

# Usage

## Quick start
### To compare two UFS run directories
```
./diff_UFS_rundir.sh dirA dirB
```
Where `dirA` and `dirB` are the two UFS run directories.


### To compare two ROTDIRs
```
./diff_ROTDIR.sh dirA dirB
```
Where `dirA` and `dirB` are the two cycle directories (`.../gfs.YYYYMMDD/HH/`).

OR

```
./diff_ROTDIR.sh rotdir cdate expA expB
```

Where:
- `rotdir` is the root of your rotdirs (the portion of path the experiments share)
- `cdate` is the datetime of the cycle in YYYYMMDDHH format
- `expA` and `expB` are the experiment names ($PSLOT) of each experiment
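
To illustrate how the four-argument form relates to the two-argument form, here is a minimal sketch that builds one cycle directory path from `rotdir`, `cdate`, and an experiment name. The function name `cycle_dir` and the example values are hypothetical and not part of these tools; the `<rotdir>/<exp>/gfs.YYYYMMDD/HH` layout is assumed from the cycle directory pattern shown above.

```shell
#!/usr/bin/env bash

# Hypothetical helper, not part of these tools: print the cycle directory
# for one experiment, given rotdir, cdate (YYYYMMDDHH), and experiment name.
cycle_dir() {
    local rotdir=$1
    local cdate=$2
    local exp=$3
    local yyyymmdd cyc

    # Reject anything that is not exactly ten digits
    case "${cdate}" in
        *[!0-9]*) echo "cdate must be YYYYMMDDHH, got '${cdate}'" >&2; return 1 ;;
    esac
    if [ "${#cdate}" -ne 10 ]; then
        echo "cdate must be YYYYMMDDHH, got '${cdate}'" >&2
        return 1
    fi

    yyyymmdd=$(printf '%s' "${cdate}" | cut -c1-8)
    cyc=$(printf '%s' "${cdate}" | cut -c9-10)
    printf '%s\n' "${rotdir}/${exp}/gfs.${yyyymmdd}/${cyc}"
}

# Example with placeholder values: prints the two directories that the
# four-argument form would be expected to compare.
cycle_dir /path/to/rotdirs 2021032306 expA
cycle_dir /path/to/rotdirs 2021032306 expB
```

Checking the length of `cdate` up front is a design choice of this sketch; a nine-digit date would otherwise silently produce a wrong path.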

## Description

There are currently two tools included in this package:
* `diff_UFS_rundir.sh` will compare two UFS run directories (you must have retained them by setting `KEEPDATA` to `YES` in config.base)
* `diff_ROTDIR.sh` will compare entire ROTDIRs

Both scripts work similarly. You will need two experiments to compare. Typically this means a "baseline" experiment using the current develop and an experiment using whatever feature you are working on. Experiments need to be for the same cycle and use all the same settings, otherwise there is no chance of them matching. Except for specific text files, file lists are constructed by globbing the first experiment directory, so files that exist only in the second experiment will be skipped even if they would otherwise be compared.
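
Because the file lists come from the first directory only, it can be useful to check separately for files that exist only in the second one. The following is a rough sketch of such a check, not something the tools do themselves; the function name `only_in_second` is hypothetical.

```shell
#!/usr/bin/env bash

# Hypothetical helper, not part of these tools: list files that exist under
# dirB but have no counterpart at the same relative path under dirA.
# dirA should be an absolute path or relative to the current directory.
# File names containing newlines are not handled.
only_in_second() {
    local dirA=$1
    local dirB=$2
    local rel
    ( cd "${dirB}" && find . -type f | sort ) | while IFS= read -r rel; do
        rel=${rel#./}
        [ -e "${dirA}/${rel}" ] || printf '%s\n' "${rel}"
    done
}
```

Calling `only_in_second dirA dirB` before running a comparison gives a list of relative paths that the comparison would never look at.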

There are three classes of files compared:
- Text files, by simple posix diff
- GRiB2 files, using correlation from `wgrib2`
- NetCDF files, using NetCDF Operators (nco)

Text and grib2 files are processed first and complete quickly. NetCDF processing is currently a lot slower.

Any variables listed in the coordinates.lst file will be ignored when comparing NetCDFs. This is because coordinate variables are not differenced, so they would show up as non-zero when iterating through the variables of the difference file.
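
As an illustration of the filtering idea only, the sketch below drops every name found in a coordinate list from a stream of variable names. It does not reproduce how the helper scripts actually do this; the function name `filter_coordinates` is hypothetical, and the list file is assumed to hold one variable name per line, as coordinates.lst does.

```shell
#!/usr/bin/env bash

# Hypothetical helper, not part of these tools: read variable names from
# stdin (one per line) and print those that are not listed in coord_file.
filter_coordinates() {
    local coord_file=$1
    if [ ! -r "${coord_file}" ]; then
        echo "cannot read coordinate list '${coord_file}'" >&2
        return 1
    fi
    # -F: fixed strings, -x: whole-line match, -v: invert, -f: patterns from file.
    # grep exits 1 when it prints nothing, which is not an error here.
    grep -v -x -F -f "${coord_file}" || [ $? -eq 1 ]
}
```

Whole-line matching (`-x`) matters here: a variable whose name merely contains `lat` or `time` is kept, and only exact coordinate names are dropped.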

## Output

Output will appear like this:
```
=== <filename> ===
<comparison info>

```

For text files, it will be the output of posix diff, which is just an empty string when identical:
```
...

=== field_table ===


=== input.nml ===
310,313c310,313
< FNGLAC = '/scratch2/NCEPDEV/ensemble/save/Walter.Kolczynski/global-workflow/develop/fix/fix_am/global_glacier.2x2.grb'
< FNMXIC = '/scratch2/NCEPDEV/ensemble/save/Walter.Kolczynski/global-workflow/develop/fix/fix_am/global_maxice.2x2.grb'
< FNTSFC = '/scratch2/NCEPDEV/ensemble/save/Walter.Kolczynski/global-workflow/develop/fix/fix_am/RTGSST.1982.2012.monthly.clim.grb'
< FNSNOC = '/scratch2/NCEPDEV/ensemble/save/Walter.Kolczynski/global-workflow/develop/fix/fix_am/global_snoclim.1.875.grb'
---
> FNGLAC = '/scratch2/NCEPDEV/ensemble/save/Walter.Kolczynski/global-workflow/add_preamble/fix/fix_am/global_glacier.2x2.grb'
> FNMXIC = '/scratch2/NCEPDEV/ensemble/save/Walter.Kolczynski/global-workflow/add_preamble/fix/fix_am/global_maxice.2x2.grb'
> FNTSFC = '/scratch2/NCEPDEV/ensemble/save/Walter.Kolczynski/global-workflow/add_preamble/fix/fix_am/RTGSST.1982.2012.monthly.clim.grb'
> FNSNOC = '/scratch2/NCEPDEV/ensemble/save/Walter.Kolczynski/global-workflow/add_preamble/fix/fix_am/global_snoclim.1.875.grb'

...
```
(Text diffs have two extra blank lines to separate the output.)

Grib files will look like this if they are identical:
```
=== GFSFLX.GrbF00 ===
All fields are identical!
=== GFSFLX.GrbF03 ===
All fields are identical!
=== GFSFLX.GrbF06 ===
All fields are identical!
=== GFSFLX.GrbF09 ===
All fields are identical!
=== GFSFLX.GrbF12 ===
All fields are identical!

...

```

And NetCDFs will look like this:
```
=== atmf000.nc ===
0 differences found
=== atmf003.nc ===
0 differences found
=== atmf006.nc ===
0 differences found
=== atmf009.nc ===
0 differences found

...
```

If any variables in a grib or NetCDF do not match, they will be listed instead.
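
For long runs it may help to reduce the output to just the files that differ. The sketch below is one possible way to do that and is not part of these tools; the function name `summarize_diffs` is hypothetical. It assumes the output format shown above: a `=== <filename> ===` header per file, with an empty body, `All fields are identical!`, or `0 differences found` meaning the file matched.

```shell
#!/usr/bin/env bash

# Hypothetical helper, not part of these tools: read comparison output on
# stdin and print the name of every file whose section reports a difference.
summarize_diffs() {
    awk '
        # Print the current file name if any unexpected line was seen for it
        function report() { if (name != "" && differs) print name }

        /^=== .* ===$/ {
            report()
            # Strip the leading "=== " and trailing " ===" (4 characters each)
            name = substr($0, 5, length($0) - 8)
            differs = 0
            next
        }
        /^[[:space:]]*$/                  { next }  # blank separator lines
        $0 == "All fields are identical!" { next }  # GRiB match message
        $0 == "0 differences found"       { next }  # NetCDF match message
        { differs = 1 }                             # anything else is a difference
        END { report() }
    '
}
```

For example, saving a run with `./diff_ROTDIR.sh dirA dirB > compare.log` and then running `summarize_diffs < compare.log` would list only the files that need a closer look. If the match messages printed by the tools change, the two string comparisons in the sketch would need to change with them.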

test/coordinates.lst

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
grid_xt
grid_yt
lat
lon
pfull
phalf
time
time_iso

test/diff_ROTDIR.sh

Lines changed: 162 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,162 @@
#! /bin/env bash

#
# Differences relevant output files in two different experiment ROTDIRs.
# Text files are compared via posix diff. GRiB files are compared via
# correlation reported by wgrib2. NetCDF files are compared by using
# NetCDF operators to calculate a diff then make sure all non-coordinate
# variable differences are zero. File lists are created by globbing key
# directories under the first experiment given.
#
# Syntax:
#     diff_ROTDIR.sh [-c coord_file][-h] rotdir cdate expA expB
#
#   OR
#
#     diff_ROTDIR.sh [-c coord_file][-h] dirA dirB
#
# Arguments:
#     rotdir: root rotdir where ROTDIRS are held
#     cdate:  experiment date/cycle in YYYYMMDDHH format
#     expA, expB: experiment ids (PSLOT) to compare
#
#     dirA, dirB: full paths to the cycle directories to be compared
#                 (${rotdir}/${exp}/gfs.${YYYYMMDD}/${cyc})
#
# Options:
#     -c coord_file: file containing a list of coordinate variables
#     -h:            print usage message and exit
#

set -eu

usage() {
    #
    # Print usage statement (cat, not echo, is needed to print a here-document)
    #
    cat <<- 'EOF'
Differences relevant output files in two different experiment ROTDIRs.
Text files are compared via posix diff. GRiB files are compared via
correlation reported by wgrib2. NetCDF files are compared by using
NetCDF operators to calculate a diff then make sure all non-coordinate
variable differences are zero. File lists are created by globbing key
directories under the first experiment given.

Syntax:
    diff_ROTDIR.sh [-c coord_file][-h] rotdir cdate expA expB

  OR

    diff_ROTDIR.sh [-c coord_file][-h] dirA dirB

Arguments:
    rotdir: root rotdir where ROTDIRS are held
    cdate:  experiment date/cycle in YYYYMMDDHH format
    expA, expB: experiment ids (PSLOT) to compare

    dirA, dirB: full paths to the cycle directories to be compared
                (${rotdir}/${exp}/gfs.${YYYYMMDD}/${cyc})

Options:
    -c coord_file: file containing a list of coordinate variables
    -h:            print usage message and exit
EOF
}

while getopts ":c:h" option; do
    case "${option}" in
        c) coord_file=${OPTARG} ;;
        h) usage; exit 0 ;;
        :) echo "Option -${OPTARG} requires an argument"; exit 1 ;;
        *) echo "Unknown option -${OPTARG}"; exit 1 ;;
    esac
done
# Drop the parsed options so that $# counts only positional arguments
shift $((OPTIND - 1))

num_args=$#
case $num_args in
    2) # Direct directory paths
        dirA=$1
        dirB=$2
        ;;
    4) # Derive directory paths
        rotdir=$1
        date=$2
        expA=$3
        expB=$4

        YYYYMMDD=$(echo "$date" | cut -c1-8)
        cyc=$(echo "$date" | cut -c9-10)
        dirA="$rotdir/$expA/gfs.${YYYYMMDD}/${cyc}"
        dirB="$rotdir/$expB/gfs.${YYYYMMDD}/${cyc}"
        ;;
    *) # Invalid number of arguments
        echo "${num_args} is not a valid number of arguments, use 2 or 4"
        usage
        exit 1
        ;;
esac

temp_file=".diff.nc"

# Contains a bunch of NetCDF Operator shortcuts (will load nco module)
source ./netcdf_op_functions.sh
source ./test_utils.sh

coord_file="${coord_file:-./coordinates.lst}"

## Text files
files=""
files="${files} atmos/input.nml" # This file will be different because of the fix paths
files="${files} $(basename_list 'atmos/' "$dirA/atmos/storms.*" "$dirA/atmos/trak.*")"
if [[ -d $dirA/ice ]]; then
    files="${files} ice/ice_in"
fi
if [[ -d $dirA/ocean ]]; then
    files="${files} ocean/MOM_input"
fi
# if [[ -d $dirA/wave ]]; then
#     files="${files} $(basename_list 'wave/station/' "$dirA/wave/station/*bull_tar")"
# fi

for file in $files; do
    echo "=== ${file} ==="
    fileA="$dirA/$file"
    fileB="$dirB/$file"
    diff $fileA $fileB || :
done

## GRiB files

module load wgrib2/2.0.8

files=""
files="${files} $(basename_list 'atmos/' $dirA/atmos/*grb2* $dirA/atmos/*.flux.*)"
if [[ -d $dirA/wave ]]; then
    files="${files} $(basename_list 'wave/gridded/' $dirA/wave/gridded/*.grib2)"
fi
if [[ -d $dirA/ocean ]]; then
    files="${files} $(basename_list 'ocean/' $dirA/ocean/*grb2)"
fi

for file in $files; do
    echo "=== ${file} ==="
    fileA="$dirA/$file"
    fileB="$dirB/$file"
    ./diff_grib_files.py $fileA $fileB
done

## NetCDF Files
files=""
files="${files} $(basename_list 'atmos/' $dirA/atmos/*.nc)"
if [[ -d $dirA/ice ]]; then
    files="${files} $(basename_list 'ice/' $dirA/ice/*.nc)"
fi
if [[ -d $dirA/ocean ]]; then
    files="${files} $(basename_list 'ocean/' $dirA/ocean/*.nc)"
fi

for file in $files; do
    echo "=== ${file} ==="
    fileA="$dirA/$file"
    fileB="$dirB/$file"
    nccmp -q $fileA $fileB $coord_file
done

test/diff_UFS_rundir.sh

Lines changed: 110 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,110 @@
#! /bin/env bash

#
# Differences relevant output files in two UFS model directories. GRiB files
# are compared via correlation reported by wgrib2. NetCDF files are compared
# by using NetCDF operators to calculate a diff then make sure all non-
# coordinate variable differences are zero.
#
# Syntax:
#     diff_UFS_rundir.sh [-c coord_file][-h] dirA dirB
#
# Arguments:
#     dirA, dirB: full paths to the UFS run directories to be compared
#
# Options:
#     -c coord_file: file containing a list of coordinate variables
#     -h:            print usage message and exit
#

set -eu

usage() {
    #
    # Print usage statement (cat, not echo, is needed to print a here-document)
    #
    cat <<- 'EOF'
Differences relevant output files in two UFS model directories. GRiB files
are compared via correlation reported by wgrib2. NetCDF files are compared
by using NetCDF operators to calculate a diff then make sure all non-
coordinate variable differences are zero.

Syntax:
    diff_UFS_rundir.sh [-c coord_file][-h] dirA dirB

Arguments:
    dirA, dirB: full paths to the UFS run directories to be compared

Options:
    -c coord_file: file containing a list of coordinate variables
    -h:            print usage message and exit
EOF
}

while getopts ":c:h" option; do
    case "${option}" in
        c) coord_file=${OPTARG} ;;
        h) usage; exit 0 ;;
        :) echo "Option -${OPTARG} requires an argument"; exit 1 ;;
        *) echo "Unknown option -${OPTARG}"; exit 1 ;;
    esac
done
# Drop the parsed options so that $# counts only positional arguments
shift $((OPTIND - 1))

num_args=$#
case $num_args in
    2) # Direct directory paths
        dirA=$1
        dirB=$2
        ;;
    *) # Invalid number of arguments
        echo "${num_args} is not a valid number of arguments, use 2"
        usage
        exit 1
        ;;
esac

source ./netcdf_op_functions.sh
source ./test_utils.sh

temp_file=".diff.nc"
coord_file="${coord_file:-./coordinates.lst}"

# Input files
files="data_table diag_table fd_nems.yaml field_table ice_in input.nml med_modelio.nml \
    model_configure nems.configure pio_in ww3_multi.inp ww3_shel.inp"

for file in $files; do
    echo "=== ${file} ==="
    fileA="$dirA/$file"
    fileB="$dirB/$file"
    if [[ -f "$fileA" ]]; then
        diff $fileA $fileB || :
    else
        # File absent from the first directory: print the two blank separator lines only
        echo; echo
    fi
done

# GRiB files
files="$(basename_list '' $dirA/GFSFLX.Grb*)"

module load wgrib2/2.0.8

for file in $files; do
    echo "=== ${file} ==="
    fileA="$dirA/$file"
    fileB="$dirB/$file"
    ./diff_grib_files.py $fileA $fileB
done

# NetCDF Files
files=""
files="${files} $(basename_list '' $dirA/atmf*.nc $dirA/sfcf*.nc)"
if [[ -d "$dirA/history" ]]; then
    files="$(basename_list 'history/' $dirA/history/*.nc)"
fi

for file in $files; do
    echo "=== ${file} ==="
    fileA="$dirA/$file"
    fileB="$dirB/$file"
    nccmp -q $fileA $fileB $coord_file
done
