Add/fix build capability for Gaea-C5, Gaea-C6, and container #800

Merged
merged 13 commits into from
Nov 12, 2024
update params for rrfs and add FI_VERBS_PREFER_XRC=0
DavidBurrows-NCO committed Nov 7, 2024
commit d11239f3aea2d02552ef307067e17fef87298136
28 changes: 14 additions & 14 deletions regression/regression_param.sh
@@ -112,11 +112,11 @@ case $regtest in
topts[1]="0:15:00" ; popts[1]="5/4/" ; ropts[1]="/1"
topts[2]="0:15:00" ; popts[2]="10/4/" ; ropts[2]="/1"
elif [[ "$machine" = "gaeac5" ]]; then
-topts[1]="0:15:00" ; popts[1]="64/1/" ; ropts[1]="/1"
-topts[2]="0:15:00" ; popts[2]="128/2/" ; ropts[2]="/1"
+topts[1]="0:15:00" ; popts[1]="40/3/" ; ropts[1]="/1"
+topts[2]="0:15:00" ; popts[2]="40/5/" ; ropts[2]="/1"
elif [[ "$machine" = "gaeac6" ]]; then
-topts[1]="0:60:00" ; popts[1]="64/1/" ; ropts[1]="/1"
-topts[2]="0:60:00" ; popts[2]="128/2/" ; ropts[2]="/1"
+topts[1]="0:60:00" ; popts[1]="40/3/" ; ropts[1]="/1"
+topts[2]="0:60:00" ; popts[2]="40/5/" ; ropts[2]="/1"
elif [[ "$machine" = "wcoss2" || "$machine" = "acorn" ]]; then
topts[1]="0:15:00" ; popts[1]="64/1/" ; ropts[1]="/1"
topts[2]="0:15:00" ; popts[2]="128/2/" ; ropts[2]="/1"
@@ -145,8 +145,8 @@ case $regtest in
topts[1]="0:15:00" ; popts[1]="5/4/" ; ropts[1]="/1"
topts[2]="0:15:00" ; popts[2]="10/4/" ; ropts[2]="/1"
elif [[ "$machine" = "gaeac5" ]]; then
-topts[1]="0:15:00" ; popts[1]="64/1/" ; ropts[1]="/1"
-topts[2]="0:15:00" ; popts[2]="128/2/" ; ropts[2]="/1"
+topts[1]="0:15:00" ; popts[1]="32/2/" ; ropts[1]="/1"
+topts[2]="0:15:00" ; popts[2]="64/4/" ; ropts[2]="/1"
elif [[ "$machine" = "gaeac6" ]]; then
topts[1]="0:15:00" ; popts[1]="64/1/" ; ropts[1]="/1"
topts[2]="0:15:00" ; popts[2]="128/2/" ; ropts[2]="/1"
@@ -177,8 +177,8 @@ case $regtest in
topts[1]="0:15:00" ; popts[1]="5/4/" ; ropts[1]="/1"
topts[2]="0:15:00" ; popts[2]="10/4/" ; ropts[2]="/1"
elif [[ "$machine" = "gaeac5" ]]; then
-topts[1]="0:15:00" ; popts[1]="64/1/" ; ropts[1]="/1"
-topts[2]="0:15:00" ; popts[2]="128/2/" ; ropts[2]="/1"
+topts[1]="0:15:00" ; popts[1]="32/2/" ; ropts[1]="/1"
+topts[2]="0:15:00" ; popts[2]="64/4/" ; ropts[2]="/1"
elif [[ "$machine" = "gaeac6" ]]; then
topts[1]="0:15:00" ; popts[1]="64/1/" ; ropts[1]="/1"
topts[2]="0:15:00" ; popts[2]="128/2/" ; ropts[2]="/1"
@@ -210,11 +210,11 @@ case $regtest in
topts[1]="0:15:00" ; popts[1]="4/4/" ; ropts[1]="/1"
topts[2]="0:15:00" ; popts[2]="6/6/" ; ropts[2]="/1"
elif [[ "$machine" = "gaeac5" ]]; then
-topts[1]="0:15:00" ; popts[1]="28/1/" ; ropts[1]="/1"
-topts[2]="0:15:00" ; popts[2]="28/2/" ; ropts[2]="/1"
+topts[1]="0:15:00" ; popts[1]="40/2/" ; ropts[1]="/1"
+topts[2]="0:15:00" ; popts[2]="40/4/" ; ropts[2]="/1"
elif [[ "$machine" = "gaeac6" ]]; then
-topts[1]="0:15:00" ; popts[1]="28/1/" ; ropts[1]="/1"
-topts[2]="0:15:00" ; popts[2]="28/2/" ; ropts[2]="/1"
+topts[1]="0:15:00" ; popts[1]="40/2/" ; ropts[1]="/1"
+topts[2]="0:15:00" ; popts[2]="40/4/" ; ropts[2]="/1"
elif [[ "$machine" = "wcoss2" || "$machine" = "acorn" ]]; then
topts[1]="0:15:00" ; popts[1]="64/1/" ; ropts[1]="/1"
topts[2]="0:15:00" ; popts[2]="64/2/" ; ropts[2]="/1"
@@ -276,8 +276,8 @@ case $regtest in
topts[1]="0:10:00" ; popts[1]="12/3/" ; ropts[1]="/1"
topts[2]="0:10:00" ; popts[2]="12/5/" ; ropts[2]="/2"
elif [[ "$machine" = "gaeac5" ]]; then
-topts[1]="0:10:00" ; popts[1]="16/2/" ; ropts[1]="/1"
-topts[2]="0:10:00" ; popts[2]="16/4/" ; ropts[2]="/2"
+topts[1]="0:10:00" ; popts[1]="12/3/" ; ropts[1]="/1"
+topts[2]="0:10:00" ; popts[2]="12/5/" ; ropts[2]="/2"
elif [[ "$machine" = "gaeac6" ]]; then
topts[1]="0:10:00" ; popts[1]="16/2/" ; ropts[1]="/1"
topts[2]="0:10:00" ; popts[2]="16/4/" ; ropts[2]="/2"
1 change: 1 addition & 0 deletions ush/sub_gaeac5
@@ -158,6 +158,7 @@ sbatch=${sbatch:-sbatch}
ofile=$DATA/subout$$
>$ofile
chmod 777 $ofile
+export FI_VERBS_PREFER_XRC=0
Contributor

Does this setting resolve what appears to be mpi_finalize problems on C5?

Contributor Author

> Does this setting resolve what appears to be mpi_finalize problems on C5?

It appears so. Here is the notice from Seth Underwood with Gaea C5: "After the C5 update, users reported that some jobs failed during the MPI_Finalize call. We have alerted ORNL and HPE. HPE has suggested setting the environment variable FI_VERBS_PREFER_XRC=0 in the run script (setenv FI_VERBS_PREFER_XRC 0, for csh; export FI_VERBS_PREFER_XRC=0). This has resolved the error in our tests. Please add this variable to your run script(s) if you also hit this error. Please note that we do not see any issues preemptively setting this environment variable."

Now that I think the MPI_Finalize issue is resolved, I am going to adjust the resources and test a little more. I'll let you know when I have my final changes in place for you to look over.
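
For reference, the workaround quoted above can be sketched in an sh/bash run script as follows. This is a minimal sketch, not the code in `ush/sub_gaeac5`: the `${VAR:-default}` form, which keeps a value the caller has already set, is an assumption of this example. The commit itself uses an unconditional `export FI_VERBS_PREFER_XRC=0`.

```shell
#!/bin/sh
# Sketch only: set the workaround before the job is submitted or launched.
# Keeping a caller-supplied value is an assumption of this example; the
# commit uses a plain `export FI_VERBS_PREFER_XRC=0`.
export FI_VERBS_PREFER_XRC="${FI_VERBS_PREFER_XRC:-0}"

# An exported variable is inherited by child processes, so anything started
# from this script sees the same value.
sh -c 'echo "FI_VERBS_PREFER_XRC=$FI_VERBS_PREFER_XRC"'
```

Placing the export ahead of the `$sbatch $cfile` call, as the commit does, relies on the submit command passing the caller's environment to the job. That is Slurm's default behavior, but a site configuration or an explicit `--export` option could change it.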

$sbatch $cfile >$ofile
rc=$?
cat $ofile