-
Notifications
You must be signed in to change notification settings - Fork 132
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Macros fixes and diag_type fix #396
Conversation
I will post test results when complete. I have already run the suite a few times on cheyenne, just doing a final test. I expect everything to pass but answers to change for threaded cases. Most machine unaffected. I will also merge this on Friday afternoon. This is a significant bug that we need to fix asap, especially with the user meeting coming up and needing to work on cheyenne. I want this in the weekend testing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's also add the documentation for ICE_RUNLENGTH to user_guide/ug_case_settings.rst
The default is -1, which isn't shown in the table, but is 15 minutes. The value of zero should be 29 minutes. (see cice.batch.csh)
I think all the ICE_IOTYPE stuff is accurate in the documentation now with these fixes.
Test results are in |
I just updated the documentation. I think it's as suggested. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All changes look good to me, but I think we could take the opportunity to clean up the unneeded logic in the Macros files I mentioned (that's my error: the 4 machines are our ECCC machines and when I created the Macros I didn't understand the build process as well as I do now).
Regarding #344 , I did some tests and the following change to cice.run.setup.csh seem to correctly grep the diag_file :
diff --git i/configuration/scripts/cice.run.setup.csh w/configuration/scripts/cice.run.setup.csh
index 8030324..80863c7 100755
--- i/configuration/scripts/cice.run.setup.csh
+++ w/configuration/scripts/cice.run.setup.csh
@@ -47,6 +47,11 @@ setenv OMP_NUM_THREADS ${nthrds}
cp -f \${ICE_CASEDIR}/ice_in \${ICE_RUNDIR}
set diagtype = \`grep -i diag_type \${ICE_CASEDIR}/ice_in | grep -i stdout | wc -l\`
+set diagfile = \`grep -i diag_file \${ICE_CASEDIR}/ice_in | sed -e "s/.* = '\(.*\)'/\1/"\`
+
+# Remove any existing diag_file from a previous run
+if ( \${diagtype} == 0) then
+ rm -f \${diagfile}
+endif
echo " "
echo "CICE rundir is \${ICE_RUNDIR}"
@@ -76,17 +81,16 @@ if !(-d \${ICE_LOGDIR}) mkdir -p \${ICE_LOGDIR}
cp -p \${ICE_RUNLOG_FILE} \${ICE_LOGDIR}
if ( \${diagtype} > 0) then
- grep ' CICE COMPLETED SUCCESSFULLY' \${ICE_RUNLOG_FILE}
- if ( \$status != 0 ) then
- echo "CICE run did not complete - see \${ICE_LOGDIR}/\${ICE_RUNLOG_FILE}"
- echo "\`date\` \${0}: \${ICE_CASENAME} run did NOT complete \${ICE_RUNLOG_FILE}" >> \${ICE_CASEDIR}/README.case
- exit -1
- endif
+ set success_file = \${ICE_RUNLOG_FILE}
else
- echo "\`date\` \${0}: \${ICE_CASENAME} run completed "
- echo "CICE NOT run with diag_type='stdout', run status unknown"
- echo "\`date\` \${0}: \${ICE_CASENAME} run completed " >> \${ICE_CASEDIR}/README.case
- echo "\`date\` \${0}: \${ICE_CASENAME} NOT run with diag_type='stdout', run status unknown" >> \${ICE_CASEDIR}/README.case
+ set success_file = \${diagfile}
+endif
+
+grep ' CICE COMPLETED SUCCESSFULLY' \${success_file}
+
+if ( \$status != 0 ) then
+ echo "CICE run did not complete - see \${ICE_LOGDIR}/\${ICE_RUNLOG_FILE}"
+ echo "\`date\` \${0}: \${ICE_CASENAME} run did NOT complete \${ICE_RUNLOG_FILE}" >> \${ICE_CASEDIR}/README.case
exit -1
endif
but I don't feel strongly about this, as long as the path to the log file is easily copy-paste-able.
ifeq ($(ICE_IOTYPE), netcdf) | ||
CPPDEFS := $(CPPDEFS) -Dncdf | ||
endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just realized this conditional can simply be deleted as -Dncdf
is set in cice.build !
|
||
CPPDEFS := $(CPPDEFS) -Dncdf |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just realized these lines can simply be deleted as -Dncdf
is set in cice.build !
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CICE/configuration/scripts/cice.build
Lines 120 to 128 in b5134ad
if (${ICE_IOTYPE} == 'netcdf') then | |
set IODIR = io_netcdf | |
setenv ICE_CPPDEFS "${ICE_CPPDEFS} -Dncdf" | |
else if (${ICE_IOTYPE} == 'pio') then | |
set IODIR = io_pio | |
setenv ICE_CPPDEFS "${ICE_CPPDEFS} -Dncdf" | |
else | |
set IODIR = io_binary | |
endif |
ifeq ($(ICE_IOTYPE), netcdf) | ||
CPPDEFS := $(CPPDEFS) -Dncdf | ||
endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same thing, these lines can be deleted
|
||
CPPDEFS := $(CPPDEFS) -Dncdf |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same thing, these lines can be deleted
|
||
CPPDEFS := $(CPPDEFS) -Dncdf |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same thing, these lines can be deleted
ifeq ($(ICE_IOTYPE), netcdf) | ||
CPPDEFS := $(CPPDEFS) -Dncdf | ||
endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same thing, these lines can be deleted
|
||
CPPDEFS := $(CPPDEFS) -Dncdf |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same thing, these lines can be deleted
ifeq ($(ICE_IOTYPE), netcdf) | ||
CPPDEFS := $(CPPDEFS) -Dncdf | ||
endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same thing, these lines can be deleted
endif | ||
else | ||
echo "\`date\` \${0}: \${ICE_CASENAME} run completed " | ||
echo "CICE NOT run with diag_type='stdout', run status unknown" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
echo "CICE NOT run with diag_type='stdout', run status unknown" | |
echo "CICE NOT run with diag_type='stdout', run status unknown - see \${ICE_LOGDIR}/\${ICE_RUNLOG_FILE}" |
I think in that case it would be nice to also output the path to the log file for easy (<middle click path> + vi <middle click>
) access
Thanks @phil-blain, several good ideas there. I'll try to clean up some of that. One thing is that if the diag_type is not set to stdout, it really needs to return a negative exit code. We could grab the filename from ice_in and check whether the run successfully completed, but all the rest of the stuff that is part of the test script cannot work unless diag_type is stdout. That could be fixed too, but that would be a bunch more work. Plus I don't see the need. All testing can be done with "stdout". I would also argue that using diag_type='file' is not a great idea in general. As you point out, it will continually overwrite (or append) the prior output file. The current implementation, leveraging a date stamp in the ice log file set by the scripts ensures each output file is a unique name which I think is a nice feature. Finally, I think I figured out the performance issue with threading on cheyenne. We need omplace on job launch. I'm testing that too. I should have some updated commits soon. |
Yes I agree with all these considerations. That's why I don't feel too strongly about the diag_file one; I only use it for cases and not test/suites. So I'm glad with the changes you're proposing in this PR, plus outputting the path to the log file in the diag_file case. |
Just pushed an update. cheyenne timing is much better now. https://github.com/CICE-Consortium/Test-Results/wiki/af3d93e40c.cheyenne.intel.20-01-17.183453. All tests pass and most are faster. Many regression comparisons fail (as expected). Also cleaned up the scripts a bit and update the run checking as guided above. Would like to merge the PR today if possible. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I can tell, this does not affect how a user interacts with the scripts. Have you tested that the success/fail changes all work as expected? If you are all happy with these changes, then I'm good with them too. Thanks for fixing this!
@eclare108213, usage doesn't change. I have tested the updated scripts with diag_type = stdout and diag_type = file and it seems to work as designed. |
Not sure why readthedocs is failing right now. Have tried to run it twice. It was fine before and my recent changes did not affect documentation. I think there is a problem with readthedocs, will have to see whether it's temporary or something that we'll need to fix. Here is the version from #2078b884 which was after my doc change, https://readthedocs.org/projects/cice-consortium-cice/builds/10275392/ |
Read the docs is working again. I am going to merge this PR now. |
These two files used outdated environment variables for threading ("compile_threaded") and PIO ("IO_TYPE"). Replace these with the correct variables, ICE_THREADED and ICE_IOTYPE. Mirrors CICE-Consortium/CICE#396
These two files used outdated environment variables for threading ("compile_threaded") and PIO ("IO_TYPE"). Replace these with the correct variables, ICE_THREADED and ICE_IOTYPE. Mirrors CICE-Consortium/CICE#396
These two files used outdated environment variables for threading ("compile_threaded") and PIO ("IO_TYPE"). Replace these with the correct variables, ICE_THREADED and ICE_IOTYPE. Mirrors CICE-Consortium/CICE#396
#296) * machines: add conda environment for macOS and Linux Closes #294 * Makefile: remove uneeded (and faulty) logic around CFLAGS_HOST The CFLAG_HOST variable was added in #257, and some Makefile logic was added to define the variable to be blank if the included Macros file does not define it. However, the Make syntax is wrong, as 'ifndef' takes a variable name and not a reference to a variable [1], so ifndef $(CFLAGS_HOST) should have been written ifndef CFLAGS_HOST The effect of this error is to invert the logic of the check (!) since if CFLAGS_HOST is defined, Make will check if a variable with name equal to whatever CFLAGS_HOST is defined to be exists, which will most probably be false, and the conditional will then evaluate to true and CFLAG_HOSTS will be redefined to be blank, defeating the whole purpose of the flag. Since there's no harm in Make referencing and empty variable, fix this by just removing the conditional check. [1] https://www.gnu.org/software/make/manual/html_node/Conditional-Syntax.html * doc: add documentation for conda environment * machines: fix Macros file for cheyenne and testmachine These two files used outdated environment variables for threading ("compile_threaded") and PIO ("IO_TYPE"). Replace these with the correct variables, ICE_THREADED and ICE_IOTYPE. Mirrors CICE-Consortium/CICE#396 * scripts: implement '-nomodules' logic for env files Some env files wrap the calls for loading the build and run environment in a conditional based on the argument `-nomodules`, so that scripts that source the env file just to get the variables defined in it do not run slower because they have to load the environment. This logic was copied from CICE but was not implemented in the Icepack scripts icepack.setup and setup_run_dirs.csh. Add the argument '-nomodules' so that the logic becomes effective. * doc: conda: add note about csh/tcsh On Ubuntu and its derivatives, the csh shell at /bin/csh, which is used in the shebangs of the Icepack scripts, is not compatible with conda. Mention that in the documentation and explain how to install tcsh as an alternative, and configure the system such that /bin/csh points to /bin/tcsh.
PR checklist
Fix Macros files, threading was not active on a few platforms including cheyenne.
Update run status output based on diag_type setting
apcraig
Full test suite on cheyenne, will post results when done
For threaded tests, answers change on some platforms, namely cheyenne
Surprised we didn't find this error sooner!
Changes answers for some threaded cases. Machines affected are
Other Macros changes don't affect answers.
Partly address CICE #344