update icepack and fix 2 failing tests #163
Conversation
Should the icepack submodule update be here?
Is there a way for the scripts to recognize when there aren't enough processors available for a given test, and gracefully skip it with a message to that effect?
My intent with this PR was mainly to update the icepack submodule, so that is correct. I also fixed a few other issues in CICE related to some test problems. Regarding skipping tests when there aren't enough resources: it's probably possible, but we'd have to provide the scripts with some information about how many resources each machine has. It wouldn't be too hard to implement, and the machine should also reject jobs that are too large automatically. Maybe we could add an issue to discuss this feature and how we might want it to behave.
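A minimal sketch of the kind of check being discussed, assuming the scripts had access to a per-machine processor limit and per-test PE counts (the names `MACHINE_MAX_PES`, the test list, and the output format are illustrative and not part of the existing CICE scripts; Python is used here only for illustration):

```python
# Illustrative only: skip tests that request more PEs than the machine
# provides, rather than letting the queue reject the job.

MACHINE_MAX_PES = 36  # assumed per-machine limit, e.g. read from a machine config

tests = [
    {"name": "smoke_gx3_8x2",    "pes": 16},
    {"name": "restart_gx3_40x4", "pes": 160},
]

for test in tests:
    if test["pes"] > MACHINE_MAX_PES:
        print(f"SKIP {test['name']}: requires {test['pes']} PEs, "
              f"machine provides at most {MACHINE_MAX_PES}")
        continue
    print(f"SUBMIT {test['name']} on {test['pes']} PEs")
```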
There's at least one file missing...
cice.setup: ERROR, /turquoise/usr/projects/climate/eclare/CICE.v6.0/github/CICE.cicetestC/configuration/scripts/options/set_[nml,env].iobinary not found
@eclare108213 thanks for catching that, I just added the missing files.
I ran the base_suite, which passed everything except for the 40-pe run, which I expected not to work because the queue I use won't allow that. It shows as pending in the test results, although the submission actually failed.
I didn't find a record of the failure in the base_suite output, though. I had to resubmit the job for just that case in order to get this info:
Maybe we could capture that information somehow? I don't know how important this is.
Let me think about this issue. If a case cannot run, I think it probably shows up as "grey", i.e. "unknown", in the results. Does it prevent report_results from running or create problems otherwise? We could try to have the submit script capture an error and report that to the test results. We could also do something like remove the larger jobs from the base_suite, so that suite would consist of only smaller and shorter jobs. We could then have another suite, bigjobs_suite, that would build and run some bigger jobs (more pes, longer runs, larger grids) and that might only be run on hardware that can handle it. If the case returns an "unknown" result and it doesn't get in the way of other features, then I think that's fine (maybe even correct). If that's not the case, then I'll look into fixing it.
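One way the submit script could capture a rejected submission, sketched here under the assumption that the batch submit command can be wrapped and a failure line appended to a results file (the function name, command handling, and file layout are hypothetical, not the existing CICE scripts):

```python
# Illustrative only: wrap the batch submission so that a rejected job is
# recorded in the results file instead of being left "pending"/"unknown".
import subprocess

def submit_case(case_dir, submit_cmd, results_file):
    result = subprocess.run(submit_cmd, cwd=case_dir,
                            capture_output=True, text=True)
    with open(results_file, "a") as f:
        if result.returncode != 0:
            # Report the submission error so report_results sees a failure.
            f.write(f"FAIL {case_dir} submit: {result.stderr.strip()}\n")
        else:
            f.write(f"PEND {case_dir} submitted\n")
```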
I think fixing the reporting/behavior of the unrunnable job should be a separate issue, maybe in a future PR.
That's fine, we can address this issue separately. Note that the pending test shows up as 'fail' on the test-results wiki.
Update icepack version to current master, fix 2 failing tests, add binary restart test, update gordon intel compiler version
Developer(s): tcraig
Please suggest code Pull Request reviewers in the column at right.
Are the code changes bit for bit, different at roundoff level, or more substantial? bit-for-bit
Is the documentation being updated with this PR? (Y/N) N
If not, does the documentation need to be updated separately at a later time? (Y/N) N
Other Relevant Details:
This change is bit-for-bit; see the full test results for conrad, hash 8b44ba1, here:
https://github.com/CICE-Consortium/Test-Results/wiki/cice_by_hash_forks
There are still failing tests, but the boxrestore and medium tests now run and validate. The gx1 tests need a mod in icepack, which will be coming soon. A binary restart test was added; the validation process for multiple binary restart files still needs to be updated. The gordon intel compiler version was updated to be consistent with conrad; the mismatch was causing some test failures.
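For the binary restart validation mentioned above, a comparison would need to loop over all of the restart files a case writes. A minimal sketch of such a check, assuming the files can be compared byte-for-byte and using an illustrative `iced.*` file pattern (this is not the actual CICE comparison logic):

```python
# Illustrative only: bit-for-bit check over a set of binary restart files.
import filecmp
import glob
import os

def compare_binary_restarts(base_dir, test_dir, pattern="iced.*"):
    """Return True if every binary restart file matches byte-for-byte."""
    all_match = True
    for base_file in sorted(glob.glob(os.path.join(base_dir, pattern))):
        test_file = os.path.join(test_dir, os.path.basename(base_file))
        if not os.path.exists(test_file):
            print(f"MISSING {test_file}")
            all_match = False
        elif not filecmp.cmp(base_file, test_file, shallow=False):
            print(f"DIFFER  {os.path.basename(base_file)}")
            all_match = False
    return all_match
```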