Automate update of consistency test baseline data #574

Closed
GeorgeGayno-NOAA opened this issue Aug 24, 2021 · 6 comments · Fixed by #603
Assignees
Labels
enhancement New feature or request

Comments

@GeorgeGayno-NOAA
Collaborator

Code updates often change results, which means the consistency test baseline data must be updated. Currently, updates are done manually on five machines. But as the number of weekly PRs grows, we need an automated way to do this.

@GeorgeGayno-NOAA
Collaborator Author

I think @junwang-noaa has done this for the ufs-weather-model.

GeorgeGayno-NOAA added the "enhancement" (New feature or request) label Aug 24, 2021
@kgerheiser
Contributor

Wouldn't be too difficult to port this from the weather model:

https://github.com/ufs-community/ufs-weather-model/tree/develop/tests/auto

You can add a "run test" label to a PR, and the script will then read it from GitHub and submit the job(s).

@BrianCurtis-NOAA

BrianCurtis-NOAA commented Sep 27, 2021

@GeorgeGayno-NOAA Are you using the same systems that we are with UFS? I haven't ported it to the WCOSS machines yet because Mars/Luna/Surge/Venus didn't have the pygithub package (I think WCOSS2 will have it).

The labels we've set up in GitHub have the form <machine>-<compiler>-<job>. If <machine> matches $HOSTNAME (using a wildcard search), the script starts <job> (from jobs/<job>.py) using the <compiler> you specify. The machines use a cron job to check for open GitHub PRs. It's easy to specify a different repo; the current one is just hard-coded.
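As a rough illustration of the label scheme described above, the following is a minimal sketch, not the actual rt_auto.py code. The names `JobLabel`, `parse_label`, and `label_matches_host` are hypothetical, and the sketch assumes that machine, compiler, and job names contain no hyphens of their own.

```python
from fnmatch import fnmatch
from typing import NamedTuple, Optional


class JobLabel(NamedTuple):
    """Pieces of a '<machine>-<compiler>-<job>' PR label (hypothetical type)."""
    machine: str
    compiler: str
    job: str


def parse_label(label: str) -> Optional[JobLabel]:
    """Split a label into its three parts; return None for any other label.

    Assumption: none of the three parts contains a hyphen.
    """
    parts = label.split("-")
    if len(parts) != 3 or not all(parts):
        return None
    return JobLabel(*parts)


def label_matches_host(label: JobLabel, hostname: str) -> bool:
    """Case-insensitive wildcard check: is the machine name part of the hostname?"""
    return fnmatch(hostname.lower(), f"*{label.machine.lower()}*")
```

A cron-driven script could call `parse_label` on every label of every open PR and act only on those for which `label_matches_host(label, os.environ["HOSTNAME"])` is true, so each machine picks up only its own jobs.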

There is still some manual work when a machine kills a job, a job times out, etc., but the scripts should post any issues that arise in the PR so someone can take a look. For UFS, the job submits the log from the machine as the signal that all went well, because it uses the log file to determine whether all jobs were successful or even one failed.

Hopefully, with a second group interested in a similar workflow, we can improve upon the current code.

@GeorgeGayno-NOAA
Collaborator Author

We run on WCOSS and Hera, Jet and Orion.

We already run our tests from cron, but I would be interested in how you do that. Also, how do you update the baseline data when updates change results?

@BrianCurtis-NOAA

BrianCurtis-NOAA commented Sep 27, 2021

> how do you update the baseline data when updates change results?

With the UFSWM, if we know updates change the results, we create baselines (BL job), which automatically calls the regression tests (RT job) after successful completion. The sequence is: new baselines are created --> script checks log files for errors --> baselines are moved to where we keep baselines --> RTs are run against the new baselines.
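The baseline sequence described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the UFSWM implementation: the function names, the `markers` strings used to detect a failed log, and the directory layout are all hypothetical.

```python
import shutil
from pathlib import Path
from typing import Callable, Sequence


def log_has_errors(log_file: Path,
                   markers: Sequence[str] = ("FAIL", "ERROR")) -> bool:
    """Return True if any failure marker appears in the log.

    The marker strings are an assumption; real logs may use other wording.
    """
    text = log_file.read_text(errors="replace")
    return any(marker in text for marker in markers)


def update_baseline(new_dir: Path, baseline_dir: Path, log_file: Path,
                    run_regression_tests: Callable[[Path], bool]) -> bool:
    """Promote freshly created baselines, then rerun the tests against them.

    Steps: check the log for errors -> move the new baselines into place ->
    run the regression tests against the new baselines.
    """
    if log_has_errors(log_file):
        return False  # leave the existing baselines untouched
    if baseline_dir.exists():
        shutil.rmtree(baseline_dir)  # drop the outdated baselines
    shutil.move(str(new_dir), str(baseline_dir))
    return run_regression_tests(baseline_dir)
```

Passing the regression-test step in as a callable keeps the sketch independent of any particular batch system; on a real machine it would presumably submit jobs and wait for them.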

@BrianCurtis-NOAA

BrianCurtis-NOAA commented Sep 27, 2021

Cronjob for UFSWM (Orion example):

# Automated Regression Testing
MAILTO="brian.curtis@noaa.gov"
*/15 * * * * cd /work/noaa/nems/emc.nemspara/autort/tests/auto && /bin/bash --login start_rt_auto.sh >> rt_auto.out 2>&1

The start_rt_auto.sh script loads PYTHONPATH into $PATH and calls rt_auto.py.
If the machine matches the label, rt_auto.py gets all the information for all jobs from GitHub and the machine, stores it in an object, and passes that object to the jobs.
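One way to picture the "store it in an object and pass it to the jobs" step is the sketch below. It is an assumption-laden illustration rather than the real rt_auto.py: the `JobContext` fields, the `JOBS` registry, and the `dispatch` helper are hypothetical, and a plain dictionary stands in for loading modules from a jobs/ directory.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class JobContext:
    """Everything a job needs, gathered once from GitHub and the machine.

    All field names here are hypothetical.
    """
    repo: str
    pr_number: int
    branch: str
    machine: str
    compiler: str
    workdir: str


# Registry mapping a job name (for example "RT" or "BL") to its handler.
JOBS: Dict[str, Callable[[JobContext], str]] = {}


def job(name: str):
    """Decorator that registers a handler under the given job name."""
    def register(func: Callable[[JobContext], str]):
        JOBS[name] = func
        return func
    return register


@job("RT")
def run_regression_tests(ctx: JobContext) -> str:
    # A real job would submit batch work; this only describes what would run.
    return f"RT for {ctx.repo}#{ctx.pr_number} on {ctx.machine} ({ctx.compiler})"


@job("BL")
def create_baseline(ctx: JobContext) -> str:
    return f"BL for {ctx.repo}#{ctx.pr_number} on {ctx.machine} ({ctx.compiler})"


def dispatch(job_name: str, ctx: JobContext) -> str:
    """Look up the handler for job_name and hand it the shared context."""
    try:
        handler = JOBS[job_name]
    except KeyError:
        raise ValueError(f"unknown job: {job_name!r}") from None
    return handler(ctx)
```

Keeping the context immutable means every job sees the same snapshot of the PR and machine state, which makes failures easier to report back to the PR.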

GeorgeGayno-NOAA added a commit to GeorgeGayno-NOAA/UFS_UTILS that referenced this issue Oct 13, 2021
GeorgeGayno-NOAA self-assigned this Oct 14, 2021
GeorgeGayno-NOAA added a commit to GeorgeGayno-NOAA/UFS_UTILS that referenced this issue Oct 14, 2021
GeorgeGayno-NOAA added a commit to GeorgeGayno-NOAA/UFS_UTILS that referenced this issue Oct 14, 2021
GeorgeGayno-NOAA added a commit to GeorgeGayno-NOAA/UFS_UTILS that referenced this issue Oct 14, 2021
GeorgeGayno-NOAA added a commit to GeorgeGayno-NOAA/UFS_UTILS that referenced this issue Oct 15, 2021
Update the update_baseline.sh script to process the
fix_sfc baseline subdirectory used by the grid_gen.

Fixes ufs-community#574.
GeorgeGayno-NOAA added a commit to GeorgeGayno-NOAA/UFS_UTILS that referenced this issue Oct 18, 2021
GeorgeGayno-NOAA added a commit to GeorgeGayno-NOAA/UFS_UTILS that referenced this issue Oct 22, 2021
GeorgeGayno-NOAA added a commit to GeorgeGayno-NOAA/UFS_UTILS that referenced this issue Nov 19, 2021
GeorgeGayno-NOAA added a commit to GeorgeGayno-NOAA/UFS_UTILS that referenced this issue Dec 17, 2021
GeorgeGayno-NOAA added a commit to GeorgeGayno-NOAA/UFS_UTILS that referenced this issue Jan 11, 2022
GeorgeGayno-NOAA added a commit to GeorgeGayno-NOAA/UFS_UTILS that referenced this issue Jan 12, 2022
GeorgeGayno-NOAA added a commit to GeorgeGayno-NOAA/UFS_UTILS that referenced this issue Jan 12, 2022
GeorgeGayno-NOAA added a commit to GeorgeGayno-NOAA/UFS_UTILS that referenced this issue Jan 12, 2022
GeorgeGayno-NOAA added a commit to GeorgeGayno-NOAA/UFS_UTILS that referenced this issue Jan 12, 2022
GeorgeGayno-NOAA added a commit to GeorgeGayno-NOAA/UFS_UTILS that referenced this issue Jan 12, 2022
GeorgeGayno-NOAA added a commit to GeorgeGayno-NOAA/UFS_UTILS that referenced this issue Jan 13, 2022
GeorgeGayno-NOAA added a commit to GeorgeGayno-NOAA/UFS_UTILS that referenced this issue Jan 13, 2022
GeorgeGayno-NOAA added a commit to GeorgeGayno-NOAA/UFS_UTILS that referenced this issue Jan 13, 2022
GeorgeGayno-NOAA added a commit to GeorgeGayno-NOAA/UFS_UTILS that referenced this issue Jan 13, 2022
GeorgeGayno-NOAA added a commit to GeorgeGayno-NOAA/UFS_UTILS that referenced this issue Jan 14, 2022
GeorgeGayno-NOAA added a commit to GeorgeGayno-NOAA/UFS_UTILS that referenced this issue Jan 14, 2022
GeorgeGayno-NOAA added a commit to GeorgeGayno-NOAA/UFS_UTILS that referenced this issue Jan 14, 2022
GeorgeGayno-NOAA added a commit to GeorgeGayno-NOAA/UFS_UTILS that referenced this issue Jan 14, 2022
GeorgeGayno-NOAA added a commit that referenced this issue Jan 20, 2022
Add logic to the consistency test scripts to automatically update the baseline data
when code updates change results.

Fixes #574