Make `pathogen-repo-ci` fail when config is missing or no builds are attempted #95

genehack · 2024-06-11T16:50:24Z

Description of proposed changes

Make the CI fail if it doesn't try to run at least one build step, as that's an indication that somebody is trying to use "modern" pathogen-repo-ci on an un-modernized pathogen repo (or there's some other grievous misconfig happening).

Related issue(s)

#92

Checklist

- Clean up CI bits in this repo that break
- Checks pass
- Verify a repo without nextstrain-pathogen.yaml fails as expected (proof)
- Verify a repo that has nextstrain-pathogen.yaml but lacks the required files for all build steps fails as expected (proof)
- Verify a repo that has one step that attempts to run succeeds as expected (proof)
- Verify a repo that runs two steps still succeeds as expected (proof)
- ~~Verify a repo that runs all three steps still succeeds as expected~~: skipped because I don't think there's a repo handy with a nextclade build that's also using pathogen-repo-ci(?)

genehack · 2024-06-11T17:32:41Z

The one remaining thing here is figuring out what to do with CI / test-pathogen-repo-ci-no-example-data, which is pointed at zika-tutorial. Specifically, this job:

  test-pathogen-repo-ci-no-example-data:
    uses: ./.github/workflows/pathogen-repo-ci.yaml
    with:
      repo: nextstrain/zika-tutorial
      artifact-name: outputs-test-pathogen-repo-ci-no-example-data

Options:

- pin this job to the older workflow: feels bad, not sure what that proves
- remove this job: would need somebody to confirm that's okay; I don't understand what this job is trying to verify from a CI perspective
- point the job at a different repo: again, requires an explanation of the intent of this job in the CI workflow for the repo

@joverlee521 @tsibley could either of you shed insight on the "what is this job for" question?

tsibley · 2024-06-11T22:00:47Z

> @joverlee521 @tsibley could either of you shed insight on the "what is this job for" question?

Ah, I tried to answer that previously, but sorry there's still a disconnect here. In retrospect, I think I assumed too much context. I think the commit which introduces the job, 9800e33, is a clear and concise explanation with the right context, but if it's still not I can try again.

genehack · 2024-06-11T23:02:18Z

> > @joverlee521 @tsibley could either of you shed insight on the "what is this job for" question?
>
> Ah, I tried to answer that previously, but sorry there's still a disconnect here. In retrospect, I think I assumed too much context. I think the commit which introduces the job, 9800e33, is a clear and concise explanation with the right context, but if it's still not I can try again.

So, based on the context in that commit message, I think the right thing to do here is just remove this job from the CI? The thing that it's validating (don't need to set up if example data doesn't exist) is something that the "modern" pathogen-repo-ci doesn't worry about. The assumption is if there's a Snakefile and a build-config/ci/config.yaml, it's sufficient to call the build with that config and it will DTRT.

ETA: it's removed

genehack · 2024-06-11T23:37:59Z

This is updated and seems to be working; will try to merge this EOD Wednesday (PT), if not earlier.

tsibley

Looks ok by inspection. A few nits.

My biggest suggestion is that we don't actually run the "Run {ingest, phylogenetic, nextclade}" steps if they're not going to do anything. This means that instead of checking for whether to do anything or not inside each step individually, we'd change to checking the conditions for the steps outside first (in one fell swoop). Then we'd add if: … conditionals to the steps. This will make the logs a lot clearer and mean the GitHub workflow metadata stays more in sync with reality. I'll try this out in a separate PR. I don't think it has to block this one.


genehack · 2024-06-12T18:49:13Z

> My biggest suggestion is that we don't actually run the "Run {ingest, phylogenetic, nextclade}" steps if they're not going to do anything.

The advantage I see to doing the file existence checks inside the action is that it can report out exactly which file(s) aren't present. Maybe you can do that with some sort of if: expression — my GitHub Actions fu remains weak — and that would be fine, but having it all sort of collected in the one spot makes it easier (at least for me) to understand.
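To make the "report exactly which file(s) aren't present" idea concrete, here is a minimal POSIX shell sketch. The function name `missing_build_files` is hypothetical and is not taken from the actual `run-nextstrain-ci-build` action; the file names follow the Snakefile and `build-config/ci/config.yaml` pair mentioned earlier in this thread.

```shell
# Hypothetical helper (not the action's real code): list required files
# that are absent under a build directory.
#   $1      build directory, e.g. "ingest"
#   $2...   required files, relative to that directory
# Prints one "missing: <path>" line per absent file.
# Returns 0 when every file exists, 1 when at least one is missing.
missing_build_files() {
  dir=$1
  shift
  rc=0
  for file in "$@"; do
    if [ ! -f "$dir/$file" ]; then
      echo "missing: $dir/$file"
      rc=1
    fi
  done
  return "$rc"
}
```

A build step could call `missing_build_files ingest Snakefile build-config/ci/config.yaml` and only attempt the build when it returns 0, so the log names every absent file rather than just noting that the step was skipped.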

genehack · 2024-06-12T18:52:27Z

minor nits have been picked.

… to do anything

Instead of checking for whether to do anything or not _inside_ each build step individually, move the check to the step's condition. This will make the logs a lot clearer and mean the GitHub workflow metadata stays more in sync with reality.

I've used hashFiles() to check for file existence (it returns the empty string, a falsey value, when there are no matching files), but we could swap that out for using the output of a prior setup step that runs some shell to determine what to run and what to skip. Doing so might make more sense if the conditional becomes more complicated or we want to do more detailed reporting on _why_ steps were skipped or not.

With the changes, the internal-to-this-workflow action, run-nextstrain-ci-build, is no longer that useful and so I've removed it in favor of inlining things. I think this is an improvement for readability.

Related-to: <#95 (review)>
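The "prior setup step that runs some shell" alternative might look roughly like the sketch below. It is an illustration only, not code from either PR. The function name `build_flags` and the directory layout (`ingest/`, `phylogenetic/`, `nextclade/`, each with a `Snakefile` and `build-config/ci/config.yaml`) are assumptions based on the step names and file names mentioned in this thread.

```shell
# Hypothetical setup-step helper: emit one key=value line per build,
# "true" when both the Snakefile and the CI config exist, else "false".
#   $1  path to the checked-out pathogen repo
build_flags() {
  repo=$1
  for build in ingest phylogenetic nextclade; do
    if [ -f "$repo/$build/Snakefile" ] &&
       [ -f "$repo/$build/build-config/ci/config.yaml" ]; then
      echo "$build=true"
    else
      echo "$build=false"
    fi
  done
}

# In a workflow step the lines would be appended to the step's output file:
#   build_flags . >> "$GITHUB_OUTPUT"
```

Later steps could then gate on those outputs with a condition such as `if: steps.<id>.outputs.ingest == 'true'`, where `<id>` is whatever id the setup step is given. Compared with `hashFiles()`, this leaves room to log why a build was skipped.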

tsibley · 2024-06-12T20:57:41Z

Take a gander at #96 and see what you think?

> The advantage I see to doing the file existence checks inside the action is that it can report out exactly which file(s) aren't present.

Nod. Reporting that out is still possible, but I didn't bother in the PR above. It seems like a minor thing to me: if a step is skipped unexpectedly, it's easy enough to grab the commit id checked out from the logs and look at what files exist in it? But we could still maintain that reporting if desired.

joverlee521

Thanks for the work here @genehack! I left a minor suggestion, but the changes here LGTM!

joverlee521 · 2024-06-12T21:21:33Z

.github/workflows/pathogen-repo-ci.yaml

+          echo "INGEST ATTEMPTED=${{ steps.ingest.outputs.run-attempted }}"
+          echo "PHYLOGENETIC ATTEMPTED=${{ steps.phylogenetic.outputs.run-attempted }}"
+          echo "NEXTCLADE ATTEMPTED=${{ steps.nextclade.outputs.run-attempted }}"


Minor suggestion to output these to the $GITHUB_STEP_SUMMARY so that it's easy to scan the workflow attempts from the GH Action summary instead of having to dig into the job logs.

pushed an update — was that what you were thinking of?

Yup, that's what I was thinking, except for a small typo: `$"GITHUB_STEP_SUMMARY"` -> `"$GITHUB_STEP_SUMMARY"`

>_< thanks -- pushed a fix, gonna merge this once CI finishes
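For readers unfamiliar with the job summary, a minimal sketch of the suggestion follows. The function name `write_attempt_summary` is hypothetical; the three labels mirror the `echo` lines in the diff above. The typo matters because, in bash, `$"GITHUB_STEP_SUMMARY"` is locale-translation quoting of a literal string, not a variable expansion, so a redirect to it writes to a file literally named `GITHUB_STEP_SUMMARY` in the working directory instead of the summary file.

```shell
# Hypothetical helper: append one Markdown bullet per build to a summary file.
#   $1  path to the summary file (in a workflow: "$GITHUB_STEP_SUMMARY")
#   $2  ingest flag, $3 phylogenetic flag, $4 nextclade flag
write_attempt_summary() {
  {
    echo "- INGEST ATTEMPTED=$2"
    echo "- PHYLOGENETIC ATTEMPTED=$3"
    echo "- NEXTCLADE ATTEMPTED=$4"
  } >> "$1"   # quote the expansion: "$VAR", never $"VAR"
}
```

In a workflow step this would be called with `"$GITHUB_STEP_SUMMARY"` as the first argument and the three `run-attempted` outputs as the rest.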

* Add ids to build steps in `pathogen-repo-ci`
* Add `run-attempted` output to `run-nextstrain-ci-build`
* Set output to true or false depending on whether build was attempted
* Add step to `pathogen-repo-ci` to read outputs from `run-nextstrain-ci-build` and validate at least one build was tried
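The final validation step described in this commit message could be sketched as below. This is an assumption about its shape, not the merged code: the function name `any_build_attempted` and the error wording are made up for illustration, and the arguments stand in for the three `run-attempted` outputs.

```shell
# Hypothetical validation helper: succeed if at least one build was attempted.
#   $@  the run-attempted values of the build steps ("true", "false", or "")
# Returns 0 as soon as any value is exactly "true"; otherwise prints an
# error to stderr and returns 1, which fails the CI job.
any_build_attempted() {
  for flag in "$@"; do
    if [ "$flag" = "true" ]; then
      return 0
    fi
  done
  echo "No build steps were attempted; check the repo's CI configuration." >&2
  return 1
}
```

An empty value (for example from a step that never set its output) is treated the same as `false`, so a repo where nothing ran fails loudly instead of passing silently.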

This is testing an outdated scenario — the `pathogen-repo-ci` workflow no longer cares whether you have example data or not; only that you have a Snakefile and a CI-specific build config in the right spot.

genehack · 2024-06-13T17:44:07Z

> Take a gander at #96 and see what you think?

I went ahead and merged this; I also left you a question on #96 — maybe we can continue discussion over there? I'm largely fine with your changes, just wanting to understand the template bits.


genehack force-pushed the bring-tha-noise-92 branch 6 times, most recently from 82233df to 6b2a3c0 on June 11, 2024 17:19

genehack marked this pull request as ready for review June 11, 2024 17:32

genehack requested a review from a team June 11, 2024 17:33

tsibley approved these changes Jun 12, 2024


Fail fast when nextstrain-pathogen.yaml file is missing [#92]

175d648

genehack force-pushed the bring-tha-noise-92 branch from e028f39 to 422bff1 on June 12, 2024 18:51

tsibley mentioned this pull request Jun 12, 2024

pathogen-repo-ci: Don't run workflow build steps if they're not going to do anything #96

Merged

5 tasks

tsibley approved these changes Jun 12, 2024


joverlee521 approved these changes Jun 12, 2024


genehack force-pushed the bring-tha-noise-92 branch 2 times, most recently from 1bad99e to df99e08 on June 13, 2024 00:36

genehack added 2 commits June 13, 2024 10:13

Remove test-pathogen-repo-ci-no-example-data job [#92]

fea28cb


genehack force-pushed the bring-tha-noise-92 branch from df99e08 to fea28cb on June 13, 2024 17:13

genehack merged commit 116404e into master Jun 13, 2024
36 checks passed

genehack deleted the bring-tha-noise-92 branch June 13, 2024 17:43

joverlee521 mentioned this pull request Jun 18, 2024

Should the pathogen-repo-ci be louder if no workflows are run? #92

Closed

2 tasks
