Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make pathogen-repo-ci fail when config is missing or no builds are attempted #95

Merged
merged 3 commits into from
Jun 13, 2024

Conversation

genehack
Copy link
Contributor

@genehack genehack commented Jun 11, 2024

Description of proposed changes

Make the CI fail if it doesn't try to run at least one build step, as that's an indication that somebody is trying to use "modern" pathogen-repo-ci on an un-modernized pathogen repo (or there's some other grievous misconfig happening).

Related issue(s)

#92

Checklist

  • Clean up CI bits in this repo that break
  • Checks pass
  • Verify a repo without pathogen-nextstrain.yaml fails as expected proof
  • Verify a repo that has pathogen-nexstrain.yaml but lacking required files for all build steps fails as expected proof
  • Verify a repo that has one step that attempts to run succeeds as expected proof
  • Verify a repo that runs two steps still succeeds as expected proof
  • Verify a repo that runs all three steps still succeeds as expected — skipping because I don't think there's a repo handy with a nextclade build that's also using pathogen-repo-ci(?)

@genehack genehack force-pushed the bring-tha-noise-92 branch 6 times, most recently from 82233df to 6b2a3c0 Compare June 11, 2024 17:19
@genehack
Copy link
Contributor Author

The one remaining thing here is figuring out what to do with CI / test-pathogen-repo-ci-no-example-data, which is pointed at zika-tutorial. Specifically, this job:

  test-pathogen-repo-ci-no-example-data:
    uses: ./.github/workflows/pathogen-repo-ci.yaml
    with:
      repo: nextstrain/zika-tutorial
      artifact-name: outputs-test-pathogen-repo-ci-no-example-data

options:

  1. pin this job to the older workflow — feels bad, not sure what that proves
  2. remove this job — would need somebody to confirm that's okay; i don't understand what this job is trying to verify from a CI perspective
  3. point the job at a different repo — again, requires an explanation of the intent of this job in the CI workflow for the repo

@joverlee521 @tsibley could either of you shed insight on the "what is this job for" question?

@genehack genehack marked this pull request as ready for review June 11, 2024 17:32
@genehack genehack requested a review from a team June 11, 2024 17:33
@tsibley
Copy link
Member

tsibley commented Jun 11, 2024

@joverlee521 @tsibley could either of you shed insight on the "what is this job for" question?

Ah, I tried to answer that previously, but sorry there's still a disconnect here. In retrospect, I think I assumed too much context. I think the commit which introduces the job, 9800e33, is a clear and concise explanation with the right context, but if it's still not I can try again.

@genehack
Copy link
Contributor Author

genehack commented Jun 11, 2024

@joverlee521 @tsibley could either of you shed insight on the "what is this job for" question?

Ah, I tried to answer that previously, but sorry there's still a disconnect here. In retrospect, I think I assumed too much context. I think the commit which introduces the job, 9800e33, is a clear and concise explanation with the right context, but if it's still not I can try again.

So, based on the context in that commit message, I think the right thing to do here is just remove this job from the CI? The thing that it's validating (don't need to set up if example data doesn't exist) is something that the "modern" pathogen-repo-ci doesn't worry about. The assumption is if there's a Snakefile and a build-config/ci/config.yaml, it's sufficient to call the build with that config and it will DTRT.

ETA: it's removed

@genehack
Copy link
Contributor Author

This is updated and seems to be working; will try to merge this EOD Wednesday (PT), if not earlier.

Copy link
Member

@tsibley tsibley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks ok by inspection. A few nits.

My biggest suggestion is that we don't actually run the "Run {ingest, phylogenetic, nextclade}" steps if they're not going to do anything. This means that instead of checking for whether to do anything or not inside each step individually, we'd change to checking the conditions for the steps outside first (in one fell swoop). Then we'd add if: … conditionals to the steps. This will make the logs a lot clearer and mean the GitHub workflow metadata stays more in sync with reality. I'll try this out in a separate PR. I don't think it has to block this one.

.github/workflows/pathogen-repo-ci.yaml Outdated Show resolved Hide resolved
.github/workflows/pathogen-repo-ci.yaml Outdated Show resolved Hide resolved
.github/workflows/pathogen-repo-ci.yaml Outdated Show resolved Hide resolved
.github/workflows/pathogen-repo-ci.yaml Outdated Show resolved Hide resolved
@genehack
Copy link
Contributor Author

My biggest suggestion is that we don't actually run the "Run {ingest, phylogenetic, nextclade}" steps if they're not going to do anything.

The advantage I see to doing the file existence checks inside the action is that it can report out exactly which file(s) aren't present. Maybe you can do that with some sort of if: expression — my GitHub Actions fu remains weak — and that would be fine, but having it all sort of collected in the one spot makes it easier (at least for me) to understand.

@genehack
Copy link
Contributor Author

minor nits have been picked.

tsibley added a commit that referenced this pull request Jun 12, 2024
… to do anything

Instead of checking for whether to do anything or not _inside_ each
build step individually, move the check to the step's condition.  This
will make the logs a lot clearer and mean the GitHub workflow metadata
stays more in sync with reality.

I've used hashFiles() to check for file existence—it returns the empty
string, a falsey value, when there are no matching files—but we could
swap that out for using the output of a prior setup step that runs some
shell to determine what to run and what to skip.  Doing so might make
more sense if the conditional becomes more complicated or we want to do
more detailed reporting on _why_ steps were skipped or not.

With the changes, the internal-to-this-workflow action,
run-nextstrain-ci-build, is no longer that useful and so I've removed it
in favor of inlining things.  I think this is an improvement for
readability.

Related-to: <#95 (review)>
tsibley added a commit that referenced this pull request Jun 12, 2024
… to do anything

Instead of checking for whether to do anything or not _inside_ each
build step individually, move the check to the step's condition.  This
will make the logs a lot clearer and mean the GitHub workflow metadata
stays more in sync with reality.

I've used hashFiles() to check for file existence—it returns the empty
string, a falsey value, when there are no matching files—but we could
swap that out for using the output of a prior setup step that runs some
shell to determine what to run and what to skip.  Doing so might make
more sense if the conditional becomes more complicated or we want to do
more detailed reporting on _why_ steps were skipped or not.

With the changes, the internal-to-this-workflow action,
run-nextstrain-ci-build, is no longer that useful and so I've removed it
in favor of inlining things.  I think this is an improvement for
readability.

Related-to: <#95 (review)>
tsibley added a commit that referenced this pull request Jun 12, 2024
… to do anything

Instead of checking for whether to do anything or not _inside_ each
build step individually, move the check to the step's condition.  This
will make the logs a lot clearer and mean the GitHub workflow metadata
stays more in sync with reality.

I've used hashFiles() to check for file existence—it returns the empty
string, a falsey value, when there are no matching files—but we could
swap that out for using the output of a prior setup step that runs some
shell to determine what to run and what to skip.  Doing so might make
more sense if the conditional becomes more complicated or we want to do
more detailed reporting on _why_ steps were skipped or not.

With the changes, the internal-to-this-workflow action,
run-nextstrain-ci-build, is no longer that useful and so I've removed it
in favor of inlining things.  I think this is an improvement for
readability.

Related-to: <#95 (review)>
tsibley added a commit that referenced this pull request Jun 12, 2024
… to do anything

Instead of checking for whether to do anything or not _inside_ each
build step individually, move the check to the step's condition.  This
will make the logs a lot clearer and mean the GitHub workflow metadata
stays more in sync with reality.

I've used hashFiles() to check for file existence—it returns the empty
string, a falsey value, when there are no matching files—but we could
swap that out for using the output of a prior setup step that runs some
shell to determine what to run and what to skip.  Doing so might make
more sense if the conditional becomes more complicated or we want to do
more detailed reporting on _why_ steps were skipped or not.

With the changes, the internal-to-this-workflow action,
run-nextstrain-ci-build, is no longer that useful and so I've removed it
in favor of inlining things.  I think this is an improvement for
readability.

Related-to: <#95 (review)>
@tsibley
Copy link
Member

tsibley commented Jun 12, 2024

Take a gander at #96 and see what you think?

The advantage I see to doing the file existence checks inside the action is that it can report out exactly which file(s) aren't present.

Nod. Reporting that out is still possible, but I didn't bother in the PR above. It seems like a minor thing to me: if a step is skipped unexpectedly, it's easy enough to grab the commit id checked out from the logs and look at what files exist in it? But we could still maintain that reporting if desired.

Copy link
Contributor

@joverlee521 joverlee521 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the work here @genehack! I left a minor suggestion, but the changes here LGTM!

Comment on lines 283 to 285
echo "INGEST ATTEMPTED=${{ steps.ingest.outputs.run-attempted }}"
echo "PHYLOGENETIC ATTEMPTED=${{ steps.phylogenetic.outputs.run-attempted }}"
echo "NEXTCLADE ATTEMPTED=${{ steps.nextclade.outputs.run-attempted }}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor suggestion to output these to the $GITHUB_STEP_SUMMARY so that it's easy to scan the workflow attempts from the GH Action summary instead of having to dig into the job logs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pushed an update — was that what you were thinking of?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, that's what I was thinking, except small typo $"GITHUB_STEP_SUMMARY" -> "$GITHUB_STEP_SUMMARY"

Copy link
Contributor Author

@genehack genehack Jun 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

>_< thanks -- pushed a fix, gonna merge this once CI finishes

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Screenshot 2024-06-13 at 10 18 08 AM

🎉

@genehack genehack force-pushed the bring-tha-noise-92 branch 2 times, most recently from 1bad99e to df99e08 Compare June 13, 2024 00:36
* Add ids to build steps in `pathogen-repo-ci`
* Add `run-attempted` output to `run-nextstrain-ci-build`
* Set output to true or false depending on whether build was attempted
* Add step to `pathogen-repo-ci` to read outputs from
  `run-nextstrain-ci-build` and validate at least one build was tried
This is testing an outdated scenario — the `pathogen-repo-ci` workflow
no longer cares whether you have example data or not; only that you
have a Snakefile and a CI-specific build config in the right spot.
@genehack genehack merged commit 116404e into master Jun 13, 2024
36 checks passed
@genehack genehack deleted the bring-tha-noise-92 branch June 13, 2024 17:43
@genehack
Copy link
Contributor Author

Take a gander at #96 and see what you think?

I went ahead and merged this; I also left you a question on #96 — maybe we can continue discussion over there? I'm largely fine with your changes, just wanting to understand the template bits.

tsibley added a commit that referenced this pull request Jun 14, 2024
… to do anything

Instead of checking for whether to do anything or not _inside_ each
build step individually, move the check to the step's condition.  This
will make the logs a lot clearer and mean the GitHub workflow metadata
stays more in sync with reality.

I've used hashFiles() to check for file existence—it returns the empty
string, a falsey value, when there are no matching files—but we could
swap that out for using the output of a prior setup step that runs some
shell to determine what to run and what to skip.  Doing so might make
more sense if the conditional becomes more complicated or we want to do
more detailed reporting on _why_ steps were skipped or not.

With the changes, the internal-to-this-workflow action,
run-nextstrain-ci-build, is no longer that useful and so I've removed it
in favor of inlining things.  I think this is an improvement for
readability.

Related-to: <#95 (review)>
tsibley added a commit that referenced this pull request Jun 14, 2024
… to do anything

Instead of checking for whether to do anything or not _inside_ each
build step individually, move the check to the step's condition.  This
will make the logs a lot clearer and mean the GitHub workflow metadata
stays more in sync with reality.

I've used hashFiles() to check for file existence—it returns the empty
string, a falsey value, when there are no matching files—but we could
swap that out for using the output of a prior setup step that runs some
shell to determine what to run and what to skip.  Doing so might make
more sense if the conditional becomes more complicated or we want to do
more detailed reporting on _why_ steps were skipped or not.

With the changes, the internal-to-this-workflow action,
run-nextstrain-ci-build, is no longer that useful and so I've removed it
in favor of inlining things.  I think this is an improvement for
readability.

Related-to: <#95 (review)>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants