
Migrate CI from drone to GHA #383

Merged: 4 commits on Jul 2, 2024

Conversation

@ericpromislow (Contributor) commented May 23, 2024

Related to rancher/rancher#45231

Actual PR at #394 (in order to access secrets for the publish and release actions)

@tomleb (Contributor) left a comment

I've only been able to review the CI workflow and the change in the scripts.

Overall I think the approach of using k3d to start a cluster and Helm to start Rancher is great.

A couple of things:

  • There are a lot of variables spread across multiple scripts. I think it would be worth centralizing them in one place to make them easy to change. For example, the k3d version is in the k3d scripts, some versions are in the ci.yaml file, etc.
  • I'd get rid of the default values for most environment variables and just fail if they are unset. This way we avoid having things work with default (potentially stale) values if we remove the override by mistake.
  • The .hide-drone.yml file should be deleted instead of hidden.
  • Some scripts duplicate work (e.g. scripts/package and scripts/package-ci). Fwiw, Dapper makes this difficult (it doesn't solve it in any way), and I'm slowly working on an alternative which hopefully would prevent this.
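For what it's worth, the centralize-and-fail-fast idea could look something like this; a minimal sketch with illustrative variable names and values, not taken from the repo:

```shell
#!/usr/bin/env bash
# Hypothetical central versions file: every script sources this one place,
# and nothing gets a silent default.
set -eu

require() {
  # ${!1} is bash indirect expansion: the value of the variable named by $1.
  [ -n "${!1:-}" ] || { echo "error: $1 must be set" >&2; return 1; }
}

K3D_VERSION="v5.6.0"   # in CI this would come from ci.yaml, not be hardcoded
require K3D_VERSION && echo "k3d pinned at ${K3D_VERSION}"

# HELM_VERSION deliberately has no default, so a missing override is an error:
unset HELM_VERSION 2>/dev/null || true   # ensure unset for the demo
if ! require HELM_VERSION 2>/dev/null; then
  echo "refusing to run with unset HELM_VERSION"
fi
```

A caller that forgets to export one of the versions then fails immediately instead of silently building with a stale default.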

(Resolved review threads on .github/workflows/ci.yaml, scripts/version, scripts/package-for-ci, scripts/ci, scripts/integration-test)
@@ -11,4 +11,4 @@ annotations:
     catalog.cattle.io/os: linux
     catalog.cattle.io/permits-os: linux,windows
     catalog.cattle.io/rancher-version: ">= 2.9.0-0 < 2.10.0-0"
-    catalog.cattle.io/kube-version: ">= 1.23.0-0 < 1.29.0-0"
+    catalog.cattle.io/kube-version: ">= 1.23.0-0 < 1.30.0-0"
Contributor:

Imo this should be done as part of #366. The current version of webhook doesn't support 1.29 so I wouldn't expect the chart to say it does.

Contributor Author:

Same as above: this should be done as part of PR 366

Contributor:

@ericpromislow This is still part of the PR fyi, I still think it should be removed

(Resolved review threads on scripts/integration-test)
@ericpromislow (Contributor Author):

The key to getting an image into the k3d cluster's registry is to not add a repository prefix to it. Just call it rancher/webhook and then when a pod controller tries to load it, it will look in the local registry first before looking at docker.io or any other configured repository.

@ericpromislow ericpromislow requested a review from tomleb May 28, 2024 16:57
@ericpromislow (Contributor Author):

Also a note that I added set -u to most of the scripts so that they should fail when trying to access an unset variable.
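As a small illustration of what `set -u` changes (the variable name here is illustrative), referencing an unset variable becomes a hard error instead of silently expanding to the empty string:

```shell
#!/usr/bin/env bash
# Demonstrate set -u: an unset variable reference fails instead of expanding to "".

probe() {
  # run in a subshell so the failure is observable rather than fatal here
  ( set -u; echo "building image rancher/webhook:${IMAGE_TAG}" )
}

IMAGE_TAG="dev"
probe                        # succeeds while IMAGE_TAG is set

unset IMAGE_TAG
if probe 2>/dev/null; then
  status="unset-var-allowed"
else
  status="unset-var-rejected"
fi
echo "$status"
```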

@ericpromislow ericpromislow requested a review from a team May 28, 2024 17:41
@ericpromislow ericpromislow force-pushed the gha-migration branch 7 times, most recently from 8a2f6cd to f71cdc3 Compare June 3, 2024 21:50
@MbolotSuse (Contributor) left a comment

Some minor issues, but overall a lot of good work so far.

(Resolved review threads on package/Dockerfile, scripts/build, .github/workflows/ci.yaml, .github/workflows/publish.yaml, .github/workflows/release.yaml)
.goreleaser.yaml Outdated
release:
prerelease: auto
extra_files:
- glob : ./build/artifacts/*.tgz
Contributor:

Our current artifacts also contain sha256 files for the binaries and, for example, a source code zip file. Do we need to add extra patterns here to make sure this gets everything?

Contributor Author:

I'm not sure. From a debug run:

Run find dist/artifacts
dist/artifacts
dist/artifacts/webhook-linux-amd64
dist/artifacts/rancher-webhook-0.0.0-2bfcb08.tgz

I assume goreleaser knows about webhook-linux-amd64 and needs to be told to save the .tgz file. And the checksum entry is based on the checksum file names in the current release.

I don't see a straightforward way to test this until we push the next release :(
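If the sha256 files do land in the same artifacts directory, one possible (untested) way to attach them would be an extra glob; the `*.sha256` pattern below is an assumption about how the checksum files would be named:

```yaml
release:
  prerelease: auto
  extra_files:
    - glob: ./build/artifacts/*.tgz
    # assumed pattern; adjust to however the checksum files are actually named
    - glob: ./build/artifacts/*.sha256
```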

Contributor:

The sha256 files seem to be missing from there. So I think even now we can say that something is missing. That being said, I agree that it's hard to test these things outside of a live run. Running an RC once this has merged should give you a way to test. Can you look at the sha256 issue and then we can run the live tests later after things have merged?

Contributor Author:

I'm going by the goreleaser documentation:

checksum:
  # You can change the name of the checksums file.
  #
  # Default: '{{ .ProjectName }}_{{ .Version }}_checksums.txt'
  #   or, when split is set: '{{ .ArtifactName }}.{{ .Algorithm }}'
  # Templates: allowed
  name_template: "{{ .ProjectName }}_checksums.txt"

So I think that will give us what we want.

(Resolved review threads on scripts/integration-test, scripts/package-for-ci, scripts/version, .github/workflows/publish.yaml)

jobs:
  ci:
    uses: rancher/webhook/.github/workflows/ci.yaml@release/v0.5
Contributor:

Is there any way to avoid hardcoding the branch? I suspect GHA has a variable for the "current branch" that we can use here.

Contributor Author:

github.base_ref returns the branch in ci.yaml; I'll need to verify this works in the other workflows (see Michael's comment on further testing of the other workflows still needed).

Contributor:

Here's what I've found about re-using a workflow that's part of the current repo. Have a look at https://docs.github.com/en/actions/using-workflows/reusing-workflows#calling-a-reusable-workflow:

If you use the second syntax option (without {owner}/{repo} and @{ref}) the called workflow is from the same commit as the caller workflow.

This would look like:

jobs:
  ci:
    uses: ./.github/workflows/ci.yml

This way there's no need to fiddle with any branch names, commits, etc.


jobs:
  ci:
    uses: rancher/webhook/.github/workflows/ci.yaml@release/v0.5
Contributor:

Is there any way to avoid hardcoding the branch? I suspect GHA has a variable for the "current branch" that we can use here.

Contributor Author:

see above

.goreleaser.yaml Outdated
- -extldflags
- -static
- -s
- -X main.Version={{.Version}} -X main.GitCommit={{.Commit}}
Contributor:

I guess my main concern is that we now have 2 different binaries being built:

  • The one built with goreleaser on releases
  • The one built with scripts/build

The binaries built in the latter are copied in the docker image and pushed to the container registry.

The binaries built in the former are pushed to the release/pre-release. I'm not sure what they're used for tbh, since afaik what's most important is the Helm chart (.tgz) for the charts repo and the docker image.

We're not trying to redo all our CI, so I'm not suggesting we drop the binaries from the release. However, I want to understand the reason to have two different ways to build the binaries, used in two different artifacts. Maintenance-wise, these two build mechanisms will need to be kept in sync. I haven't used goreleaser before so I'd like to know how it benefits us for our use case.

Contributor Author:

There's a directive around somewhere that we should avoid storing binaries in artifacts directories for purposes of SLSA level-3 compliance, so I didn't see a way that we could test the same binaries we push to github without loading everything into one job.

Contributor:

Is there anything that would prevent replacing the goreleaser step with a make build step, or something like that?

Contributor Author:

I thought we want to use goreleaser because it automates more of the release process.

(Resolved review threads on .github/workflows/release.yaml, .github/workflows/publish.yaml)
@ericpromislow ericpromislow requested a review from tomleb June 5, 2024 21:06
@MbolotSuse (Contributor) left a comment

Some small changes. Also, it would be good to switch the actions to use a SHA instead of the tag, to give better control over upgrades.
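Pinning an action to a SHA looks like the sketch below; the SHA shown is illustrative filler, not a vetted value, and the tag comment simply records which release it is meant to correspond to:

```yaml
steps:
  # Pin third-party actions to a full commit SHA; keep the tag as a comment
  # so Dependabot/reviewers can see what it maps to.
  - uses: actions/checkout@0123456789abcdef0123456789abcdef01234567 # v4.1.1 (illustrative SHA)
```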

on:
  push:
    tags:
      - release/v*
Contributor:

This seems like it has a couple of issues to me:

  • If the pattern is trying to match a tag, I would expect this to be v* - the tags are the exact version like v0.5.1, not release/v0.5.1
  • Per this comment, I would expect this to need to be quoted

@ericpromislow (Contributor Author) commented Jun 5, 2024:

I don't think it needs to be quoted, because the string value doesn't start with a * (in which case it would look like a YAML alias):

$ pbpaste | yq .
{
  "on": {
    "push": {
      "tags": [
        "release/v*"
      ]
    }
  }
}

on:
  push:
    tags:
      - release/v*
Contributor:

Similar comment here about the tag name.

@ericpromislow (Contributor Author):

Actions now use SHAs

Comment on lines 31 to 32
# TODO: See wrangler for an example of how to run tests on arm64, not ready here
# - arm64
Contributor:

With wrangler supporting both, we should be able to make it work here as well.

@ericpromislow (Contributor Author) commented Jun 11, 2024:

Continued at PR #394 in order to access secrets for the publish and release actions.

@ericpromislow (Contributor Author):

We're going to move ahead with this PR and skip the failing test on ARM64 CI runs ... this is logged at
rancher/rancher#45837

tomleb previously approved these changes Jun 14, 2024
.goreleaser.yaml Outdated
@@ -0,0 +1,41 @@
# Make sure to check the documentation at https://goreleaser.com
Contributor:

Is this still needed? I don't see anything using it.

Contributor Author:

It came from the boilerplate. I'll modify it because I like to have pointers to docs for third-party components, even when the link is obvious

Contributor:

I meant, do we still need goreleaser? I thought it wasn't being used anymore (the tool).

(Resolved review threads on .github/workflows/ci.yaml, tests/integration/rkeMachineConfig_test.go)
- Misc. changes to CI caught during further testing:
1. There should be two shell variables, HELM_VERSION for pulling
   down the right version of helm, and HELM_CHART_VERSION, for identifying
   generated helm charts.

2. Similarly, `TAG` and `IMAGE_TAG` are two different things.

3. Stop using make/dapper: env vars get lost

4. Skip the RKE Machine config unit test on CI/arm64
   - Written up in issue 45837
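The variable pairs from points 1 and 2 can be sketched as follows; all concrete values and the download URL pattern are assumptions for illustration, not taken from the repo:

```shell
#!/usr/bin/env bash
# Illustrative sketch of the two variable pairs the commit message separates.
set -eu

HELM_VERSION="v3.14.4"        # which helm *binary* to download for CI
HELM_CHART_VERSION="0.5.1"    # version stamped into the *generated chart*

TAG="v0.5.1"                  # git tag driving the release
IMAGE_TAG="${TAG}"            # tag on the pushed container image (may differ)

helm_url="https://get.helm.sh/helm-${HELM_VERSION}-linux-amd64.tar.gz"
chart_file="rancher-webhook-${HELM_CHART_VERSION}.tgz"
image_ref="rancher/webhook:${IMAGE_TAG}"

printf '%s\n%s\n%s\n' "$helm_url" "$chart_file" "$image_ref"
```

Conflating either pair (as the earlier scripts did) makes the helm download succeed or fail depending on which value happens to be in scope.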
tomleb previously approved these changes Jul 2, 2024
@MbolotSuse (Contributor):

LGTM in the interest of functionality; some things we will want to follow up on.

@ericpromislow ericpromislow merged commit 0a8155d into rancher:release/v0.5 Jul 2, 2024
2 checks passed