Prevent creating Rancher PR if RC not yet in rancher/charts #851

Merged
merged 1 commit into from
Mar 27, 2025

@tomleb tomleb commented Mar 27, 2025

Issue:

If a PR bumps webhook in rancher/rancher before that version is available in rancher/charts, the rancher/rancher CI will almost always fail (it passes only if the charts sync happens to finish in time). Here's why.

The CI in rancher/rancher runs the build-server job, which builds the Docker images and publishes them as artifacts.

Then, the test job (for integration tests) downloads those same artifacts and loads them into Docker. The Docker image for rancher contains a local copy of rancher/charts as of the time the image was built. AFAIK, this is used as a cache to speed things up before the charts controllers sync with GitHub. When integration-test fails, we re-run only that test job, so it keeps using the previously built images.

We can verify that the previously built images contain an older version of the dev-v2.11 charts branch like this:

$ export JOB_ID=14066004902
$ gh run download -R rancher/rancher $JOB_ID -n rancher-linux-amd64
$ docker load < rancher-linux-amd64.tar

$ export IMAGE=rancher/rancher:v2.11-52479284fecbb21d6b8ed85f0dfcb2f2df1d62c2-head-amd64 #  This image tag will change
$ docker run -it --entrypoint cat $IMAGE /var/lib/rancher-data/local-catalogs/v2/rancher-charts/4b40cac650031b74776e87c1a726b0484d0877c3ec137da0872547ff9b73a721/index.yaml

We can see that this index.yaml file indeed contains -rc.10 but not -rc.11. In the majority of cases, the charts sync didn't happen fast enough (possibly a GitHub issue, or something else), so the CI kept trying to install -rc.11 against the older index.yaml until it timed out (5 minutes) because webhook was never deployed.
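The manual inspection above can be narrowed to just the versions of one chart. This is a minimal sketch, assuming the file follows the usual Helm repository index layout (chart names at two-space indent under entries:, each release's version: at four-space indent) and that the chart is named rancher-webhook. list_chart_versions is a hypothetical helper, not something that exists in rancher/rancher:

```shell
#!/bin/sh
# Hypothetical helper: print every version listed for one chart in a
# Helm-style index.yaml. Assumes chart keys sit at two-space indent and
# each release's "version:" field sits at four-space indent.
list_chart_versions() {
  index_file=$1
  chart=$2
  awk -v chart="$chart" '
    # A chart key such as "  rancher-webhook:" opens (or closes) the section.
    /^  [^ -][^:]*:[ ]*$/ { in_chart = ($1 == (chart ":")) }
    # Inside the wanted section, print each version with any quotes removed.
    in_chart && /^    version:[ ]/ { v = $2; gsub(/"/, "", v); print v }
  ' "$index_file"
}

# Usage, after saving the index.yaml printed by the docker command above:
#   list_chart_versions index.yaml rancher-webhook
```

If the RC being bumped to is missing from the printed list, the image was built before that RC reached the charts branch.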

Solution:

Until we find a better approach (if there is one), the automation that bumps webhook in rancher/rancher will fail if the RC is not found in rancher/charts. This should prevent most occurrences of the problem.
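The idea of such a guard can be sketched as follows. This is not the code merged in this PR, only an illustration under stated assumptions: the charts index is a Helm-style index.yaml piped in on stdin, the chart is named rancher-webhook, and require_chart_version and NEW_VERSION are hypothetical names:

```shell
#!/bin/sh
# Hypothetical guard: succeed only if VERSION of CHART is listed in the
# Helm-style index.yaml read from stdin. Assumes chart keys at two-space
# indent and "version:" fields at four-space indent.
require_chart_version() {
  chart=$1
  version=$2
  if awk -v chart="$chart" -v version="$version" '
      /^  [^ -][^:]*:[ ]*$/ { in_chart = ($1 == (chart ":")) }
      in_chart && /^    version:[ ]/ {
        v = $2; gsub(/"/, "", v)
        # Appending "" forces a string comparison, so the match is exact.
        if ((v "") == (version "")) found = 1
      }
      END { exit found ? 0 : 1 }
    '; then
    echo "$chart $version found in the charts index"
  else
    echo "error: $chart $version is not in the charts index yet; not opening the rancher/rancher PR" >&2
    return 1
  fi
}

# Usage in the bump automation (the URL is an assumption about where the
# dev-v2.11 index lives, and NEW_VERSION is a placeholder):
#   curl -fsSL https://raw.githubusercontent.com/rancher/charts/dev-v2.11/index.yaml \
#     | require_chart_version rancher-webhook "$NEW_VERSION" || exit 1
```

Failing early in the automation costs less than a rancher/rancher PR whose CI keeps timing out until someone rebuilds the images.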

@tomleb tomleb requested a review from a team as a code owner March 27, 2025 01:13
@tomleb tomleb requested review from crobby and joshmeranda March 27, 2025 01:14
@crobby crobby left a comment

lg

@tomleb tomleb merged commit a66ffbb into rancher:main Mar 27, 2025
2 checks passed
@tomleb tomleb deleted the guard-bump-rancher branch March 27, 2025 11:21