Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CI: rework pipeline: short/extended based on labels #11324

Merged
merged 5 commits into from
Jul 1, 2024

Conversation

ant31
Copy link
Contributor

@ant31 ant31 commented Jun 25, 2024

What type of PR is this?

/kind cleanup

What this PR does / why we need it:

  • The CI VMs are requesting more resources than they are using. It's filling up the servers quickly
  • Too many VMs are created per PR
  • Vagrant and Molecule jobs are using deprecated/archived docker-machine project
  1. This PR reduces the VM resource requests to help pack more VMs per server.
  2. Reduce the number of jobs that are triggered automatically on each PR.
  3. To run additional tests, it's possible to label the pr with "ci-extended", which should be equivalent to the previous pipeline
  4. To run a shorter pipeline : add the label ci-short
  5. Vagrant and Molecule jobs are running inside Kubevirt VM
  6. pre-commit runs in a single job: reduce resources usage without increasing time

Which issue(s) this PR fixes:

Fixes #8786

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

NONE

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/flake Categorizes issue or PR as related to a flaky test. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jun 25, 2024
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Jun 25, 2024
@ant31 ant31 force-pushed the reduce-vm-requests branch 4 times, most recently from acfeded to f3239e7 Compare June 25, 2024 14:16
@ant31 ant31 added the ci-extended Run additional tests label Jun 25, 2024
@ant31
Copy link
Contributor Author

ant31 commented Jun 25, 2024

/retest

@MrFreezeex
Copy link
Member

MrFreezeex commented Jun 25, 2024

Hey! This kind of approach seems quite sensible to me, but what the policy should be here? Any code change we put the extended test (meaning not a doc only change for instance)? Or is it depending on if the reviewers (as of the persons who is reviewing a PR not the role reviewer on the repo) of a PR think it needs the extended test or not?

Also the kubespray reviewers who are not approvers can't edit labels I believe or maybe that would be just me, not sure 🤷‍♂️.

@ant31
Copy link
Contributor Author

ant31 commented Jun 25, 2024

@MrFreezeex it introduces the mechanism to do it. How it is used is up for discussion and the threasold can move easily.
At least, documentation PR should not trigger massive pipeline.

Which label trigger what can be changed too.
For example, it could trigger the extended when there's a lgtm, or size/L , or area/xyz

In this PR the default pipeline is close to the previous one (removed few redundant test imo) so it's not a big change.

  • Ci-short (manual)
  • Ci-long (default)
  • Ci-extended (manual)
    As mentioned can discuss what is default what is not.

About applying labels, yes it would be the role of the reviewer to determine it. PR owner could self assess too.
For the permissions, we can try to use prow /label or an equivalent bot.

@ant31
Copy link
Contributor Author

ant31 commented Jun 26, 2024

Beside the labels, important change is that there's no more stages, but job dependencies.
It allows to streamline the job. It's not waiting the last job of stage 1 before starting the ones on stage2.

As a result, those new pipelines runs in ~25min.

https://gitlab.com/kargo-ci/kubernetes-sigs-kubespray/-/pipelines/1347530306

@ant31 ant31 added kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. and removed kind/flake Categorizes issue or PR as related to a flaky test. labels Jun 26, 2024
@ant31 ant31 requested a review from yankay June 29, 2024 09:29
@ant31
Copy link
Contributor Author

ant31 commented Jun 29, 2024

@MrFreezeex @VannTen @yankay

This is needed to use the new infrastructure, should make CI go through faster, and hopefully improve contributors' experiences.
Both infra are up and running, so we'll have to tear down the previous one soon.

The main blocking point could be the job that runs by default.
a) that can be changed easily; please comment on which jobs you want to add, or we can also do follow-up PR.
b) we can't test every combination of every network+container-engine+os. The tradeoff was to test at least once every option, but not the combination of them.

I tried to remove from the default pipeline only a few:
for example, Instead of :

  • ubuntu-calico
  • centos-calico,
  • ubuntu-flannel
  • centos-flannel

I go for:

  • ubuntu-calico
  • centos-flannel

We can still run full tests on schedule (daily) and for tags/releases, but not on every PR.

Copy link
Member

@MrFreezeex MrFreezeex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Look good to me thanks, I see you removed the code to add the extended label, which is fine to me as we could trigger it manually on gitlab at least for now

@yankay
Copy link
Member

yankay commented Jul 1, 2024

Thanks @ant31
/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ant31, MrFreezeex, yankay

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@yankay
Copy link
Member

yankay commented Jul 1, 2024

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 1, 2024
@k8s-ci-robot k8s-ci-robot merged commit a0587e0 into kubernetes-sigs:master Jul 1, 2024
55 checks passed
@tico88612
Copy link
Member

/cherrypick release-2.25

@k8s-infra-cherrypick-robot

@tico88612: #11324 failed to apply on top of branch "release-2.25":

Applying: CI: reduce VM resources requests to improve scheduling
Applying: CI: Reduce default jobs; add labels(ci-full/extended) to run more test
Using index info to reconstruct a base tree...
M	.gitlab-ci/packet.yml
Falling back to patching base and 3-way merge...
Auto-merging .gitlab-ci/packet.yml
Applying: CI: use jobs dependencies instead of stages
Using index info to reconstruct a base tree...
M	.gitlab-ci.yml
M	.gitlab-ci/lint.yml
M	.gitlab-ci/packet.yml
Falling back to patching base and 3-way merge...
Auto-merging .gitlab-ci/packet.yml
Auto-merging .gitlab-ci/lint.yml
CONFLICT (content): Merge conflict in .gitlab-ci/lint.yml
Auto-merging .gitlab-ci.yml
CONFLICT (content): Merge conflict in .gitlab-ci.yml
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0003 CI: use jobs dependencies instead of stages
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

In response to this:

/cherrypick release-2.25

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@tico88612
Copy link
Member

/cherrypick release-2.25

@k8s-infra-cherrypick-robot

@tico88612: #11324 failed to apply on top of branch "release-2.25":

Applying: CI: reduce VM resources requests to improve scheduling
Applying: CI: Reduce default jobs; add labels(ci-full/extended) to run more test
Applying: CI: use jobs dependencies instead of stages
Using index info to reconstruct a base tree...
M	.gitlab-ci.yml
Falling back to patching base and 3-way merge...
Auto-merging .gitlab-ci.yml
CONFLICT (content): Merge conflict in .gitlab-ci.yml
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0003 CI: use jobs dependencies instead of stages
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

In response to this:

/cherrypick release-2.25

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@tico88612
Copy link
Member

/cherrypick release-2.25

@k8s-infra-cherrypick-robot

@tico88612: new pull request created: #11424

In response to this:

/cherrypick release-2.25

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. ci-extended Run additional tests cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note-none Denotes a PR that doesn't merit a release note. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

packet_ubuntu20-calico-aio is failing on the task "packet-ci : Wait for vms to have ipaddress assigned"
6 participants