Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor Jenkins CI #1240

Merged
merged 1 commit into from
Sep 17, 2020
Merged

Refactor Jenkins CI #1240

merged 1 commit into from
Sep 17, 2020

Conversation

lzhecheng
Copy link
Contributor

@lzhecheng lzhecheng commented Sep 14, 2020

  • Move scripts in macros.yaml to separate shell files in order to ensure maintainability.
  • run-k8s-e2e-test.sh: Move KUBE_CONFORMANCE_IMAGE_VERSION initialization before _usage. Otherwise, this var is unset.
  • Set kube conformance image version from v1.18.0-beta.1 to v1.18.5.
  • sonobuoy: v0.17.2 -> v0.18.5

Fixed: #768

@antrea-bot
Copy link
Collaborator

Thanks for your PR.
Unit tests and code linters are run automatically every time the PR is updated.
E2e, conformance and network policy tests can only be triggered by a member of the vmware-tanzu organization. Regular contributors to the project should join the org.

The following commands are available:

  • /test-e2e: to trigger e2e tests.
  • /skip-e2e: to skip e2e tests.
  • /test-conformance: to trigger conformance tests.
  • /skip-conformance: to skip conformance tests.
  • /test-whole-conformance: to trigger all conformance tests on linux.
  • /skip-whole-conformance: to skip all conformance tests on linux.
  • /test-networkpolicy: to trigger networkpolicy tests.
  • /skip-networkpolicy: to skip networkpolicy tests.
  • /test-windows-conformance: to trigger windows conformance tests.
  • /skip-windows-conformance: to skip windows conformance tests.
  • /test-windows-networkpolicy: to trigger windows networkpolicy tests.
  • /skip-windows-networkpolicy: to skip windows networkpolicy tests.
  • /test-hw-offload: to trigger ovs hardware offload test.
  • /skip-hw-offload: to skip ovs hardware offload test.
  • /test-all: to trigger all tests (except whole conformance).
  • /skip-all: to skip all tests (except whole conformance).

@lzhecheng lzhecheng changed the title Refactor Jenkins CI [WIP] Refactor Jenkins CI Sep 14, 2020
@codecov-commenter
Copy link

codecov-commenter commented Sep 14, 2020

Codecov Report

Merging #1240 into master will increase coverage by 0.21%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1240      +/-   ##
==========================================
+ Coverage   56.06%   56.28%   +0.21%     
==========================================
  Files         108      109       +1     
  Lines       11887    12037     +150     
==========================================
+ Hits         6665     6775     +110     
- Misses       4634     4673      +39     
- Partials      588      589       +1     
Flag Coverage Δ
#integration-tests 46.05% <ø> (-0.28%) ⬇️
#unit-tests 42.55% <ø> (+0.31%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pkg/agent/types/networkpolicy.go 16.66% <0.00%> (-8.34%) ⬇️
pkg/agent/openflow/network_policy.go 73.61% <0.00%> (-0.84%) ⬇️
pkg/agent/openflow/client.go 35.60% <0.00%> (ø)
pkg/agent/util/sysctl/sysctl_linux.go 0.00% <0.00%> (ø)
pkg/agent/proxy/types/types.go 0.00% <0.00%> (ø)
...ntroller/networkpolicy/networkpolicy_controller.go 73.28% <0.00%> (+0.24%) ⬆️
pkg/agent/openflow/pipeline.go 57.11% <0.00%> (+0.31%) ⬆️
pkg/ovs/openflow/ofctrl_bridge.go 72.24% <0.00%> (+0.66%) ⬆️
pkg/apiserver/storage/ram/store.go 81.69% <0.00%> (+1.30%) ⬆️
pkg/ovs/openflow/ofctrl_action.go 83.79% <0.00%> (+1.58%) ⬆️
... and 2 more

@lzhecheng lzhecheng force-pushed the ci-refactor branch 4 times, most recently from 828f255 to 56585f9 Compare September 14, 2020 08:41
@lzhecheng lzhecheng changed the title [WIP] Refactor Jenkins CI Refactor Jenkins CI Sep 14, 2020
@antoninbas
Copy link
Contributor

@lzhecheng you can reference this issue in this PR: #768. So that the issue gets closed when you merge this.

Copy link
Contributor

@antoninbas antoninbas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking care of this task. It's been needed for while IMO.

RUN_NETWORK_POLICY=false
RUN_E2E_FOCUS=""
KUBECONFIG_OPTION=""
E2E_CONFORMANCE_SKIP="\[Slow\]|\[Serial\]|\[Disruptive\]|\[Flaky\]|\[Feature:.+\]|\[sig-cli\]|\[sig-storage\]|\[sig-auth\]|\[sig-api-machinery\]|\[sig-apps\]|\[sig-node\]"
MODE="report"
THIS_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"
KUBE_CONFORMANCE_IMAGE_VERSION="$(head -n1 $THIS_DIR/k8s-conformance-image-version)"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to update the version included in k8s-conformance-image-version. We haven't done it in a while and it's still pointing at a beta version.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's change it to auto, which is the default value. I remember we set it to a fixed version because of some bug/issue before.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense, thanks

fi
echo "=== FAILURE !!! ==="
exit 1
./ci/test-vmc.sh --testcase integration
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what will be the cluster name in this case?

Copy link
Contributor Author

@lzhecheng lzhecheng Sep 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Integration test doesn't need one because it runs on a fixed VM. The VM's name is "antrea-integration-0".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh that's right there is no K8s cluster for integration tests. But I though we created a new VM every time? Is it the case or do we use the same VM always?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same VM gets reverted to a snapshot everytime.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's right, thanks for the reminder

ci/test-vmc.sh Outdated
;;
--work-home)
WORK_HOME="$2"
KUBECONFIG_PATH=$WORK_HOME/.kube/config
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't recommend doing this here. If the user provides both --kubeconfig and --work-home, the behavior may be unexpected based on the order of these options.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed. I will remove this line here. And after this block of code, new code will check if WORK_HOME is changed and KUBECONFIG_PATH is not, then KUBECONFIG_PATH should be $WORK_HOME/.kube/config

ci/test-vmc.sh Outdated
Comment on lines 39 to 50
Setup a VMC cluster to run K8s e2e community tests (E2e, Conformance, whole Conformance & Network Policy).

--cluster-name The cluster name to be used for the generated VMC cluster. Must be specified if not run in Jenkins environment.
--kubeconfig Path to save kubeconfig of generated VMC cluster.
--work-home Home path for Go, vSphere information and antrea_logs during cluster setup. Default is $WORK_HOME.
--log-mode Use the flag to set either 'report', 'detail', or 'dump' level data for sonobouy results.
--testcase The testcase to run: e2e, conformance, whole-conformance or networkpolicy.
--garbage-collection Do garbage collection to clean up some useless testbeds
--setup-only Only perform setting up the cluster and run test.
--cleanup-only Only perform cleaning up the cluster."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think with the script as it is, it is going to be really impractical (impossible?) to use it outside of a Jenkins environment. I think this is fine to restrict this specific script to Jenkins, in which case you should:

  • remove "Must be specified if not run in Jenkins environment."
  • move the script from ci/ to ci/jenkins/

If you want to run the script outside of a Jenkins environment, more changes are required IMO: remove occurrences of jenkins in the script, set the Go environment variables (e.g. GOPATH, GOCACHE, ...) from Jenkins and not from the script, revisit WORK_HOME (rename it to WORKDIR, use it more sparingly), ability for the user to provide CLUSTERDEFAULTS, ...

I'll let you decide which route you want to go, but probably it's much easier to go with the first option. Renaming WORK_HOME to WORKDIR may be a good idea regardless, unless WORK_HOME is commonly used in the Jenkins ecosystem.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. Let's go with the first option. Thanks.

ci/test-vmc.sh Outdated
--work-home Home path for Go, vSphere information and antrea_logs during cluster setup. Default is $WORK_HOME.
--log-mode Use the flag to set either 'report', 'detail', or 'dump' level data for sonobouy results.
--testcase The testcase to run: e2e, conformance, whole-conformance or networkpolicy.
--garbage-collection Do garbage collection to clean up some useless testbeds
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/useless/unused

otherwise it's mean to the testbeds :P

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😆 Sorry, poor testbeds. You guys really did a good job.

ci/test-vmc.sh Outdated
set -e

if [[ "$CLUSTER_READY" == false ]]; then
echo "=== Failed to make all nodes up ==="
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Failed to bring up all the nodes"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

ci/test-vmc.sh Outdated
exit 1
else
export KUBECONFIG="${GIT_CHECKOUT_DIR}/jenkins/out/kubeconfig"
echo "=== Waiting all nodes up for 10 min ==="
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Waiting for 10 minutes for all nodes to be up"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

@lzhecheng
Copy link
Contributor Author

/skip-hw-offload

@lzhecheng lzhecheng force-pushed the ci-refactor branch 4 times, most recently from 3ed4843 to c3ead98 Compare September 15, 2020 06:47
Copy link
Contributor

@antoninbas antoninbas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, only a couple of nits

really good job unifying the test cases into one script


--cluster-name The cluster name to be used for the generated VMC cluster.
--kubeconfig Path to save kubeconfig of generated VMC cluster.
--work-home Home path for Go, vSphere information and antrea_logs during cluster setup. Default is $WORKDIR.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should --work-hom be changed to --workdir?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for reminding me. Yes, it should be changed.

kubectl get namespace -l antrea-ci | awk '$3 ~ "[0-9][hd]" || $3 ~ "[6-9][0-9]m" || $3 ~ "1[0-9][0-9]m" && $2 ~ "Active" {print $1}' | while read cluster; do
echo "=== Currently ${cluster} has been live for more than 1h ==="
kubectl delete ns ${cluster}
echo "=== Old namespace ${cluster} is deleted !!! ==="
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/is deleted/has been deleted

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

RUN_NETWORK_POLICY=false
RUN_E2E_FOCUS=""
KUBECONFIG_OPTION=""
E2E_CONFORMANCE_SKIP="\[Slow\]|\[Serial\]|\[Disruptive\]|\[Flaky\]|\[Feature:.+\]|\[sig-cli\]|\[sig-storage\]|\[sig-auth\]|\[sig-api-machinery\]|\[sig-apps\]|\[sig-node\]"
MODE="report"
THIS_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"
KUBE_CONFORMANCE_IMAGE_VERSION="$(head -n1 $THIS_DIR/k8s-conformance-image-version)"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense, thanks

@antoninbas
Copy link
Contributor

@lzhecheng please also mention in the commit message and PR description that we are updating the kube conformance image version (from v1.18.0-beta.1 to auto).

antoninbas
antoninbas previously approved these changes Sep 16, 2020
Copy link
Contributor

@antoninbas antoninbas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

* Move scripts in macros.yaml to separate shell files in order to
ensure maintainability.
* run-k8s-e2e-test.sh: Move KUBE_CONFORMANCE_IMAGE_VERSION initialization before _usage.
Otherwise, this var is unset.
* k8s-conformance-image-version: v1.18.0-beta.1 -> v1.18.5
* sonobuy version: v0.17.2 -> v0.18.5
@lzhecheng
Copy link
Contributor Author

@antoninbas , I verified on Jenkins today. "auto" as k8s-conformance-image won't work for networkpolicy testcases. It failed at "[sig-network] NetworkPolicy [LinuxOnly] NetworkPolicy between server and client [It] should allow ingress access from updated pod [Feature:NetworkPolicy]". I tried on a private testbed and it worked.
"auto" means to be set to cluster's version. So if user's cluster is at an old version, with sonobuoy, flaws in upstream code could lead to testcase failure. I guess failure is related with testbed setup.
Then I set k8s-conformance-image to v1.18.5 and sonobuoy to v0.18.5 and it works for CI. I could continue investigating the failure.

@antoninbas
Copy link
Contributor

@lzhecheng Thinking about it some more, it's probably better to have a static version there anyway, so we know exactly which versions of the tests we are running. We just have to remember to update it regularly.

@lzhecheng
Copy link
Contributor Author

@lzhecheng Thinking about it some more, it's probably better to have a static version there anyway, so we know exactly which versions of the tests we are running. We just have to remember to update it regularly.

Makes sense. Thanks!.

/skip-all

@lzhecheng lzhecheng merged commit 6160ff5 into antrea-io:master Sep 17, 2020
@lzhecheng lzhecheng deleted the ci-refactor branch September 17, 2020 02:44
@lzhecheng
Copy link
Contributor Author

VMC Jenkins VMs are at kubernetes v1.17.3, so the kube-conformance-image-version: "auto" leads to old conformance image.

GraysonWu pushed a commit to GraysonWu/antrea that referenced this pull request Sep 22, 2020
* Move scripts in macros.yaml to separate shell files in order to
ensure maintainability.
* run-k8s-e2e-test.sh: Move KUBE_CONFORMANCE_IMAGE_VERSION initialization before _usage.
Otherwise, this var is unset.
* k8s-conformance-image-version: v1.18.0-beta.1 -> v1.18.5
* sonobuy version: v0.17.2 -> v0.18.5
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Move as much logic as possible out of Jenkins job YAML specification files
5 participants