Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

remove podman cni config when custom cni is used #10384

Closed

Conversation

michaelhenkel
Copy link
Contributor

@michaelhenkel michaelhenkel commented Feb 6, 2021

When a custom CNI is defined via --cni flag it might take a while to populate the CNIs configuration into /etc/cni/net.d.
Minikube comes up and in case of crio or containerd runtime it has a CNI configuration in /etc/cni/net,d already.
Now if pods using pod network (such as coredns) come up BEFORE the custom CNIs config file is populated, this pod ip address is served by podman. Subsequent pods will be served by the custom CNI but won't be able to communicate with the pods using podman CNI.
This PR removes all CNI configs from /etc/cni/net.d just before the custom CNI manifest is applied.

@k8s-ci-robot
Copy link
Contributor

Hi @michaelhenkel. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Feb 6, 2021
@k8s-ci-robot k8s-ci-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Feb 6, 2021
@michaelhenkel michaelhenkel changed the title No podman cni config remove podman cni config when custom cni is used Feb 6, 2021
@minikube-bot
Copy link
Collaborator

Can one of the admins verify this patch?

@fejta-bot
Copy link

Unknown CLA label state. Rechecking for CLA labels.

Send feedback to sig-contributor-experience at kubernetes/community.
/check-cla

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Feb 6, 2021
@krezovic
Copy link

krezovic commented Feb 9, 2021

There is a chicken and egg problem with calico and containerd when this PR is merged. containerd requires valid CNI config to start up, and calico will use init container to install its cni config file and cni executable.

The only way I could workaround this is start with calico cni, wait until it's fully up by using "kubectl wait", delete the non-calico config file from /etc/cni/net.d and restart kube-proxy and coredns.

Not sure if other components need restart, as they use host IPs and are static pods anyway.

@michaelhenkel
Copy link
Contributor Author

michaelhenkel commented Feb 10, 2021 via email

Copy link
Collaborator

@afbjorklund afbjorklund left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it trying to use the wrong CNI, and starting with podman instead of the desired one ?

It would be better if it would just fail for a while, until the correct network is available.

pkg/minikube/cni/custom.go Outdated Show resolved Hide resolved
@prezha
Copy link
Contributor

prezha commented Mar 10, 2021

as stated here (step no. 8.):

Please note that although the DNS server is deployed, it will not be scheduled until CNI is installed.

also, here are the steps used to initialise the control plane - maybe the issue could be in this part specifically:

2. Choose a Pod network add-on, and verify whether it requires any arguments to be passed to kubeadm init.
**Depending on which third-party provider you choose,
  you might need to set the --pod-network-cidr to a provider-specific value.**
See Installing a Pod network add-on.

@michaelhenkel
Copy link
Contributor Author

as stated here (step no. 8.):

Please note that although the DNS server is deployed, it will not be scheduled until CNI is installed.

also, here are the steps used to initialise the control plane - maybe the issue could be in this part specifically:

2. Choose a Pod network add-on, and verify whether it requires any arguments to be passed to kubeadm init.
**Depending on which third-party provider you choose,
  you might need to set the --pod-network-cidr to a provider-specific value.**
See Installing a Pod network add-on.

form Kubelet point of view the CNI is "installed" as soon as there is a cni config in /etc/cni/net.d. Now if you specify --cni parameter in minikube, initially the only cni config file in that directory is the podman cni config. The third party cni file will eventually land there but in between the coredns pod comes up and kubelet calls the podman CNI and the coredns pod gets an ip address from podman and not the third party cni.
As I showed this behavior applies to all CNIs which are installed via the --cni parameter and cannot get their cni config copied to /etc/cni/net.d BEFORE the coredns Pod is instantiated.

@michaelhenkel
Copy link
Contributor Author

any update?

@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Mar 17, 2021
@michaelhenkel
Copy link
Contributor Author

any update?

Copy link
Contributor

@tstromberg tstromberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for doing this! The actual change to CNI looks good. I've added some minor quibbles, but think that overall that this can be merged quickly.

One weird thing has happened to this PR though: it seems to include all sorts of unrelated commits from master:

Screen Shot 2021-03-24 at 6 34 39 AM

Typically these show as a single merge-to-head commit, but in this PR, they are showing off as individual commits. I'm not sure if it's easier for you to remove the unrelated commits or start a new PR, but here is the workflow I use for merging against head after I've sent a PR in:

  git remote add upstream https://github.com/kubernetes/minikube
  git fetch upstream; \
    and git checkout master; \
    and git merge upstream/master; \
    and git checkout $branch; \
    and git merge master

I do this often enough that I've made it into a shell function.

Thank you again for contributing this improvement!

pkg/minikube/cni/custom.go Outdated Show resolved Hide resolved
pkg/minikube/cni/custom.go Outdated Show resolved Hide resolved
pkg/minikube/cni/custom.go Show resolved Hide resolved
@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Mar 24, 2021
@michaelhenkel
Copy link
Contributor Author

@tstromberg thanks for the thorough review! I think I addressed all comments. Can you please take a look?

@sharifelgamal
Copy link
Collaborator

/ok-to-test

@k8s-ci-robot k8s-ci-robot added the ok-to-test Indicates a non-member PR verified by an org member that is safe to test. label Mar 30, 2021
@k8s-ci-robot k8s-ci-robot removed the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Mar 30, 2021
@minikube-pr-bot
Copy link

kvm2 Driver
error collecting results for kvm2 driver: timing run 0 with minikube: timing cmd: [/home/performance-monitor/minikube/out/minikube start --driver=kvm2]: waiting for minikube: exit status 69
docker Driver
error collecting results for docker driver: timing run 0 with Minikube (PR 10384): timing cmd: [/home/performance-monitor/.minikube/minikube-binaries/10384/minikube start --driver=docker]: starting cmd: fork/exec /home/performance-monitor/.minikube/minikube-binaries/10384/minikube: exec format error

@k8s-ci-robot k8s-ci-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Apr 13, 2021
@k8s-ci-robot k8s-ci-robot added size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Apr 13, 2021
@michaelhenkel michaelhenkel reopened this Apr 15, 2021
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 15, 2021
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: michaelhenkel
To complete the pull request process, please assign medyagh after the PR has been reviewed.
You can assign the PR to them by writing /assign @medyagh in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Apr 15, 2021
@minikube-pr-bot
Copy link

kvm2 driver with docker runtime

+----------------+----------+---------------------+
|    COMMAND     | MINIKUBE | MINIKUBE (PR 10384) |
+----------------+----------+---------------------+
| minikube start | 55.3s    | 55.4s               |
| enable ingress | 39.1s    | 38.9s               |
+----------------+----------+---------------------+

Times for minikube start: 57.4s 56.2s 51.7s 57.2s 54.1s
Times for minikube (PR 10384) start: 51.7s 57.0s 57.7s 57.0s 53.6s

Times for minikube ingress: 41.9s 36.9s 36.9s 44.4s 35.4s
Times for minikube (PR 10384) ingress: 43.0s 36.0s 35.4s 42.5s 37.5s

docker driver with docker runtime

+----------------+----------+---------------------+
|    COMMAND     | MINIKUBE | MINIKUBE (PR 10384) |
+----------------+----------+---------------------+
| minikube start | 23.3s    | 23.0s               |
| enable ingress | 34.4s    | 36.2s               |
+----------------+----------+---------------------+

Times for minikube start: 26.7s 22.2s 22.2s 22.5s 23.1s
Times for minikube (PR 10384) start: 22.7s 23.4s 22.8s 22.4s 23.7s

Times for minikube ingress: 32.5s 34.6s 34.1s 37.0s 33.6s
Times for minikube (PR 10384) ingress: 38.1s 37.6s 37.5s 35.1s 32.5s

docker driver with containerd runtime
error collecting results for docker driver: timing run 0 with minikube: timing cmd: [out/minikube addons enable ingress]: waiting for minikube: exit status 10

@k8s-ci-robot
Copy link
Contributor

@michaelhenkel: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 30, 2021
@medyagh
Copy link
Member

medyagh commented May 3, 2021

this might not be needed anymore after @prezha PRs

@medyagh medyagh closed this May 3, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.