kubeadm: add notes about network setup to the "create cluster" doc #43872

neolit123
Member

@neolit123 neolit123 commented Nov 10, 2023

This is a top 3 question on support forums.
"How to pass custom IP to kubeadm?"

At one point there was a blog post on how to do this step-by-step, but we rejected that post because it contained too many mistakes and did not follow recommended practices in general.

In this PR, we add some detail on how to do it and tag the non-default-route-IP way as not-recommended with a warning.

It is not considered a foot gun for users, per se, but the setup is awkward, prone to mistakes and not easy to maintain. If k8s supported something like a global /etc/kubernetes/DEFAULT_IP config, it would be much more manageable.

fixes #43862
^ NOTE: marked as "fixes", because "create a cluster with kubeadm" is the main landing page for kubeadm users. from there they can choose where to go next - "init", "join", "HA" - but they must read how to set up a host on the "create a cluster" page. otherwise they can hit a number of issues such as swap, OOM, etc.

NOTE: this is correctly targeting the "main" branch as documenting existing functionality.

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. language/en Issues or PRs related to English language sig/docs Categorizes an issue or PR as relevant to SIG Docs. labels Nov 10, 2023
@neolit123
Member Author

/sig cluster-lifecycle
/cc @pacoxu @SataQiu
/assign @sftim

@k8s-ci-robot k8s-ci-robot added the sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. label Nov 10, 2023

netlify bot commented Nov 10, 2023

Pull request preview available for checking

Built without sensitive environment variables

Name Link
🔨 Latest commit 34f93dd
🔍 Latest deploy log https://app.netlify.com/sites/kubernetes-io-main-staging/deploys/6555faadd111330008fb0e87
😎 Deploy Preview https://deploy-preview-43872--kubernetes-io-main-staging.netlify.app

Contributor

@sftim sftim left a comment

Overall I like this.

I do suggest some tweaks.

@neolit123 neolit123 force-pushed the 1.29-add-notes-about-network-setup-to-create-doc branch 2 times, most recently from 5d99808 to 1e2aeca Compare November 10, 2023 16:29
@neolit123 neolit123 force-pushed the 1.29-add-notes-about-network-setup-to-create-doc branch from 1e2aeca to f43da4e Compare November 10, 2023 16:48
Comment on lines 136 to 137
Kubernetes components will try to use the one that has a global unicast IP address
(non-loopback, non-link local, non-point2point).
Contributor

We might want to document what happens when there are two of these (less common, but actually again this could apply to my laptop: I have wired ethernet and wireless and both interfaces have global unicast IPv6 addresses).

Member Author

@neolit123 neolit123 Nov 10, 2023

the logic is quite complicated and a bit confusing - i.e. there are two methods.
IIRC, the second method surfaced when IPv6 support was added.

1

if /proc/net/route does not exist it will enumerate interfaces using Go's net.Interfaces(), which is OS portable. on Linux the enumeration is done with a set of POSIX syscalls, and the order should be the same as in the ip route output. but i don't want to document such a thing, as the implementation can vary between kernels, CLI tools, etc. there is no guarantee of deterministic ordering.

it would then do this:

// chooseIPFromHostInterfaces looks at all system interfaces, trying to find one that is up that
// has a global unicast address (non-loopback, non-link local, non-point2point), and returns the IP.
// addressFamilies determines whether it prefers IPv4 or IPv6

https://github.com/kubernetes/apimachinery/blob/12dc3f82eb47fc972b5be0b222f4422cb1b51612/pkg/util/net/interface.go#L309C1-L312

here we can say that the first IP on an interface that satisfies the above criteria is used.

2

if /proc/net/route exists, both /proc/net/route and /proc/net/ipv6_route are parsed, keeping only the default routes.
if the preferred IP family is v6, the v6 routes take precedence.

this is more deterministic. routes are appended in that same order.

then the IP is determined slightly differently, as there is a fallback to global IP on loopback:

// chooseHostInterfaceFromRoute cycles through each default route provided, looking for a
// global IP address from the interface for the route. If there are routes but no global
// address is obtained from the interfaces, it checks if the loopback interface has a global address.
// addressFamilies determines whether it prefers IPv4 or IPv6

https://github.com/kubernetes/apimachinery/blob/12dc3f82eb47fc972b5be0b222f4422cb1b51612/pkg/util/net/interface.go#L426-L430

@aojea can surely provide more context on this, but IMO what we can document is the following, trying to find a balance between somewhat descriptive and vague (to not enter rabbit holes):

If two or more default gateways are present on the host,
a Kubernetes component will try to use the first one it encounters that has a suitable
global unicast IP address. While making this choice, the exact ordering of gateways
might vary between different operating systems and kernel versions.

Member

@SataQiu SataQiu left a comment

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 13, 2023
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 4a7e4099c44ec364bc67c03552f6d7f47dffc33d

Comment on lines +134 to +138
without passing a custom IP address to a Kubernetes component, the component
will exit with an error. If two or more default gateways are present on the host,
Member

> the component will exit with an error

This is not always true.
For kubelet, the detection order is:

// 1) Use nodeIP if set (and not "0.0.0.0"/"::")
// 2) If the user has specified an IP to HostnameOverride, use it
// 3) Lookup the IP from node name by DNS
// 4) Try to get the IP from the network interface used as default gateway
//
// For steps 3 and 4, IPv4 addresses are preferred to IPv6 addresses
// unless nodeIP is "::", in which case it is reversed.

Although there is no default route, it does not exit with -1 when DNS resolution is available.
But overall, we recommend setting a default route for hosts.

Member Author

@neolit123 neolit123 Nov 13, 2023

i'm a bit confused reading the kubelet code. in NodeAddress there is this comment that you mentioned and i remember it from before, but is it actually up to date?

i see a fallback to ResolveBindAddress from the apimachinery interface/IP detection code, which is what kube-proxy, apiserver, kubeadm, etc are doing:
https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/nodestatus/setters.go#L197-L225

which ends up as:
https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/apimachinery/pkg/util/net/interface.go#L426-L430
(as mentioned in the above comments)
#43872 (comment)

> Although there is no default route, it does not exit with -1 when DNS resolution is available.
> But overall, we recommend setting a default route for hosts.

i can see this making sense in a cloud provider env where the cloud is responsible for assigning the kubelet IPs.
looks like there is this:
https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/nodestatus/setters.go#L125-L142

but isn't this the error that is returned if no IP can be detected?
https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/nodestatus/setters.go#L227-L230

Member

If there is no default route on the node, IIRC, the kubeadm command will exit unless an IP address is set in the configuration. (Not tested.)

@neolit123
Member Author

looking for more reviews / LGTM / approve on this one.


For kubelets on all nodes, the `--node-ip` option can be passed in
`.nodeRegistration.kubeletExtraArgs` inside a kubeadm configuration file
(`InitConfiguration` or `JoinConfiguration`).
Member

A dual-stack example is provided in https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/dual-stack-support/. Could we add the dual stack example link here?

Member Author

added link to the page under the kubelet --node-ip example.

@neolit123 neolit123 force-pushed the 1.29-add-notes-about-network-setup-to-create-doc branch from 0f84cd0 to 34f93dd Compare November 16, 2023 11:19
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 16, 2023
Member

@pacoxu pacoxu left a comment

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 16, 2023
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 3eea437188dd0054eb6c309e22855780c8d85650

@neolit123
Member Author

@sftim this has received 2 LGTM's from SIG CL tech reviewers.

Contributor

@sftim sftim left a comment

This change leaves the website better than it was before.

/approve

[Dual-stack support with kubeadm](/docs/setup/production-environment/tools/kubeadm/dual-stack-support).

{{< note >}}
IP addresses become part of certificates SAN fields. Changing these IP addresses would require
Contributor

Suggested change
- IP addresses become part of certificates SAN fields. Changing these IP addresses would require
+ The IP addresses that you assign to control plane components become part of their X.509 certificates'
+ subject alternative name fields. Changing these IP addresses would require

IP addresses become part of certificates SAN fields. Changing these IP addresses would require
signing new certificates and restarting the affected components, so that the change in
certificate files is reflected. See
[Manual certificate renewal](/docs/tasks/administer-cluster/kubeadm/kubeadm-certs/#manual-certificate-renewal)
Contributor

OK, but this is actually replacement rather than renewal.

Member Author

since it's not possible to change a certificate in place, the process is always a replacement, although it's documented as renewal.

Contributor

A renewal typically means that the new CSR and old CSR only differ by their validity period, and the motivation for renewal is expiry (ideally “imminent” rather than “recent”).
Different issuers / trust anchors have their own specific policies around whether it's a renewal or not.

I mention this because readers might think that renewal isn't the process they want.

To find out what this IP is on a Linux host you can use:

```shell
ip route show # Look for a line starting with "default via"
```
Contributor

What if there's more than one default route? Readers might not be sure what to do.

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: pacoxu, sftim


@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 22, 2023
@k8s-ci-robot k8s-ci-robot merged commit 91dbcc5 into kubernetes:main Nov 22, 2023
4 checks passed
@neolit123
Member Author

@sftim looks like you left actionable comments, but this merged.
i will send a follow up where you can LGTM / approve.

@sftim
Contributor

sftim commented Nov 22, 2023

I was happy to see this change land as-is; further, iterative improvement is welcome.

@neolit123
Member Author

> I was happy to see this change land as-is; further, iterative improvement is welcome.

follow up is here:
#44038

Successfully merging this pull request may close these issues.

Highlight the significance of apiserver-advertise-address in kubeadm join for HA cluster setup
5 participants