📖 add ipam integration proposal #6000
Conversation
Skipping CI for Draft Pull Request.
```go
// IPAddressClaimStatus contains the status of an IPAddressClaim
type IPAddressClaimStatus struct {
	// Address is a reference to the address that was allocated for this claim.
	Address corev1.LocalObjectReference `json:"address,omitempty"` // field type assumed; the diff excerpt is truncated here
}
```
Do we foresee the need for conditions here, to coordinate with the provider and signal the lifecycle of the claim?
Good call, like IP exhaustion. Maybe here, or maybe in IPAddressStatus.
Yes. I thought about that a bit, but didn't get around to specifying them yet. There needs to be some way to signal errors, like exhausted pools for example.
To start with, we do not have to delve into extreme detail on the specific conditions necessary for this. Just adding a `Conditions` array to the IPClaim status might be all that is necessary.
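A minimal sketch of what that could look like, reusing Cluster API's existing conditions type (the exact field layout is an assumption for illustration, not part of the proposal text):

```go
package v1alpha1

import (
	corev1 "k8s.io/api/core/v1"
	clusterv1 "sigs.k8s.io/cluster-api/api/v1beta1"
)

// IPAddressClaimStatus contains the status of an IPAddressClaim.
type IPAddressClaimStatus struct {
	// Address is a reference to the address that was allocated for this claim.
	Address corev1.LocalObjectReference `json:"address,omitempty"`

	// Conditions summarises the current state of the claim, e.g. to surface
	// provider errors such as an exhausted pool.
	Conditions clusterv1.Conditions `json:"conditions,omitempty"`
}
```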
I like that idea. Ultimately the conditions could also be provider-specific. It might make sense to have a `Summary` or `Phase` field that provides a set of values that is identical between all providers, and then have additional details in the conditions, e.g. `Pending`, `Bound`, `Error`.
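A hedged sketch of such a provider-agnostic field as a simple string enum; the three values come from the examples above, everything else is illustrative:

```go
// IPAddressClaimPhase is a coarse-grained summary of a claim's lifecycle that
// is identical across all providers; provider-specific detail stays in the
// conditions.
type IPAddressClaimPhase string

const (
	// IPAddressClaimPhasePending means no address has been allocated yet.
	IPAddressClaimPhasePending IPAddressClaimPhase = "Pending"
	// IPAddressClaimPhaseBound means an IPAddress was allocated and bound to the claim.
	IPAddressClaimPhaseBound IPAddressClaimPhase = "Bound"
	// IPAddressClaimPhaseError means allocation failed, e.g. because the pool is exhausted.
	IPAddressClaimPhaseError IPAddressClaimPhase = "Error"
)
```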
```go
// IPAddressClaimSpec describes the desired state of an IPAddressClaim
type IPAddressClaimSpec struct {
	// Pool is a reference to the pool from which an IP address should be allocated.
	Pool corev1.LocalObjectReference `json:"pool"` // field type assumed; the diff excerpt is truncated here
}
```
Given an IP Pool is an arbitrary API of choice for the IPAM providers, wouldn't it be possible for multiple IPAM providers to coexist and have pools with the same name? I.e. wouldn't this need to be a TypedLocalObjectReference?
Good point, this is not really clear here. From the work on moving away from v1.ObjectReference I have some opinions on those reference types; this one would actually consist of Kind, Name and APIVersion. Not sure if that makes sense.
I planned to open a PR that adds those types to the current v1beta1 API so they can be used for further additions to that version, and to hopefully migrate to them with v1beta2 (even though it's arguably a breaking change). I think with the types already being there and already discussed, it would be a lot easier to migrate, since the discussion doesn't have to be part of the migration anymore.
+1 to what @enxebre pointed out; it would be necessary to include the Kind, APIVersion and Name of the IPPool, since the `IPPool` implementation is tied to the IPAM providers.
I've changed it to Group, Kind and Name. Version is explicitly excluded, as it actually does not make sense to have it: the client should decide on its own which version it wants to consume.
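A minimal sketch of a reference type with exactly those three fields (the type name and JSON tags are assumptions for illustration):

```go
// IPPoolReference identifies a pool by group, kind and name. The API version
// is deliberately omitted: the client decides on its own which version of the
// referenced resource it wants to consume.
type IPPoolReference struct {
	// Group is the API group of the referenced pool, e.g. "ipam.cluster.x-k8s.io".
	Group string `json:"group"`
	// Kind is the kind of the referenced pool, e.g. "IPPool".
	Kind string `json:"kind"`
	// Name is the name of the referenced pool object.
	Name string `json:"name"`
}
```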
Is this proposal sharing any concepts or implementation details with the podCIDRs for nodes proposal https://github.com/kubernetes/enhancements/tree/master/keps/sig-network/2593-multiple-cluster-cidrs?
I'm generally +1 on this proposal, because it re-uses a pattern that already exists in the ecosystem and it addresses a problem that is getting more and more relevant.
- Only the custom resource definitions for IPAddress and IPAddressClaim, and the relevant webhooks, should become part of the main cluster-api project
- An example provider that provides in-cluster IP address management (also useful for testing, or when no other IPAM solution is available) should be implemented as a separate project in a separate repository.
- A shared library for providers and consumers should be implemented as part of that extra project.
Let's add a few notes on what the API library is expected to offer
I will once I figure that out exactly 😄 I'm working on a reference in-cluster implementation, which might help with that. Not sure if it's necessary/helpful to try to design this beforehand, or if it's easier to build it when creating the first few implementations.
Makes sense; if possible we should document at a high level the capabilities we expect this library to offer, otherwise let's just add a note that details will be added in a follow-up iteration.
```yaml
network:
  devices:
  - dhcp4: false
    fromPool: # reference to the pool
      group: ipam.cluster.x-k8s.io/v1alpha1
      kind: IPPool
      name: testpool
```
I know there was already some back and forth, but given that this is getting relevant for a growing number of providers, I'm wondering if we should move this to the Machine level (we can also get there incrementally, but why not now?)
@vincepri @CecileRobertMichon @enxebre @sbueringer @yastij opinions?
Not opposed, though based on the journey that took us from v1alpha1 to v1alpha2, where we figured out the new abstractions as CRDs and the refs (and we are still evolving conditions), I think it'd be good to see some real usage to gain some knowledge, see a few IPAM implementations, and see how adoption by the providers goes before we figure out the best way to plumb it into the core API.
That would certainly make this proposal easier. Claims could be created by CAPI. Providers would still be required to fetch IPAddresses, though, if they are used.
The hard part is probably figuring out all the available parameters that can be set at the Machine level, and how to cleanly add additional parameters on the provider side.
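For illustration, the provider-side lookup could be as small as the following controller-runtime snippet. The import path and the `Status.Address` field are assumptions based on the draft types above, not a final API:

```go
package provider

import (
	"context"
	"fmt"

	"sigs.k8s.io/controller-runtime/pkg/client"

	// Hypothetical import path for the claim/address types drafted in this proposal.
	ipamv1 "sigs.k8s.io/cluster-api/exp/ipam/api/v1alpha1"
)

// resolveClaimedAddress follows a bound IPAddressClaim to the IPAddress that
// was allocated for it, and errors while the claim is still pending.
func resolveClaimedAddress(ctx context.Context, c client.Client, claim *ipamv1.IPAddressClaim) (*ipamv1.IPAddress, error) {
	if claim.Status.Address.Name == "" {
		return nil, fmt.Errorf("claim %s/%s is not bound yet", claim.Namespace, claim.Name)
	}
	addr := &ipamv1.IPAddress{}
	key := client.ObjectKey{Namespace: claim.Namespace, Name: claim.Status.Address.Name}
	if err := c.Get(ctx, key, addr); err != nil {
		return nil, err
	}
	return addr, nil
}
```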
@vincepri @CecileRobertMichon @sbueringer @yastij opinions?
This would definitely save the providers from doing one-off code changes for creating the Claim objects. I would group this as an enhancement in a similar bucket to failure domains. Specifying the IPPool will follow the ref conventions seen throughout CAPI objects.
Plus, it makes more sense for Machine objects to create the `IPClaim`s, as the proposal intends to introduce these API types as part of the CAPI codebase.
To me it makes more sense to have this as an optional field as part of core vs. distributing it across infra providers.
One issue is definitely API stability and guarantees. But as a lot of providers are also already on v1beta1, I think the impact would be smaller if we add the field to the core Machine CRD vs. in a bunch of infra providers (with the usual feature-gate mechanism).
We still have some room for changes until a potential v1beta2 or v1.
It definitely brings up the question of how we would integrate this with ClusterClass, as InfraMachineTemplates can be provided via templates and even patched while a MachineDeployment is generated based on fields in the topology. But I think we have a few options to make that happen.
Hm, given that we need the CRDs and at least a webhook in core CAPI already, I lean towards making it a part of core CAPI, so we can make sure that everything works as intended and coordinate if necessary (more or less like we already handle other contracts, such as the InfrastructureMachine, where we have a machine controller which coordinates).
If we go down the path of including this in the core types, let's make sure the proposal is updated to reflect this and elaborates on the behaviour that the core controllers would own, e.g. creating/deleting the IPClaim if there's a pool, etc.
For a WIP implementation of this proposal in CAPI see #6313.
#### Additional Notes

- An example provider that provides in-cluster IP address management (also useful for testing, or when no other IPAM solution is available) should be implemented as a separate project in a separate repository.
Should we go even further and make sure the core types contain useful information about address pools and can be managed without external information?
IP space management is a standard we can rely upon. This might be a bit of a change from the current proposal, but shouldn't an IPAM provider integrate with and inform Cluster API, rather than providing the entire set of functionality?
+1, I think this is going to be especially important if we want to leverage carving out huge CIDRs into multiple small ones.
So you want to add the in-cluster provider to CAPI directly? When I started with this proposal I got the impression that you wouldn't want to have something like that as part of core CAPI.
We can definitely include a standard `IPPool` or `InClusterIPPool` as part of CAPI. It's not that much code anyway, see telekom/cluster-api-ipam-provider-in-cluster.
Be aware that the proposal currently doesn't include separating large pools into smaller subnets. It's only about allowing integration with the machine creation process, as there is currently no way to automate that. Allocating subnets from a larger pool is very interesting in conjunction with ClusterClass, but it is currently not part of this proposal at all, and could also be another iteration later on.
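For a sense of scale, an in-cluster pool spec could be as small as the following speculative sketch (the field names are illustrative and not quoted from the proposal or the linked repository):

```go
// InClusterIPPoolSpec defines a pool of addresses that is managed entirely
// inside the cluster, without an external IPAM system.
type InClusterIPPoolSpec struct {
	// Addresses lists addresses or ranges (e.g. "10.0.0.10-10.0.0.50")
	// that may be allocated from this pool.
	Addresses []string `json:"addresses"`
	// Prefix is the network prefix length assigned alongside allocated addresses.
	Prefix int `json:"prefix"`
	// Gateway is an optional gateway to configure for allocated addresses.
	Gateway string `json:"gateway,omitempty"`
}
```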
I went through the proposal in its current state, and I think what is proposed is consistent with the goal of allowing CAPI to consume a set of IP pools defined and managed by something external to CAPI; I think this is already something valuable.
If I got it right, from a CAPI core PoV:
- We should host two CRDs - with minimal validation logic and no controllers - plus a library of functions to help implementation in providers.
- Ideally there should also be a reference implementation of an IPAM provider in CAPI.
- It is not clearly defined how to allow users to install IPAM providers (e.g. with clusterctl), but personally I'm ok with deferring this to the next iteration of this proposal.
- There are no impacts on ClusterClass, because IPAM management could be activated by configuring infrastructureMachineTemplates (eventually via patches).
@schrej Is this an accurate recap?
I'll defer to @yastij @CecileRobertMichon @enxebre for a final check from the providers' PoV, given that most of the implementation should happen on the providers' side.
@fabriziopandini yes, that's a very good summary, thank you!
I think Vince had some thoughts about reconsidering the approach in #6000 (comment). This lgtm otherwise.
As decided in the 20th April office hours.
For tracking, these are the outcomes of the discussion we had at the beginning of this week, trying to summarise the main points discussed about the current proposal:
Given the above considerations, the intent is to ask in the community meeting to proceed with the current proposal after a lazy consensus, and to eventually re-evaluate convergence towards a shared network/machine network model in a follow-up step, after gaining more expertise in the domain.
/lgtm
The lazy consensus on this PR expires today. Rebasing to fix the broken-links check.
Quick note: Vince is on PTO and should be back on Monday (afaik)
I'll look at this this morning and go back on PTO :)
I think with this being open for 3 months it can also wait until Monday, enjoy your PTO!
/hold cancel
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: fabriziopandini. The full list of commands accepted by this bot can be found here; the pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing `/approve` in a comment.
What this PR does / why we need it:
This adds a proposal to add an API Contract for IP address management to Cluster API.
Also fixes the name of the cluster class proposal so the order of the proposals is correct.
Which issue(s) this PR fixes:
cc due to previous interest: @randomvariable @fabriziopandini @maelk @lubronzhan @MaxRink @Cellebyte @macaptain @yastij