KEP-5504: Comparable Resource Version #5505

michaelasp · 2025-08-28T17:22:33Z

One-line PR description: Adding new KEP for comparable resource version

Issue link: Comparable Resource Version #5504

Other comments:

michaelasp · 2025-08-28T17:22:58Z

lmktfy · 2025-08-28T17:29:53Z

keps/sig-api-machinery/5504-comparable-resource-version/README.md

+
+## Alternatives
+
+N/A


I think we have alternatives, and I'm about to suggest one. There may be more.

We have a rich discovery API. I think we can use that, drawing inspiration from Rust and its concept of traits.

Rust traits are like interfaces, but can also be kind of zero sized. For example, the PartialEq trait from Rust's standard library means that you can use == to check for equality. The Eq trait (which also requires implementing PartialEq) makes a promise that all values can be compared against other values. Some nullable types only implement PartialEq. There are also traits to say if some, or all, values can be compared against each other: PartialOrd and Ord.

(Eq and Ord are just marker traits; they don't add any API surface, they are just making a stronger promise than their supertrait).

We can mark our APIs with something similar to traits and declare, per API, whether resourceVersion comparisons are meaningful for sorting. Doing it like this means we can have easy soft adoption; well, relatively easy.

Every in-tree API would implement resourceVersion ordering and can mark itself as such. A metrics adapter that doesn't implement it would not need to worry about clients making the wrong assumption.

If we add something like traits, I think we could mark our stable APIs as stable, too. It wouldn't be much extra code.

There might be some other traits-like things we can think up, such as marking if an API kind uses the .metadata.generation / .status.observedGeneration pattern.
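To make the trait idea concrete, here is a minimal Go sketch of what a client-side check could look like. Every name in it (Trait, TraitOrderedResourceVersion, ResourceDescription, Has) is hypothetical and invented for illustration; nothing like this exists in the Kubernetes discovery API today.

```go
package main

import "fmt"

// Trait is a hypothetical marker that a served resource could advertise.
type Trait string

// TraitOrderedResourceVersion is a hypothetical marker meaning that
// resourceVersion values for this resource can be ordered, not just
// checked for equality.
const TraitOrderedResourceVersion Trait = "OrderedResourceVersion"

// ResourceDescription is a hypothetical, simplified stand-in for a
// per-resource discovery entry.
type ResourceDescription struct {
	Name   string
	Traits []Trait
}

// Has reports whether the resource advertises the given trait.
func (r ResourceDescription) Has(t Trait) bool {
	for _, have := range r.Traits {
		if have == t {
			return true
		}
	}
	return false
}

func main() {
	resources := []ResourceDescription{
		{Name: "pods", Traits: []Trait{TraitOrderedResourceVersion}},
		{Name: "nodemetrics"}, // advertises nothing, so clients assume equality only
	}
	for _, r := range resources {
		fmt.Printf("%s: ordered resourceVersion = %v\n", r.Name, r.Has(TraitOrderedResourceVersion))
	}
}
```

Like a Rust marker trait, the marker adds no API surface of its own: a resource that advertises nothing keeps today's equality-only contract.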


lmktfy · 2025-08-28T17:32:54Z

keps/sig-api-machinery/5504-comparable-resource-version/README.md

+
+## Summary
+
+Resource version is currently defined as an opaque string from the view of a client, with the only operation that is supported being equality comparisons. This differs from the internal apiserver implementation, where it is clearly defined as a monotonically incrementing integer. There are increasing requirements being required from clients consuming object metadata, where stronger comparisons than just equality are required.


"The apiserver" is not the only implementation of .metadata, though: you can also integrate your control plane with an extension API server via the aggregation layer.

lmktfy · 2025-08-28T17:33:44Z

keps/sig-api-machinery/5504-comparable-resource-version/README.md

+
+### Goals
+
+The goals for this KEP are fairly straightforward, firstly we will expose a utility function that clients can use on the resource version to check comparisons between resource versions. This will take the opaque resource version string and return a boolean and an error if it occurs. Along with that we will update the documentation to specify that a ResourceVersion must be a monotonically increasing integer. 


This risks implying that all APIs support this comparison. I'd prefer to make that promise on an API-by-API basis. See #5505 (comment) for the specifics of what I have in mind.

I would say that a non-monotonic RV is OK for an API without Watch, like the Metrics API. However, the advantage of having a monotonic RV for informers is huge. Being able to do consistent reads and monitor staleness is a deal breaker: it will make non-compliant APIs mostly useless and force them to implement it.

I would also point out that it would force us to maintain two different informer implementations, depending on whether the API has a monotonic RV, because fully utilizing monotonic RV will require non-trivial code changes.
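As a rough illustration of the staleness point, a client that has just written an object could check whether its cache has caught up to that write. The sketch below assumes resourceVersions are decimal integers from the same resource; parseRV and cacheCaughtUp are hypothetical names, not client-go functions.

```go
package main

import (
	"fmt"
	"math/big"
)

// parseRV parses a resourceVersion as a non-negative decimal integer.
// math/big is used so the value is not limited to 64 bits.
func parseRV(rv string) (*big.Int, error) {
	if rv == "" {
		return nil, fmt.Errorf("empty resourceVersion")
	}
	for _, r := range rv {
		if r < '0' || r > '9' {
			return nil, fmt.Errorf("resourceVersion %q is not a decimal integer", rv)
		}
	}
	n, ok := new(big.Int).SetString(rv, 10)
	if !ok {
		return nil, fmt.Errorf("resourceVersion %q could not be parsed", rv)
	}
	return n, nil
}

// cacheCaughtUp reports whether the cache's last observed resourceVersion
// is at least as new as the resourceVersion returned by a write.
func cacheCaughtUp(cacheRV, writeRV string) (bool, error) {
	c, err := parseRV(cacheRV)
	if err != nil {
		return false, err
	}
	w, err := parseRV(writeRV)
	if err != nil {
		return false, err
	}
	return c.Cmp(w) >= 0, nil
}

func main() {
	ok, err := cacheCaughtUp("1041", "1042")
	fmt.Println(ok, err)
	ok, err = cacheCaughtUp("1042", "1042")
	fmt.Println(ok, err)
	_, err = cacheCaughtUp("", "1042")
	fmt.Println(err)
}
```

An API whose resourceVersion is unset or not an integer surfaces as an error here, so the caller has to fall back to equality-only behavior instead of silently misordering.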


liggitt · 2025-08-28T20:04:06Z

keps/sig-api-machinery/5504-comparable-resource-version/README.md

+
+## Alternatives
+
+N/A


Listing that as an alternative is fine, but I think it's reasonable to proceed without needing opt-in comparability indicators in discovery.

There are three categories of APIs:

1. Servers where resourceVersion is a comparable integer

   - built-in kube types served by kube-apiserver
   - built-in kube types served by something other than kube-apiserver (ensured via the conformance test proposed in this KEP)
   - CRD-backed types served by kube-apiserver
   - CRD-backed types served by something other than kube-apiserver (ensured via the conformance test proposed in this KEP)
   - extension API servers built with k8s.io/apiserver using normal CRUD storage
   - extension API servers built with other technologies but following kube-apiserver behavior

2. Servers where resourceVersion is not an integer

   For servers like this, a local well-formedness check when comparing two resourceVersion strings triggering a comparison error (as proposed in the function signature) would be enough to guard against misinterpretation, without needing to augment discovery and make the client check discovery.

   All the examples of unusual aggregated servers I'm aware of fall in this category, or have other server-side correctness issues that already mishandle resource versions in ways inconsistent with client-expected behavior:

   - metrics-server
     - “pod/node metrics” are virtual APIs that only support get/list and never set RV on the list or the individual items in the list
   - prometheus-adapter (https://github.com/kubernetes-sigs/prometheus-adapter/blob/c2ae4cdaf160363151f746e253789af89f8b6c49/pkg/resourceprovider/provider.go#L244-L254)
     - “pod/node metrics” are virtual APIs that only support get/list and never set RV on the list or the individual items in the list
   - calico (https://github.com/projectcalico/calico/blob/09c0b753c91474e72157818a480165028f620999/libcalico-go/lib/backend/k8s/resources/profile.go#L138)
     - https://archive-os-3-26.netlify.app/calico/3.26/reference/resources/profile
     - “profiles” is a virtual REST API built on top of namespaces and service accounts that supports get/list/watch with RV set to “nsRV/saRV”
   - Porch
     - appears to set invalid resourceVersion values on some objects (static non-integer strings that pass equality checks incorrectly) - https://github.com/nephio-project/porch/blob/4c066b6986533445fb15143507e7ce6470b66c72/pkg/cache/dbcache/dbpackage.go#L97C22-L97C31
     - looks like it tries to support watch, but ignores starting resourceVersion in a non-compliant way https://github.com/nephio-project/porch/blob/4c066b6986533445fb15143507e7ce6470b66c72/pkg/registry/porch/watch.go#L100
     - uses a git hash as the RV for some objects - https://github.com/nephio-project/porch/blob/4c066b6986533445fb15143507e7ce6470b66c72/pkg/externalrepo/git/package.go#L62 (1/10,000,000 chance of an all-numeric hash on a single object, 1/100,000,000,000,000 chance of two non-identical all-numeric hashes to try to compare)

3. Servers where resourceVersion is an integer, but is not comparable

   Obviously this is possible, but do we have any actual known examples of real servers that do this?

I would really want to push extension API server authors to live in category 1 or 2. Either make your integer resourceVersions comparable like everyone expects (preferred), or make them clearly not integers if that's not possible.

Unless there's evidence that there are actually servers like this in use that won't adjust to be in category 1 or 2, I'd push back on the complexity of adding something like "integerResourceVersionIsComparable: false" to discovery per-type and making clients check it.
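The local well-formedness check described for category 2 could look roughly like the Go sketch below. It is an illustration under stated assumptions, not the helper this KEP adds: the name compareResourceVersions, the int return value, and the strict "decimal digits, no leading zero" rule (which also rejects "0") are all choices made for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotComparable is returned when either input is not a well-formed integer.
var errNotComparable = errors.New("resourceVersion is not a comparable integer")

// wellFormed reports whether rv is a non-empty string of decimal digits
// with no leading zero. Assumption: this strictness is illustrative only.
func wellFormed(rv string) bool {
	if rv == "" || rv[0] == '0' {
		return false
	}
	for i := 0; i < len(rv); i++ {
		if rv[i] < '0' || rv[i] > '9' {
			return false
		}
	}
	return true
}

// compareResourceVersions (hypothetical name) returns -1, 0, or 1 when a is
// older than, equal to, or newer than b, and an error when either value is
// not a well-formed integer.
func compareResourceVersions(a, b string) (int, error) {
	if !wellFormed(a) || !wellFormed(b) {
		return 0, errNotComparable
	}
	// Without leading zeros, the longer digit string is the larger number,
	// so no fixed-width integer parsing is needed.
	if len(a) != len(b) {
		if len(a) < len(b) {
			return -1, nil
		}
		return 1, nil
	}
	// Equal length: byte-wise string order matches numeric order.
	switch {
	case a < b:
		return -1, nil
	case a > b:
		return 1, nil
	}
	return 0, nil
}

func main() {
	pairs := [][2]string{{"9", "10"}, {"42", "42"}, {"101/202", "7"}, {"", "7"}}
	for _, p := range pairs {
		cmp, err := compareResourceVersions(p[0], p[1])
		fmt.Printf("%q vs %q: cmp=%d err=%v\n", p[0], p[1], cmp, err)
	}
}
```

Values shaped like the calico “nsRV/saRV” pair, an unset RV, or a typical git hash fail the check and produce an error, which is how category 2 servers are kept from being misinterpreted without any discovery lookup. A category 3 server would pass the check, which is why that category is the only one a discovery flag would help with.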


liggitt

did another sweep, this is looking good


stlaz

Does merging this KEP effectively turn the InformerResourceVersion feature gate GA?

The feature gate is not very well documented but as far as I understand, it was in place just to wait until ResourceVersions are comparable to make sure the invocation of cache.Controller{}.LastSyncResourceVersion() returns a sensible result.

cc @nilekhc


liggitt · 2025-09-04T19:27:03Z

> Does merging this KEP effectively turn the InformerResourceVersion feature gate GA?

Um.... not quite? (I didn't actually know about that gate). This KEP defines how client-side comparisons should happen, but those comparisons can return errors... I don't see any comparison or error-handling path in the spots where the InformerResourceVersion gate is used (or where the underlying LastSyncResourceVersion() is set or retrieved). InformerResourceVersion would have to be updated to use the defined comparison approach, and handle encountered errors properly, I think.

liggitt · 2025-09-04T20:38:49Z

I'm pretty happy with the current state. I'd like @deads2k and @jpbetz to give it a pass.

/assign
/assign @deads2k @jpbetz

michaelasp · 2025-09-04T21:17:28Z

> > Does merging this KEP effectively turn the InformerResourceVersion feature gate GA?
>
> Um.... not quite? (I didn't actually know about that gate). This KEP defines how client-side comparisons should happen, but those comparisons can return errors... I don't see any comparison or error-handling path in the spots where the InformerResourceVersion gate is used (or where the underlying LastSyncResourceVersion() is set or retrieved). InformerResourceVersion would have to be updated to use the defined comparison approach, and handle encountered errors properly, I think.

I think these are two separate things, although I may be misunderstanding the feature gate. All I think that InformerResourceVersion does is return the last seen resource version of the informer, not much else. The feature gate was added because the dummy informer returns an empty string when this is called, and we still need the last resource version. I think it's unrelated to any comparison logic.

sttts · 2025-09-22T20:40:26Z

keps/sig-api-machinery/5504-comparable-resource-version/README.md

+client as well, particularly the ability to consume the resource version as an
+integer, and the ability to compare resource versions to each other for more
+than equality. Clients can use the new semantics in order to determine the
+relative order of two different resource versions for the same type.


nit: type -> kind

mmm, probably resource, actually

(if a Wardle kind was served under two resource endpoints like wardles and clusterwardles, a resourceVersion would only be comparable within the stream of one of those resources)

Switched the wording to resource stream, I think that makes the most sense? Lmk what you think.

"resource" has a specific definition, I'd probably stick with that instead of "resource stream"

Sg, updated

liggitt · 2025-09-23T14:44:53Z

lgtm

alvaroaleman · 2025-09-23T15:26:23Z

This KEP is mostly about extending the promise of the existing implementation rather than any change to it. Would it be appropriate to start relying on this as described in Helper Function as soon as this KEP merges, including for older Kubernetes APIServer versions?

liggitt · 2025-09-23T15:36:54Z

> This KEP is mostly about extending the promise of the existing implementation rather than any change to it. Would it be appropriate to start relying on this as described in Helper Function as soon as this KEP merges, including for older Kubernetes APIServer versions?

I would expect so. The addition of the conformance test in 1.35 will make it official, but the described approach is valid to use against any historical kube-apiserver as well.

jpbetz · 2025-09-23T17:37:09Z

/approve
/lgtm
Let's merge and iterate as needed.

k8s-ci-robot · 2025-09-23T17:37:18Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jpbetz, michaelasp

Needs approval from an approver in each of these files:

~~keps/prod-readiness/OWNERS~~ [jpbetz]
~~keps/sig-api-machinery/OWNERS~~ [jpbetz]

k8s-ci-robot added the cncf-cla: yes label Aug 28, 2025

k8s-ci-robot requested review from deads2k and fedebongio August 28, 2025 17:22

k8s-ci-robot added the kind/kep, sig/api-machinery, and size/L labels Aug 28, 2025

k8s-ci-robot requested review from jpbetz and liggitt August 28, 2025 17:23

michaelasp force-pushed the 5504 branch from 75017a1 to bc6b257 August 28, 2025 17:24

michaelasp mentioned this pull request Aug 28, 2025: Comparable Resource Version #5504

michaelasp force-pushed the 5504 branch 2 times, most recently from 99edbd3 to 3a110b6 August 28, 2025 17:29

Comparable Resource Version KEP (d91e647)

michaelasp force-pushed the 5504 branch from 3a110b6 to d91e647 August 28, 2025 17:34

lmktfy reviewed Aug 28, 2025

Add sig-arch as participating sig (4ddf617)

liggitt reviewed Aug 28, 2025

fix kep.yaml (5dbd238)

aojea reviewed Aug 29, 2025: keps/sig-api-machinery/5504-comparable-resource-version/README.md (outdated)

serathius reviewed Aug 29, 2025: keps/sig-api-machinery/5504-comparable-resource-version/README.md (outdated)

serathius reviewed Aug 29, 2025: keps/sig-api-machinery/5504-comparable-resource-version/README.md (outdated)

Update kep in response to comments (8900491)

liggitt reviewed Sep 3, 2025

k8s-ci-robot added the do-not-merge/invalid-commit-message label Sep 3, 2025

michaelasp force-pushed the 5504 branch from 845e702 to 9b3f325 September 3, 2025 22:38

k8s-ci-robot removed the do-not-merge/invalid-commit-message label Sep 3, 2025

michaelasp force-pushed the 5504 branch from 9b3f325 to 6db9c3e September 3, 2025 22:44

stlaz reviewed Sep 4, 2025: keps/sig-api-machinery/5504-comparable-resource-version/README.md (outdated)

respond to liggitt's review (520f591)

michaelasp force-pushed the 5504 branch from 6db9c3e to 520f591 September 4, 2025 15:19

k8s-ci-robot assigned deads2k, jpbetz and liggitt Sep 4, 2025

Add prr file and minor tweaks (881c2ed)

liggitt mentioned this pull request Sep 11, 2025: Kubelet rejects pod with "NodeAffinity failed" due to stale informer data kubernetes/kubernetes#133997

jpbetz mentioned this pull request Sep 12, 2025: resourceVersion parameter parsing error kubernetes/kubernetes#113939

csviri mentioned this pull request Sep 15, 2025: Comparable Resource Versions in Kubernetes operator-framework/java-operator-sdk#2944

alvaroaleman mentioned this pull request Sep 15, 2025: Idea: Provide read your own write for the cache-backed client kubernetes-sigs/controller-runtime#3320

sttts reviewed Sep 22, 2025

fix nit (0599e64)

michaelasp force-pushed the 5504 branch from 5d44d52 to 0599e64 September 23, 2025 14:29

k8s-ci-robot added the lgtm label Sep 23, 2025

k8s-ci-robot added the approved label Sep 23, 2025

k8s-ci-robot merged commit cebf304 into kubernetes:master Sep 23, 2025 (3 of 4 checks passed)

k8s-ci-robot added this to the v1.35 milestone Sep 23, 2025

guettli mentioned this pull request Sep 29, 2025: patch.Helper() should provide access to new ResourceVersion (for wait until cache is synced) kubernetes-sigs/cluster-api#12805

