Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conflicting information about custom resource conversions #27534

Closed
porridge opened this issue Apr 13, 2021 · 18 comments · Fixed by #37267
Closed

Conflicting information about custom resource conversions #27534

porridge opened this issue Apr 13, 2021 · 18 comments · Fixed by #37267
Labels
language/en Issues or PRs related to English language priority/backlog Higher priority than priority/awaiting-more-evidence. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. triage/accepted Indicates an issue or PR is ready to be actively worked on. wg/api-expression Categorizes an issue or PR as relevant to WG API Expression.

Comments

@porridge
Copy link
Member

porridge commented Apr 13, 2021

This is a Bug Report

Problem:

One part of the "Versions in CustomResourceDefinitions" document says that conversion is performed from storage version to served version. Another says it's not (and furthermore, that k8s will lie about apiVersion by setting it without actually making any attempt at schema conversion).

Specifically:

  1. https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definition-versioning/#webhook-conversion says:

The API server also supports webhook conversions that call an external service in case a conversion is required. For example when:

  • custom resource is requested in a different version than stored version.
  1. At the same time https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definition-versioning/#writing-reading-and-updating-versioned-customresourcedefinition-objects says:

If you specify a version that is different from the object's persisted version, Kubernetes returns the object to you at the version you requested, but the persisted object is neither changed on disk, nor converted in any way (other than changing the apiVersion string) while serving the request.

The above sentence is not true in general. Specifically because the part:

the persisted object is neither changed on disk, nor converted in any way (other than changing the apiVersion string) while serving the request

blurs the distinction between the "persisted object" (which indeed never changes on reads) and the served representation (which definitely is converted, if only to change the apiVersion string), this sentence could be read in two ways:

  1. The persisted object is not changed on disk. The served representation is not converted (i.e. does not differ from the persisted object) other than the version string.
    This first interpretation is only correct if the schemas do not differ. A casual reader of a general document on conversion webhooks certainly dos not have this assumption, and definitely not before reading the example which follows the above sentence.
  2. The persisted object is not changed on disk. The only conversion applied to persisted object is that the apiVersion string is changed. This conversion to persisted object happens when the request is being served.
    This second interpretation is just plain wrong. The on-disk object does not change. Period.

Then later, there is an example which describes a special case of serving a v1 representation of an object persisted at v1beta1 version:

  • You read your object at version v1beta1, then you read the object again at version v1. Both returned objects are identical except for the apiVersion field.

This sentence is correct, and in this particular context the first sentence makes more sense (but is still blurry).

Proposed Solution:

  1. restructure the first sentence to:
    1. match what happens in the general case of stored version != served version. Not just in the specific vXbetaY -> vX case.
    2. make an explicit distinction between (conceptually immutable on reads) persisted object and the served representation, rather to try and describe both in one go
  2. provide an example for the case where conversion webhooks are actually useful (i.e. where the schema between persisted and served versions differs). This can be an additional second example, or the existing example can be replaced.
  3. [separate issue] provide an additional example of what happens when there is no conversion webhook defined and
    1. schemas differ
    2. schemas are identical (except for the version string)
  4. [separate issue] additionally, it might be good to use "stored version" consistently, rather than "persisted version". This should also probably be a different orthogonal issue.

Page to Update:
https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definition-versioning

@porridge porridge added the kind/bug Categorizes issue or PR as related to a bug. label Apr 13, 2021
@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Apr 13, 2021
@sftim
Copy link
Contributor

sftim commented Apr 13, 2021

The difference is that the persisted object is not changed no matter what version of its API you use to read it. That's what the second quote is about. If you have a persisted object and try to use a different version of that API to read or modify it, then the API server must convert its representation the persisted object to the API version that the client asked for. That's might involve a webhook for a CRD, as the first quote explains.

GitHub (this website) does something similar. This issue is #27534 in https://github.com/kubernetes/website/ no matter which API you use to access it. If GitHub adds a new kind of API for accessing issues, they might choose not to replace all their stored data; instead, they keep storing the data they have and convert on-the-fly when you access it.

I think the pages are accurate. It's a difficult concept to explain well and if you have suggestions for improving the explanation (without changing the technical meaning), we'd be keen to hear them.

/remove-kind bug
/sig api-machinery
/language en

PS Using “persisted” vs “stored” consistently could be a useful cleanup - perhaps for a separate issue.

@k8s-ci-robot k8s-ci-robot added sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. language/en Issues or PRs related to English language and removed kind/bug Categorizes issue or PR as related to a bug. labels Apr 13, 2021
@sftim
Copy link
Contributor

sftim commented Apr 13, 2021

BTW, the reason that the two objects in https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definition-versioning/#writing-reading-and-updating-versioned-customresourcedefinition-objects differ only by the API version is because that was the only change you made.

In Kubernetes, the convention when promoting an API to stable (taking alpha or beta out of the version) is that you try not to change that API when you do so).

@porridge
Copy link
Member Author

@sftim thanks for the fast response!

The difference is that the persisted object is not changed no matter what version of its API you use to read it. That's what the second quote is about. If you have a persisted object and try to use a different version of that API to read or modify it, then the API server must convert its representation the persisted object to the API version that the client asked for. That's might involve a webhook for a CRD, as the first quote explains.

Thank you for clarification, what you now described matches my understanding.

I think the pages are accurate. It's a difficult concept to explain well and if you have suggestions for improving the explanation (without changing the technical meaning), we'd be keen to hear them.

I think the confusing part was this:

nor converted in any way (...) while serving the request.

I'm guessing this is indeed the case when conversion.strategy is None. But since the preceding sections talk at length about using conversion webhooks, such statement at this point was a bit of a shock to me. :-)

I re-read the second quote and now that I have your clarification I admit it makes sense, if one assumes that the schema for the two versions is the same, as you say:

the two objects [in the second quote] differ only by the API version is because that was the only change you made

Indeed the example that follows ("To illustrate this ...", changing v1beta1 to v1) provides a hint to that, to someone who knows the semantics of version strings in Kubernetes. However this is a bit implied, and I think it would be better to make it explicit - as you say - this is a difficult concept.

I think there are two ways to improve the situation:

  1. keep the section mostly as is, but explicitly mention that it is talking about promoting to a version without changing the schema, where conversion webhook would indeed be unnecessary. Perhaps also linking to a document which explains the versioning conventions, i.e. that changing from v1beta1 to v1 should imply no schema changes, unlike v1beta1 -> v1beta2.
  2. be more general, and explicitly describe what happens in the following cases:
    • schemas differ, and there is a conversion webhook (response is converted, but stored object stays intact)
    • schemas differ, but there is no conversion webhook (that was actually what I was looking for in this document, since this is a corner case with potential for data loss in some cases)
    • no schema difference and no conversion webhook (just apiVersion is bumped in the response, but stored object stays intact)

I think the latter option would be much more informative, however I lack knowledge about what happens in the middle case. If someone could chime in with a link to the relevant place in the code, I'd be more than happy to send a doc improvement PR!

@fejta-bot
Copy link

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 13, 2021
@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Aug 12, 2021
@sftim
Copy link
Contributor

sftim commented Aug 12, 2021

also
/wg api-expression

@k8s-ci-robot k8s-ci-robot added the wg/api-expression Categorizes an issue or PR as relevant to WG API Expression. label Aug 12, 2021
@sftim
Copy link
Contributor

sftim commented Aug 12, 2021

/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Aug 12, 2021
@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 10, 2021
@porridge
Copy link
Member Author

porridge commented Nov 12, 2021 via email

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 12, 2021
@sftim
Copy link
Contributor

sftim commented Dec 3, 2021

@porridge in light of the discussion so far, would you be willing to revise the issue description to outline the defect you see or the improvement you'd like made?

@porridge
Copy link
Member Author

@sftim finally had some time to reword the issue description. Please take a look and if this makes sense I can propose a patch and file the additional issues.

@sftim
Copy link
Contributor

sftim commented Jan 17, 2022

Thanks
/triage accepted
/priority backlog

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. priority/backlog Higher priority than priority/awaiting-more-evidence. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jan 17, 2022
@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 17, 2022
@porridge
Copy link
Member Author

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 19, 2022
@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 18, 2022
@vaibhav2107
Copy link
Member

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 1, 2022
@porridge
Copy link
Member Author

@sftim When working on a PR to address this issue, I tried to find some documentation in https://github.com/kubernetes/community/tree/master/contributors/devel/sig-architecture to back the following claim:

In Kubernetes, the convention when promoting an API to stable (taking alpha or beta out of the version) is that you try not to change that API when you do so).

but did not find anything concrete. Do you know if this goal is actually written down somewhere, or is this tribal knowledge?

@sftim
Copy link
Contributor

sftim commented Oct 10, 2022

That's not a policy, it's an observation. SIG Architecture might have an opinion to add here that it's also an aim of the Kubernetes project.

/sig architecture

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
language/en Issues or PRs related to English language priority/backlog Higher priority than priority/awaiting-more-evidence. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. triage/accepted Indicates an issue or PR is ready to be actively worked on. wg/api-expression Categorizes an issue or PR as relevant to WG API Expression.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants