Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KEP-361: promote Local Ephemeral Storage to GA #3422

Merged
merged 1 commit into from
Jun 23, 2022

Conversation

mattcary
Copy link
Contributor

@mattcary mattcary commented Jun 22, 2022

add KEP to promote local ephemeral storage to GA

Comments:

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Jun 22, 2022
@k8s-ci-robot k8s-ci-robot added kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory sig/storage Categorizes an issue or PR as relevant to SIG Storage. labels Jun 22, 2022
@mattcary mattcary force-pushed the jinxu-kep branch 3 times, most recently from 00f724d to 4585a21 Compare June 22, 2022 22:34
@mattcary
Copy link
Contributor Author

@johnbelamaric

From the original PR:

So, if someone had set a limit, but the feature was disabled, the limit would have been ignored in the past. If they upgrade, the feature is now locked on. So, will any existing pods that have limits exceeded be evicted?

If so, that's not a blocker - but it IS something that needs to go in the release notes and highlighted as an upgrade risk.

The answer to "will this affect running workloads" should be YES, in the event that the feature was previously disabled. Imagine upgrading and all of a sudden a bunch of Pods get evicted and rescheduled (hopefully!)...

I've added some text as you suggest. Is there something we should do at this time to get it in the release notes?

@mattcary
Copy link
Contributor Author

/assign @xing-yang
/assign @johnbelamaric

@xing-yang
Copy link
Contributor

/approve

@xing-yang
Copy link
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 22, 2022
@xing-yang
Copy link
Contributor

/milestone v1.25

[testing-guidelines]: https://git.k8s.io/community/contributors/devel/sig-testing/testing.md
-->

[ ] I/we understand the owners of the involved components may require updates to
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this needs to be checked

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

approvers:
- "@msau42"
- "@xing-yang"
see-also:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also link to kubernetes/community#306 which was where the original proposal came from?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

--system-reserved=[cpu=100m][,][memory=100Mi][,][ephemeral-storage=1Gi][,][pid=1000]
--kube-reserved=[cpu=100m][,][memory=100Mi][,][ephemeral-storage=1Gi][,][pid=1000]

### Risks and Mitigations
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Much time has passed since beta. Now we also have CSI ephemeral volumes which can be another source of local space consumption. Can we think about what kind of impact it has to this feature?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated based on our offline conversation.

@johnbelamaric
Copy link
Member

This looks OK from a PRR perspective. Can you make sure there is a release note that explains the possible impact when upgrading, if a cluster currently has the feature disabled?

Will add my approval once there is SIG approval.

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 23, 2022
add KEP to promote local ephemeral storage to GA
Copy link
Member

@johnbelamaric johnbelamaric left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: johnbelamaric, mattcary, xing-yang

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 23, 2022
@xing-yang
Copy link
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 23, 2022
@k8s-ci-robot k8s-ci-robot merged commit ee6adf7 into kubernetes:master Jun 23, 2022
have not been addressed. We feel this is best tracked with the CSI ephemeral
volumes feature.

## Design Details
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm surprised not to see more details here - how will we get the current usage? What exactly is included and not included in that total? if I have multiple emptyDir volumes are they guaranteed to share a disk? How often will this be polled and why do we think that is good enough? How much extra IO and CPU will that use? Are we depending on kernel's quota subsystem? Why or why not?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for all of that

And wearing my scalability/PRR hat, I would like to understand better the IO and CPU overhead mentioned here - how big that is?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory lgtm "Looks good to me", indicates that a PR is ready to be merged. sig/storage Categorizes an issue or PR as relevant to SIG Storage. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants