Skip to content

Conversation

Jiawei0227
Copy link
Contributor

  • One-line PR description: Update CSI Migration to GA
  • Other comments:
  • With components including AWS, GCE PD, Azuredisk go to GA. We declare CSI migration as stable. And will lock the feature flag in the next release.

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jan 11, 2022
@k8s-ci-robot k8s-ci-robot added kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory sig/storage Categorizes an issue or PR as relevant to SIG Storage. labels Jan 11, 2022
@Jiawei0227
Copy link
Contributor Author

/cc @msau42 @xing-yang @jsafrane @gnufied
Please let me know if we want to lock-in the feature flag in this release or not.

@msau42
Copy link
Member

msau42 commented Jan 12, 2022

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 12, 2022
@wojtek-t
Copy link
Member

Will take a look into it later this week.

Copy link
Member

@wojtek-t wojtek-t left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple (pretty minor) comments from me from the PRR perspective.

@@ -149,7 +149,7 @@ know what configuration it’s running in and validate the expected result.
Configurations to test:

| ADC | Kubelet | Expected Result |
|-------------------|----------------------------------------------------|--------------------------------------------------------------------------|
| ----------------- | -------------------------------------------------- | ------------------------------------------------------------------------ |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commenting here, since I can't comment on the actual lines:

  1. The whole design repo was archived, so the links are now outdated.
    For completeness of this KEP, can you maybe just copy the design:
    https://github.com/kubernetes/design-proposals-archive/blob/main/storage/csi-migration.md
    to this directory (keps/sig-storage/625-csi-migration) and switch links to point to it.

[I'm not asking for any changes to that doc - just for easier discoverability of stuff and keeping stuff in a single place]

L246: it says "(...) we will have"
I'm particularly interested in upgrade tests, as the upgrade path is pretty tricky.
I'm assuming that those upgrades tests exist and hopefully are running somewhere continuously? Can we maybe add some links to testgrid or sth?

L280: it again says that metrics "will have fields". Have those been added? [if so, can you please update?]

L288: do we have continuous tests already? If so can you link there?
If not, can you add an explicit TODO and ensure that they will get linked here in the KEP (can be after feature freeze, but before we officially switch the feature gate in k/k to GA)? @msau42 - FYI

L305: again "will have migrated field" - have this been added?

L320: have those been implemented? if not, is there still a plan for doing this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the comments! I moved the design proposal to this directory.

I'm particularly interested in upgrade tests, as the upgrade path is pretty tricky.
I'm assuming that those upgrades tests exist and hopefully are running somewhere continuously? Can we maybe add some links to testgrid or sth?

Yes we do have upgrade tests. The upgrade tests are managed separately by different storage plugins. For example, for GCE PD CSI driver, the upgrade test is implemented inside Google and it's one of the GKE tests running internally. So it cannot be shared here. Sorry about that. For other plugins, I believe they have their own upgrade tests running somewhere. @andyzhangx @jsafrane if you guys can help provide some links?

L280: it again says that metrics "will have fields". Have those been added

Yes, this is already implemented.

L305: again "will have migrated field" - have this been added?

Already implemented.

L320: have those been implemented? if not, is there still a plan for doing this?

Yes, it has been implemented in kubernetes/kubernetes#98979.

Copy link
Member

@jsafrane jsafrane Jan 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OpenShift does not have upgrade tests from a version where CSI migration is disabled to a version where it's enabled. We do have tests that enable CSI migration post-install, which exercises almost the same procedure as upgrade - applies the feature gates in the right order and drains + restarts all kubelets. We do that for AWS EBS, GCP PD, Azure Disk and OpenStack Cinder. Cinder is not publicly available, the rest is in https://testgrid.k8s.io/redhat-openshift-ocp-release-4.10-broken#Summary

We plan to have a real upgrade jobs in few month.
(We use -broken suffix as a quarantine for jobs that will become more imporant in the future),

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jsafrane

Also, by asking around, I got this link that is showing exactly the gcp migration:
https://testgrid.k8s.io/provider-gcp-compute-persistent-disk-csi-driver#Migration%20Kubernetes%20Master%20Driver%20Stable

@Jiawei0227 - can you please add both links to the KEP (with the above explanation about OpenShif tests)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. I also add AWS ebs and Azuredisk csi migration tests link

@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed lgtm "Looks good to me", indicates that a PR is ready to be merged. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jan 13, 2022
@msau42 msau42 added this to the v1.24 milestone Jan 14, 2022
@@ -243,7 +243,7 @@ like Provision/Deletion/Attach/Detach/Mount/Unmount will not be available if CSI

* **Are there any tests for feature enablement/disablement?**
We have CSI Migration e2e test for each plugin that are implemented and maintained by each driver maintainer.
Specifically, for each in-tree plugin corresponding CSI drivers, it will have
Specifically, for each in-tree plugin corresponding CSI drivers, it havs
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: s/havs/has/

@wojtek-t
Copy link
Member

Thanks this looks great!

/lgtm
/approve PRR

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 14, 2022
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Jiawei0227, msau42, wojtek-t

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 14, 2022
@k8s-ci-robot k8s-ci-robot merged commit 9d6fb25 into kubernetes:master Jan 14, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory lgtm "Looks good to me", indicates that a PR is ready to be merged. sig/storage Categorizes an issue or PR as relevant to SIG Storage. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants