Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Node Publish Mount Idempotent #1019

Merged
merged 5 commits into from
Aug 27, 2021
Merged

Conversation

nirmalaagash
Copy link
Contributor

@nirmalaagash nirmalaagash commented Aug 10, 2021

Is this a bug fix or adding new feature?
Issue Fix #955 - Node publish mount idempotent
Issue Fix #1007 - Duplicate Mount Entries.

What is this PR about? / Why do we need it?
The PR adds mount idempotent to the CSI driver which provides multiple requests of 'NodePublishVolume' with the same request parameters would not throw error. We need it for the idempotency in mount.

What testing is done?
Unit testing.
Manual testing is done using csc tool. Below are the screenshots which shows the NodePublishVolume call of same request made multiple times. At first, it publishes the volume at the target and later request provides the information that the target is already mounted.

For the issue #1007, manual testing results.
Screen Shot 2021-08-17 at 6 52 39 PM
Screen Shot 2021-08-17 at 1 27 45 PM

For filesystem volume,
Screen Shot 2021-08-17 at 7 23 03 PM
Screen Shot 2021-08-17 at 7 22 50 PM

For block volumes,
Screen Shot 2021-08-17 at 7 16 23 PM
Screen Shot 2021-08-17 at 7 16 02 PM

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Aug 10, 2021
@k8s-ci-robot
Copy link
Contributor

Hi @nirmalaagash. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Aug 10, 2021
@k8s-ci-robot k8s-ci-robot requested review from ayberk and ddebroy August 10, 2021 22:26
@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Aug 10, 2021
@AndyXiangLi
Copy link
Contributor

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Aug 11, 2021
@nirmalaagash
Copy link
Contributor Author

/test pull-aws-ebs-csi-driver-verify

//Checking if the path exists and error is related to Corrupted Mount, in that case, the system could unmount and mount.
_, pathErr := mountutils.PathExists(target)
if pathErr != nil && mountutils.IsCorruptedMnt(pathErr) {
klog.V(4).Infof("Target path %q is a corrupted directory", target)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are we not returning an error if target path is a corrupted directory?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Chances are that the device mounted or the mount point is corrupted because of improper unmounting. Hence in the next phase we try to read the directory and unmount it.

}

if !notMnt {
//Return error when any of the directory inside is not readable which means that a device could be mounted.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you clarify what this comment means? not sure i'm following

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if a device is already mounted in the target directory, the directories inside the directory would not be accessible until the device is unmounted from the target directory. Hence in that case, it would return an error.

@vdhanan
Copy link
Contributor

vdhanan commented Aug 11, 2021

are all the new code paths (i.e. the if/else conditions) covered by the existing unit tests? i feel like we probably want to add a couple new ones, right?

Copy link
Member

@bertinatto bertinatto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for working on this, @nirmalaagash.

Does your fix also work block volumes? It seems we're only covering filesystem volumes.

@nirmalaagash
Copy link
Contributor Author

Thank you for working on this, @nirmalaagash.

Does your fix also work block volumes? It seems we're only covering filesystem volumes.

@bertinatto Yes. This PR only covers the filesystem volumes.

2. false, nil when the path is already mounted with a device.
3. true, nil when the path is not mounted with any device.
*/
notMnt, err := d.mounter.IsLikelyNotMountPoint(target)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An easy way to cover block devices as well is to move the check right below the in flight check.

You may want to take a look at the solutions both GCP PD and Azure Disk implemented.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bertinatto Thanks. New commit covering block devices is pushed.

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Aug 18, 2021
@nirmalaagash
Copy link
Contributor Author

nirmalaagash commented Aug 18, 2021

are all the new code paths (i.e. the if/else conditions) covered by the existing unit tests? i feel like we probably want to add a couple new ones, right?

@vdhanan Commit include Unit Test Cases covering the conditions implemented. Thanks.

@ialidzhikov
Copy link
Contributor

/sig storage
/area provider/aws
/kind bug
/priority critical-urgent

@k8s-ci-robot k8s-ci-robot added sig/storage Categorizes an issue or PR as relevant to SIG Storage. area/provider/aws Issues or PRs related to aws provider kind/bug Categorizes issue or PR as related to a bug. priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. labels Aug 24, 2021
@vdhanan
Copy link
Contributor

vdhanan commented Aug 25, 2021

/lgtm

@vdhanan
Copy link
Contributor

vdhanan commented Aug 25, 2021

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: nirmalaagash, vdhanan

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 25, 2021
@nirmalaagash
Copy link
Contributor Author

/retest

1 similar comment
@nirmalaagash
Copy link
Contributor Author

/retest

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 25, 2021
@ialidzhikov
Copy link
Contributor

The verify job fails on hack/verify-gofmt:

Verifying gofmt
diff -u ./pkg/cloud/aws_metrics.go.orig ./pkg/cloud/aws_metrics.go
--- ./pkg/cloud/aws_metrics.go.orig	2021-08-25 17:39:13.750184141 +0000
+++ ./pkg/cloud/aws_metrics.go	2021-08-25 17:39:13.750184141 +0000
@@ -1,3 +1,4 @@
+//go:build !providerless
 // +build !providerless
 
 /*
diff -u ./pkg/mounter/safe_mounter_unix.go.orig ./pkg/mounter/safe_mounter_unix.go
--- ./pkg/mounter/safe_mounter_unix.go.orig	2021-08-25 17:39:13.758184690 +0000
+++ ./pkg/mounter/safe_mounter_unix.go	2021-08-25 17:39:13.758184690 +0000
@@ -1,3 +1,4 @@
+//go:build linux || darwin
 // +build linux darwin
 
 /*
diff -u ./pkg/mounter/safe_mounter_windows.go.orig ./pkg/mounter/safe_mounter_windows.go
--- ./pkg/mounter/safe_mounter_windows.go.orig	2021-08-25 17:39:13.762184964 +0000
+++ ./pkg/mounter/safe_mounter_windows.go	2021-08-25 17:39:13.762184964 +0000
@@ -1,3 +1,4 @@
+//go:build windows
 // +build windows
 
 /*
diff -u ./pkg/driver/mount_windows.go.orig ./pkg/driver/mount_windows.go
--- ./pkg/driver/mount_windows.go.orig	2021-08-25 17:39:13.770185513 +0000
+++ ./pkg/driver/mount_windows.go	2021-08-25 17:39:13.770185513 +0000
@@ -1,3 +1,4 @@
+//go:build windows
 // +build windows
 
 /*
diff -u ./pkg/driver/node_windows.go.orig ./pkg/driver/node_windows.go
--- ./pkg/driver/node_windows.go.orig	2021-08-25 17:39:13.786186611 +0000
+++ ./pkg/driver/node_windows.go	2021-08-25 17:39:13.786186611 +0000
@@ -1,3 +1,4 @@
+//go:build windows
 // +build windows
 
 /*
diff -u ./pkg/driver/node_linux.go.orig ./pkg/driver/node_linux.go
--- ./pkg/driver/node_linux.go.orig	2021-08-25 17:39:13.818188805 +0000
+++ ./pkg/driver/node_linux.go	2021-08-25 17:39:13.818188805 +0000
@@ -1,3 +1,4 @@
+//go:build linux
 // +build linux
 
 /*
diff -u ./pkg/driver/mount_linux.go.orig ./pkg/driver/mount_linux.go
--- ./pkg/driver/mount_linux.go.orig	2021-08-25 17:39:13.858191550 +0000
+++ ./pkg/driver/mount_linux.go	2021-08-25 17:39:13.858191550 +0000
@@ -1,3 +1,4 @@
+//go:build linux
 // +build linux
 
 /*

Please run hack/update-gofmt to fix the issue(s)
make: *** [Makefile:143: verify] Error 1
+ EXIT_VALUE=2
+ set +o xtrace

New //go:build tag is added. For some reason it is using gofmt@v1.17. See similar issue kubernetes-sigs/controller-tools#594.
cc @wongma7

@k8s-ci-robot
Copy link
Contributor

Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please follow instructions at https://git.k8s.io/community/CLA.md#the-contributor-license-agreement to sign the CLA.

It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment and we'll verify. Thanks.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-ci-robot k8s-ci-robot added cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. and removed cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Aug 25, 2021
@ialidzhikov
Copy link
Contributor

@nirmalaagash , aca3620 looks good. This is also how k/k handled it - see https://github.com/kubernetes/kubernetes/pull/103692/files.

PS: you have to fix the author of commit aca3620.

@ialidzhikov
Copy link
Contributor

@nirmalaagash , there is still commit aca3620 with wrong author. I think you can drop this commit from your change as there is already 77cc317.

@nirmalaagash
Copy link
Contributor Author

/check-cla

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Aug 25, 2021
@ialidzhikov
Copy link
Contributor

ping @bertinatto @vdhanan
for review

@vdhanan
Copy link
Contributor

vdhanan commented Aug 27, 2021

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 27, 2021
@k8s-ci-robot k8s-ci-robot merged commit edb4fbe into kubernetes-sigs:master Aug 27, 2021
@ialidzhikov
Copy link
Contributor

Hi folks 👋,

ls it possible to backport this fix to the supported version? Thanks in advance

k8s-ci-robot added a commit that referenced this pull request Sep 2, 2021
…019-origin-release-1.2

[release-1.2] Automated cherry pick of #1019: Added Mount Idempotency
k8s-ci-robot added a commit that referenced this pull request Sep 2, 2021
…019-origin-release-1.1

[release-1.1] Automated cherry pick of #1019: Added Mount Idempotency
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/provider/aws Issues or PRs related to aws provider cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. sig/storage Categorizes an issue or PR as relevant to SIG Storage. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Restart of kubelet leads to duplicated mount entries Make NodePublish mount idempotent
6 participants