Skip to content

fix: add locks for nodeserver publish/unpublish operations #574

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Dec 24, 2023

Conversation

andyzhangx
Copy link
Member

@andyzhangx andyzhangx commented Dec 23, 2023

What type of PR is this?
/kind bug

What this PR does / why we need it:
fix: add locks for nodeserver publish/unpublish operations

Per CSI(Container Storage Interface), if a mount operation is in-progress for the volume then CSI driver should return error code "10 ABORTED"

Which issue(s) this PR fixes:

Fixes #569 #209

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

fix: add locks for nodeserver publish/unpublish operations

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/bug Categorizes issue or PR as related to a bug. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Dec 23, 2023
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: andyzhangx

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Dec 23, 2023
@andyzhangx andyzhangx merged commit b5dcdf7 into kubernetes-csi:master Dec 24, 2023
@GImmekerAFP
Copy link

Short question in regards to this issue ?
Is this fix already implemented in the docker image ?
i am currently using the registry.k8s.io/sig-storage/nfsplugin:v4.5.0 image.
I can see it got updated 12 hours ago but i keep getting the same issue.

With kind regards,
Gerben Immeker

@andyzhangx
Copy link
Member Author

Short question in regards to this issue ? Is this fix already implemented in the docker image ? i am currently using the registry.k8s.io/sig-storage/nfsplugin:v4.5.0 image. I can see it got updated 12 hours ago but i keep getting the same issue.

With kind regards, Gerben Immeker

it's only updated in canary image: gcr.io/k8s-staging-sig-storage/nfsplugin:canary, could you try this image first?

@GImmekerAFP
Copy link

Thanks for the quick response i will test that one.

@GImmekerAFP
Copy link

unfortunately it crashes:

│ nfs I1227 13:24:27.766447 1 nfs.go:84] Driver: nfs.csi.k8s.io version: v4.6.0
│ liveness-probe I1227 13:11:05.857366 1 main.go:149] calling CSI driver to discover driver name
│ nfs I1227 13:24:27.767487 1 nfs.go:134]
│ liveness-probe I1227 13:11:05.866340 1 main.go:155] CSI driver name: "nfs.csi.k8s.io"
│ nfs DRIVER INFORMATION:
│ liveness-probe I1227 13:11:05.866383 1 main.go:183] ServeMux listening at "0.0.0.0:29653"
│ nfs -------------------
│ liveness-probe E1227 13:13:37.478059 1 main.go:64] failed to establish connection to CSI driver: context deadline exceeded │
│ nfs Build Date: "2023-12-26T04:34:05Z"
│ nfs Compiler: gc
│ liveness-probe W1227 13:13:44.479077 1 connection.go:183] Still connecting to unix:///csi/csi.sock
│ nfs Driver Name: nfs.csi.k8s.io
│ nfs Driver Version: v4.6.0
│ nfs Git Commit: ""
│ nfs Go Version: go1.21.5
│ nfs Platform: linux/amd64
│ nfs
│ nfs Streaming logs below:
│ nfs I1227 13:24:27.767680 1 mount_linux.go:274] Cannot create temp dir to detect safe 'not mounted' behavior: mkdir /tmp/kubelet-detect-safe-umount3961689824: read-only file system │
│ nfs I1227 13:24:27.768252 1 server.go:117] Listening for connections on address: &net.UnixAddr{Name:"//csi/csi.sock", Net:"unix"} │
│ nfs panic: interface conversion: interface {} is *csi.NodeGetVolumeStatsResponse, not csi.NodeGetVolumeStatsResponse │
│ node-driver-registrar I1227 13:11:05.074999 1 main.go:135] Version: v2.9.0
│ nfs
│ node-driver-registrar I1227 13:11:05.078348 1 main.go:136] Running node-driver-registrar in mode=
│ node-driver-registrar I1227 13:11:05.078485 1 main.go:157] Attempting to open a gRPC connection with: "/csi/csi.sock" │
│ node-driver-registrar I1227 13:11:06.082375 1 main.go:164] Calling CSI driver to discover driver name
│ nfs goroutine 75 [running]:
│ nfs github.com/kubernetes-csi/csi-driver-nfs/pkg/nfs.(*NodeServer).NodeGetVolumeStats(0xc000010048, {0xc00002f890?, 0x4105a5?}, 0xc0001ef5e0) │
│ nfs /workspace/pkg/nfs/nodeserver.go:219 +0xb74
│ node-driver-registrar I1227 13:11:06.088428 1 main.go:173] CSI driver name: "nfs.csi.k8s.io"
│ node-driver-registrar I1227 13:11:06.088518 1 node_register.go:55] Starting Registration Server at: /registration/nfs.csi.k8s.io-reg.sock │
│ nfs github.com/container-storage-interface/spec/lib/go/csi._Node_NodeGetVolumeStats_Handler.func1({0x18c6a18, 0xc000441d70}, {0x15a16e0?, 0xc0001ef5e0}) │
│ node-driver-registrar I1227 13:11:06.088921 1 node_register.go:64] Registration Server started at: /registration/nfs.csi.k8s.io-reg.sock │
│ node-driver-registrar I1227 13:11:06.089380 1 node_register.go:88] Skipping HTTP server because endpoint is set to: "" │
│ node-driver-registrar I1227 13:11:06.202145 1 main.go:90] Received GetInfo call: &InfoRequest{}
│ node-driver-registrar I1227 13:11:06.234280 1 main.go:101] Received NotifyRegistrationStatus call: &RegistrationStatus{PluginRegistered:true,Error:,} │
│ nfs /workspace/vendor/github.com/container-storage-interface/spec/lib/go/csi/csi.pb.go:7106 +0x72
│ nfs github.com/kubernetes-csi/csi-driver-nfs/pkg/nfs.logGRPC({0x18c6a18, 0xc000441d70}, {0x15a16e0?, 0xc0001ef5e0?}, 0xc00069e3e0, 0xc0006995f0) │
│ nfs /workspace/pkg/nfs/utils.go:112 +0x3a9
│ nfs github.com/container-storage-interface/spec/lib/go/csi._Node_NodeGetVolumeStats_Handler({0x1563e40?, 0xc000010048}, {0x18c6a18, 0xc000441d70}, 0xc000233680, 0x176c5b0) │
│ nfs /workspace/vendor/github.com/container-storage-interface/spec/lib/go/csi/csi.pb.go:7108 +0x135
│ nfs google.golang.org/grpc.(*Server).processUnaryRPC(0xc0002d6000, {0x18c6a18, 0xc000441ce0}, {0x18cc120, 0xc0000a3d40}, 0xc0003465a0, 0xc00069c750, 0x23db280, 0x0) │
│ nfs /workspace/vendor/google.golang.org/grpc/server.go:1372 +0xe03
│ nfs google.golang.org/grpc.(*Server).handleStream(0xc0002d6000, {0x18cc120, 0xc0000a3d40}, 0xc0003465a0)
│ nfs /workspace/vendor/google.golang.org/grpc/server.go:1783 +0xfec
│ nfs google.golang.org/grpc.(*Server).serveStreams.func2.1()
│ nfs /workspace/vendor/google.golang.org/grpc/server.go:1016 +0x59
│ nfs created by google.golang.org/grpc.(*Server).serveStreams.func2 in goroutine 86
│ nfs /workspace/vendor/google.golang.org/grpc/server.go:1027 +0x115

@andyzhangx
Copy link
Member Author

@GImmekerAFP that's introduced by another PR, this PR will fix it: #576

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

High number of threads on csi-nfs-node.
3 participants