cephfs concurrency fix #232
Conversation
PTAL @rootfs

```go
	return err
}

if err := os.Rename(volRootCreating, volRoot); err != nil {
```
@gman0 Just a question: why is the renaming required? Can't we just use volRoot?
Because creating a volume is a non-atomic operation, we need some way to represent the state of the creation process, so that we can distinguish between a successfully created volume and a volume that is still being created (or rather was, but failed, since we now have mutexes).

Imagine the following scenario:

- an incoming CreateVolume RPC reads the ceph admin credentials
- the ceph root is mounted
- the in-cluster /csi-volumes/{volume-id} directory is created
- the attributes start being set, but one of them fails

On the next attempt to CreateVolume, we see /csi-volumes/{volume-id} - which means the volume has been created successfully? Maybe not, who knows - we can't say.

Appending the -creating suffix to the path lets us decide: if the volume exists under its final path, we assume the creation process was successful and we can return success immediately; otherwise the procedure is retried.
@gman0 Thanks for the great explanation 👍
```go
if err := createMountPoint(cephRoot); err != nil {
	return err
```

```go
if pathExists(volRoot) {
	if err := os.Rename(volRoot, volRootDeleting); err != nil {
```
Same here - can we just delete volRoot? Just want to understand the idea behind the renaming.
```go
	return volumeID("csi-cephfs-" + volName)

func mustUnlock(m keymutex.KeyMutex, key string) {
	if err := m.UnlockKey(key); err != nil {
		klog.Fatalf("failed to unlock mutex for %s: %v", key, err)
```
klog.Fatalf calls os.Exit(255) - do we need this? As the lock uses an in-memory cache, I think this can cause an issue.
Failing to unlock a mutex is such a critical error that, imho, the program should abort. The requests would just keep piling up anyway, and they would never finish because the mutex is in an unknown state (maybe still locked, for whatever reason).
```go
cephRoot := getCephRootPathLocal(volID)

if err := unmountVolume(cephRoot); err != nil {
	klog.Errorf("failed to unmount %s with error %s", cephRoot, err)
```
Question: if unmount fails, do we need to remove the directory? If we continue to do os.Remove(), won't we end up with stale mount points?
If unmount fails, removing /should/ fail too. Good point though - there's no need to delete it if unmounting failed; will fix.
```go
func makeVolumeID(volName string) volumeID {
	return volumeID("csi-cephfs-" + volName)

func mustUnlock(m keymutex.KeyMutex, key string) {
	if err := m.UnlockKey(key); err != nil {
```
UnlockKey always returns nil - any idea why?
https://github.com/ceph/ceph-csi/blob/csi-v1.0/vendor/k8s.io/kubernetes/pkg/util/keymutex/hashed.go#L57

```go
// Releases the lock associated with the specified ID.
func (km *hashedKeyMutex) UnlockKey(id string) error {
	klog.V(5).Infof("hashedKeyMutex.UnlockKey(...) called for id %q\r\n", id)
	km.mutexes[km.hash(id)%len(km.mutexes)].Unlock()
	klog.V(5).Infof("hashedKeyMutex.UnlockKey(...) for id %q completed.\r\n", id)
	return nil
}
```
Just a guess, but maybe this is a deliberate decision to panic() on a non-existent mutex. mutex.Lock/Unlock doesn't return an error anyway, but rather aborts internally: https://golang.org/src/sync/mutex.go
```diff
@@ -100,6 +105,9 @@ func (ns *NodeServer) NodeStageVolume(ctx context.Context, req *csi.NodeStageVol
 		return nil, status.Error(codes.Internal, err.Error())
 	}

+	mtxNodeVolumeID.LockKey(string(volID))
```
Can we take this lock at ns.mount (ceph-csi/pkg/cephfs/nodeserver.go, line 118 in ae14956):

```go
if err = ns.mount(volOptions, req); err != nil {
```

or before creating stagingTarget (ceph-csi/pkg/cephfs/nodeserver.go, line 98 in ae14956):

```go
if err = createMountPoint(stagingTargetPath); err != nil {
```
```diff
@@ -143,6 +151,9 @@ func (cs *ControllerServer) DeleteVolume(ctx context.Context, req *csi.DeleteVol
 		return nil, status.Error(codes.InvalidArgument, err.Error())
 	}

+	mtxControllerVolumeID.LockKey(string(volID))
```
Don't we need to lock the volumeID before checking the metadata store?
I have a few questions on locking placement; otherwise LGTM.
This PR adds locking for CreateVolume, DeleteVolume, and NodeStageVolume. It doesn't make much sense to run these operations in parallel anyway, and there might be more harm than good if they did (e.g. concurrent createVolume and purgeVolume calls).