Dynamic Flexvolume plugin discovery, probing with filesystem watch. #50031
@verult: GitHub didn't allow me to request PR reviews from the following users: kokhang, bassam, chakri-nelluri. Note that only kubernetes members can review this PR, and authors cannot review their own PRs.
@@ -90,7 +90,7 @@ func NewController(p ControllerParameters) (*PersistentVolumeController, error)
 		resyncPeriod: p.SyncPeriod,
 	}

-	if err := controller.volumePluginMgr.InitPlugins(p.VolumePlugins, controller); err != nil {
+	if err := controller.volumePluginMgr.InitPlugins(p.VolumePlugins, nil, controller); err != nil {
Add volume.DynamicPluginProber to ControllerParameters?
AFAIK PV controller was never aware of Flexvolume.
b2accc5 to 7737dfb
Added filesystem watch implementation of the prober.
@verult I tested this locally and it worked. Kubelet was able to load the Rook Flexvolume without having to restart. Here is the log:
However, I still have some concerns, which I have listed in kubernetes/community#833, namely how to obtain or construct a client config so that the driver can talk to the K8s API.
/cc @msau42
Took a first pass
@@ -90,7 +90,7 @@ func NewController(p ControllerParameters) (*PersistentVolumeController, error)
 		resyncPeriod: p.SyncPeriod,
 	}

-	if err := controller.volumePluginMgr.InitPlugins(p.VolumePlugins, controller); err != nil {
+	if err := controller.volumePluginMgr.InitPlugins(p.VolumePlugins, nil, controller); err != nil {
nit: Whenever you pass in a parameter where it is not obvious what it is, add a small inline comment with the parameter name, for clarity:
if err := controller.volumePluginMgr.InitPlugins(p.VolumePlugins, nil /* prober */, controller); err != nil {
Also add an inline comment explaining why you are leaving the value nil, e.g. PV controller was never aware of Flexvolume...
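For illustration only, here is a minimal sketch of how a plugin manager might tolerate a nil prober by substituting a no-op one. The types `pluginMgr` and `noopProber`, the simplified `DynamicPluginProber` interface (plugin names as strings), and the two-argument `InitPlugins` are all assumptions made for this sketch; they are not taken from the PR's code.

```go
package main

// DynamicPluginProber is a simplified, hypothetical version of the prober
// interface discussed in this PR. Plugins are represented by name only.
type DynamicPluginProber interface {
	Init() error
	// Probe reports whether the plugin set changed and, if so, the current plugins.
	Probe() (updated bool, plugins []string, err error)
}

// noopProber never discovers anything. It stands in when the caller passes nil.
type noopProber struct{}

func (noopProber) Init() error                    { return nil }
func (noopProber) Probe() (bool, []string, error) { return false, nil, nil }

// pluginMgr is a hypothetical stand-in for a volume plugin manager.
type pluginMgr struct {
	static []string
	prober DynamicPluginProber
}

// InitPlugins stores the statically registered plugins and the injected prober.
// A nil prober means the caller has no dynamic plugins to discover.
func (pm *pluginMgr) InitPlugins(static []string, prober DynamicPluginProber) error {
	if prober == nil {
		prober = noopProber{}
	}
	pm.static = static
	pm.prober = prober
	return prober.Init()
}
```

A caller that has no dynamic plugins could then write `pm.InitPlugins(plugins, nil /* prober */)` and the rest of the manager would never need a nil check.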
pkg/volume/flexvolume/probe.go
Outdated
		}
	}

	if time.Since(prober.lastUpdated) > time.Second { // Reduce the frequency of probes.
nit: move this value to a const at the top of the file.
pkg/volume/flexvolume/probe.go
Outdated
	}

	if eventOpIs(event, fsnotify.Remove) && eventPathAbs == pluginDirAbs {
		// pluginDir needs to exist in order to be watched.
Post a warning to log.
pkg/volume/flexvolume/probe.go
Outdated
		return err
	}

	if eventOpIs(event, fsnotify.Remove) && eventPathAbs == pluginDirAbs {
Add a comment above this block indicating what you're doing and why.
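As a rough sketch of what the two review comments on this block ask for (a log warning and an explanatory comment), the removal check could look like the following. The diff only shows the condition and the note that pluginDir must exist to be watched; recreating the directory with `os.MkdirAll` is an assumption here, and the function names `isPluginDir` and `handleRemove` are hypothetical. The sketch uses only the standard library and takes plain paths instead of fsnotify events.

```go
package main

import (
	"log"
	"os"
	"path/filepath"
)

// isPluginDir reports whether eventPath refers to pluginDir itself.
// Both paths are made absolute first so that "plugins/" and "./plugins" compare equal.
func isPluginDir(eventPath, pluginDir string) (bool, error) {
	eventPathAbs, err := filepath.Abs(eventPath)
	if err != nil {
		return false, err
	}
	pluginDirAbs, err := filepath.Abs(pluginDir)
	if err != nil {
		return false, err
	}
	return eventPathAbs == pluginDirAbs, nil
}

// handleRemove is meant to be called for a "remove" event on eventPath.
func handleRemove(eventPath, pluginDir string) error {
	removed, err := isPluginDir(eventPath, pluginDir)
	if err != nil || !removed {
		return err
	}
	// The plugin directory itself was removed. It must exist in order to be
	// watched, so warn and recreate it (assumed recovery step).
	log.Printf("warning: Flexvolume plugin directory %q was removed; recreating it", pluginDir)
	return os.MkdirAll(pluginDir, 0o755)
}
```

After recreating the directory, a real implementation would presumably also need to re-register the watch on it; that step is omitted because it depends on the watcher library.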
pkg/volume/flexvolume/probe.go
Outdated
		prober.lastUpdated = time.Now()
	}

	return nil
There is a race here if files are dropped quickly in succession, and the read happens before the 2nd event is processed.
Let's make sure to add testing to see how often this might happen, and if it is infrequent, add a comment indicating you know about it.
Fixing this by adding a limit on the number of events processed per second, and by ignoring events from dotfiles (files whose names start with a '.').
Users need to install drivers atomically, so even if an installation contains multiple files, only O(1) events are triggered. Multiple events are triggered only when multiple Flexvolume drivers are installed.
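The fix described above might be sketched roughly as follows. This is not the PR's code: the `prober` struct is reduced to two fields, `minProbeInterval` is a hypothetical name for the one-second threshold visible in the diff, and `handleEvent` takes a plain path rather than an fsnotify event.

```go
package main

import (
	"path/filepath"
	"strings"
	"time"
)

// minProbeInterval is a hypothetical constant for the one-second threshold.
const minProbeInterval = time.Second

// prober is a reduced stand-in for the PR's prober type.
type prober struct {
	lastUpdated time.Time
	probeNeeded bool
}

// isDotfile reports whether the last element of path starts with '.'.
func isDotfile(path string) bool {
	return strings.HasPrefix(filepath.Base(path), ".")
}

// handleEvent marks a probe as needed and returns true, unless the event comes
// from a dotfile or arrives within minProbeInterval of the last accepted event.
func (p *prober) handleEvent(path string) bool {
	if isDotfile(path) {
		return false // e.g. a temporary file used during an atomic install
	}
	if time.Since(p.lastUpdated) <= minProbeInterval {
		return false // reduce the frequency of probes
	}
	p.probeNeeded = true
	p.lastUpdated = time.Now()
	return true
}
```

With a fresh `prober`, an event for `/plugins/.installing` is ignored, the first event for a regular driver path is accepted, and a second regular event arriving immediately afterwards is dropped. Dropping that second event is only safe under the assumption stated above, that each driver is installed atomically.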
I know this is unrelated but what do you recommend to deploy the flexvolume drivers on nodes that are not configured to have the same
@kokhang There was some discussion about this topic here: kubernetes/community#833 (comment)
I would recommend reporting the issue to kubeadm as a bug… I wouldn't expect that to be a desired deployment configuration.
edit: is the controller-manager making use of flex plugins net new to this PR? If so, it wouldn't be a bug; it would be part of adding this enhancement.
@liggitt I'll contact the kubeadm folks. But I'm also wondering whether the controller-manager needs to probe for the Flexvolume driver as well. Or is this case already handled by this kubelet probe?
In order to perform attach-detach operations, the controller-manager would have to probe for Flexvolume drivers. One possible solution for now, while waiting for a response from the kubeadm team, is to use one of the controller-manager pod's existing hostPath volumes. I'm not sure whether the kubeadm installation uses this, but /etc/srv/kubernetes might be a possibility.
Does this mean that this dynamic Flexvolume discovery fix is only applicable to Flexvolume plugins that are not based on the attach-detach controller?
I just saw the code. It seems you are doing the probe in the controller-manager as well.
/lgtm
/lgtm
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: saad-ali, verult
No associated issue. Update pull-request body to add a reference to an issue.
/retest
Review the full test history for this PR.
Automatic merge from submit-queue (batch tested with PRs 51054, 51101, 50031, 51296, 51173)
Automatic merge from submit-queue (batch tested with PRs 51805, 51725, 50925, 51474, 51638)
Flexvolume dynamic plugin discovery: Prober unit tests and basic e2e test.
**What this PR does / why we need it**: Tests for changes introduced in PR #50031. As part of the prober unit test, I mocked filesystem, filesystem watch, and Flexvolume plugin initialization. Moved the filesystem event goroutine to watcher implementation.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #51147
**Special notes for your reviewer**: First commit contains added functionality of the mock filesystem. Second commit is the refactor for moving mock filesystem into a common util directory. Third commit is the unit and e2e tests.
**Release note**:
```release-note
NONE
```
/release-note-none
/sig storage
/assign @saad-ali @liggitt
/cc @mtaufen @chakri-nelluri @wongma7
Automatic merge from submit-queue (batch tested with PRs 51054, 51101, 50031, 51296, 51173)
Dynamic Flexvolume plugin discovery, probing with filesystem watch.
**What this PR does / why we need it**: Enables dynamic Flexvolume plugin discovery. This model uses a filesystem watch (fsnotify library), which notifies the system that a probe is necessary only if something changes in the Flexvolume plugin directory. This PR uses the dependency injection model in kubernetes#49668.
**Release Note**:
```release-note
Dynamic Flexvolume plugin discovery. Flexvolume plugins can now be discovered on the fly rather than only at system initialization time.
```
/sig-storage
/assign @jsafrane @saad-ali
/cc @bassam @chakri-nelluri @kokhang @liggitt @thockin
What this PR does / why we need it: Enables dynamic Flexvolume plugin discovery. This model uses a filesystem watch (fsnotify library), which notifies the system that a probe is necessary only if something changes in the Flexvolume plugin directory.
This PR uses the dependency injection model in #49668.
Which issue this PR fixes (optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged): fixes #51470
Release Note:
/sig-storage
/assign @jsafrane @saad-ali
/cc @bassam @chakri-nelluri @kokhang @liggitt @thockin
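To make the description above concrete, here is a small sketch of the "probe only when the watch reports a change" idea. It is an illustration under assumptions, not the PR's implementation: `watchProber`, `notify`, and the injected `scan` function are hypothetical names, plugins are represented by name only, and the filesystem watch is replaced by a direct call to `notify`.

```go
package main

import "sync"

// watchProber rescans the plugin directory only after a change notification.
type watchProber struct {
	mu          sync.Mutex
	probeNeeded bool
	// scan lists the currently installed plugins. It is injected so the
	// filesystem can be replaced in tests (hypothetical design choice).
	scan func() ([]string, error)
}

// newWatchProber starts with probeNeeded set so the first Probe always scans.
func newWatchProber(scan func() ([]string, error)) *watchProber {
	return &watchProber{probeNeeded: true, scan: scan}
}

// notify is what a filesystem-watch callback would call when something
// changes in the plugin directory.
func (p *watchProber) notify() {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.probeNeeded = true
}

// Probe returns updated=false without touching the filesystem unless a change
// was reported since the last successful scan.
func (p *watchProber) Probe() (updated bool, plugins []string, err error) {
	p.mu.Lock()
	defer p.mu.Unlock()
	if !p.probeNeeded {
		return false, nil, nil
	}
	plugins, err = p.scan()
	if err != nil {
		return false, nil, err // probeNeeded stays set, so the next call retries
	}
	p.probeNeeded = false
	return true, plugins, nil
}
```

The point of the design is that callers can invoke `Probe` as often as they like; the comparatively expensive directory scan runs only when the watch has signalled a change.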