Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve Cache Implementation #35

Closed
leakingtapan opened this issue Oct 3, 2018 · 5 comments · Fixed by #274
Closed

Improve Cache Implementation #35

leakingtapan opened this issue Oct 3, 2018 · 5 comments · Fixed by #274
Assignees
Labels
priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete.
Milestone

Comments

@leakingtapan
Copy link
Contributor

leakingtapan commented Oct 3, 2018

The current cache implement has some issues that could be improved:

  • It requires explicit Deprioritize each time GetNext is called. The only time Deprioritize is called is after GetNext, we can combine the two methods together to help on maintainability and understanbility and reduce the chance that someone forget to call Deprioritize
  • The behavior of GetNext is nondeterministic, this is caused by sort.Sort which uses quick sort that is unstable sort.
  • Cache is not invalidated after volume is detached.

Proposal:

  • Maintain a set of device names that are current attached
  • Calling Next() each time attaching a volume. Inside Next() call, looping through all the combinations of device names and returns the first one that is not in the cache set. This won't cause any performance issue, since the loop iteration count is very small (in the scale of number of device names, less than 100)
  • Calling Remove(deviceName) each time detaching a volume. Inside Remove() call, remove the device from the device name set.
@leakingtapan leakingtapan self-assigned this Oct 3, 2018
@leakingtapan leakingtapan added this to the 0.3-alpha milestone Oct 3, 2018
@leakingtapan
Copy link
Contributor Author

/cc @bertinatto

Let me know how do you think of the problem and my proposal

@leakingtapan
Copy link
Contributor Author

Ah. Found out that this what the in-tree implements as well: https://github.com/kubernetes/kubernetes/blob/v1.12.2/pkg/cloudprovider/providers/aws/device_allocator.go#L42

@leakingtapan
Copy link
Contributor Author

/cc @jsafrane @gnufied

@bertinatto
Copy link
Member

Ah. Found out that this what the in-tree implements as well

Yes, I decoupled the device management code from k8s an created the package devicemanager package (and added tests).

The behavior of GetNext is nondeterministic, this is caused by sort.Sort which uses quick sort that is unstable sort.

This is also from in-tree. What kind of problem can we have with sort.Sort?

Cache is not invalidated after volume is detached.

I'm not sure if I'm following. The device release is deferred even when it has a taint.

@gnufied
Copy link
Contributor

gnufied commented Oct 31, 2018

I thought the reason GetNext and Deprioritize were separate functions is because we wanted to Deprioritize only after device has been successfully attached. There is already another mechanism that prevents concurrent use of same device names between attaches.

GetNext is indeed non-deterministic and feel free to change it. At the end of day though, the device allocator is a band-aid that exists solely because we have to supply device names on attach. The real fix should be in AWS API somewhere IMO.

@leakingtapan leakingtapan modified the milestones: alpha, beta Nov 5, 2018
@leakingtapan leakingtapan added the priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. label Nov 16, 2018
@leakingtapan leakingtapan modified the milestones: beta, GA Feb 11, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants