Add support for uncordoning nodes #84

cezarsa · 2020-08-19T18:29:32Z

Following the discussion in #27 (comment), it can be useful for draino to detect that a condition that triggered the cordoning of a node is no longer present.

This PR introduces the ability for draino to track which conditions triggered the cordon+drain process in an annotation named draino.planet.com/conditions.
Whenever this annotation is present, draino checks whether the recorded conditions are still present on the node. If they are no longer present, draino tries to uncordon the node, possibly skipping the drain if it has not been scheduled yet.
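
To illustrate the check described above, here is a minimal, self-contained sketch. It is not the code from this PR: the Node type is a simplified stand-in for the real Kubernetes node object, shouldUncordon is a hypothetical helper name, and the comma-separated encoding of the annotation value is an assumption. Only the annotation key comes from the description.

```go
package main

import (
	"fmt"
	"strings"
)

// conditionsAnnotation is the annotation key named in the PR description.
const conditionsAnnotation = "draino.planet.com/conditions"

// Node is a simplified, hypothetical stand-in for a Kubernetes node.
type Node struct {
	Name          string
	Unschedulable bool
	Annotations   map[string]string
	Conditions    map[string]bool // condition type -> currently present on the node
}

// shouldUncordon reports whether a cordoned node can be uncordoned because
// every condition recorded in the annotation has cleared.
// Assumption: the annotation value is a comma-separated list of condition types.
func shouldUncordon(n Node) bool {
	if !n.Unschedulable {
		return false // nothing to uncordon
	}
	recorded, ok := n.Annotations[conditionsAnnotation]
	if !ok || strings.TrimSpace(recorded) == "" {
		// No recorded conditions: the node was cordoned for some other
		// reason, so it is left alone.
		return false
	}
	for _, c := range strings.Split(recorded, ",") {
		if n.Conditions[strings.TrimSpace(c)] {
			return false // at least one triggering condition is still present
		}
	}
	return true
}

func main() {
	n := Node{
		Name:          "node-a",
		Unschedulable: true,
		Annotations:   map[string]string{conditionsAnnotation: "KernelDeadlock"},
		Conditions:    map[string]bool{"KernelDeadlock": false},
	}
	fmt.Println(n.Name, "should uncordon:", shouldUncordon(n))
}
```

In this sketch a cordoned node without the annotation is never uncordoned, so nodes cordoned manually or by other tooling stay unschedulable.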

~~I'm marking this as a draft because I'm still going to write a few unit tests, but the functionality is mostly ready and I've been able to test it on a real cluster.~~

One question: would the maintainers like me to put this feature behind a flag (e.g. --allow-uncordon)? I don't think it's dangerous to allow uncordoning, but it may be unexpected for users upgrading draino.

Fixes #27

codecov · 2020-08-19T18:32:19Z

Codecov Report

Merging #84 into master will increase coverage by 2.49%.
The diff coverage is 78.57%.

@@            Coverage Diff             @@
##           master      #84      +/-   ##
==========================================
+ Coverage   74.18%   76.67%   +2.49%     
==========================================
  Files           7        7              
  Lines         488      553      +65     
==========================================
+ Hits          362      424      +62     
+ Misses        116      115       -1     
- Partials       10       14       +4

Impacted Files	Coverage Δ
internal/kubernetes/nodefilters.go	`92.45% <ø> (-1.49%)`	⬇️
internal/kubernetes/eventhandler.go	`67.83% <76.11%> (+24.08%)`	⬆️
internal/kubernetes/drainer.go	`82.77% <88.23%> (+0.35%)`	⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 46799e2...0f4b2e1.

jacobstr

This looks really lovely @cezarsa. I've merged a large MR ahead of yours. Would you be able to rebase and resolve the conflicts here?

jacobstr · 2020-09-11T07:02:15Z

internal/kubernetes/drainer_test.go

+			expected: &core.Node{ObjectMeta: meta.ObjectMeta{Name: nodeName}},
+		},
+		{
+			name: "UncordonUnschedulableNodeWithMutator",


I was looking for a test case where a node remains unschedulable even if draino has no reason to uncordon it.

...And I found it here: https://github.com/planetlabs/draino/pull/84/files#diff-45cd5412f9ec9294054951666bce4620R159

cezarsa · 2020-09-11T15:29:51Z

Thanks for taking a look at this. I've rebased and resolved the conflicts.

cezarsa force-pushed the uncordon branch from fd5c21e to 0fc6648 Compare August 19, 2020 22:17

cezarsa marked this pull request as ready for review August 19, 2020 22:21

cezarsa force-pushed the uncordon branch from 0fc6648 to e87e4c2 Compare August 21, 2020 17:56

jacobstr approved these changes Sep 11, 2020

Add support for uncordoning nodes

0f4b2e1

cezarsa force-pushed the uncordon branch from e87e4c2 to 0f4b2e1 Compare September 11, 2020 14:15

jacobstr merged commit d92f02b into planetlabs:master Sep 11, 2020
