Fixing a case of dangling endpoint during ungraceful daemon restart #17514
Signed-off-by: Madhu Venugopal <madhu@docker.com>
When a container restarts after an ungraceful daemon restart, first clean up any unclean sandbox before trying to allocate network resources. Signed-off-by: Madhu Venugopal <madhu@docker.com>
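The ordering described above (clean up the stale sandbox first, then allocate) can be illustrated with a minimal Go sketch. This is not the code from this PR or the libnetwork API: `Controller`, `Sandbox`, `cleanupStale`, `allocate`, and `StartContainerNetwork` are all hypothetical names, and the controller is a plain in-memory map used only to show why the order matters.

```go
package main

import (
	"errors"
	"fmt"
)

// Sandbox is a hypothetical stand-in for a container's network sandbox.
type Sandbox struct {
	ContainerID string
	Endpoints   []string // endpoints still attached to this sandbox
}

// Controller is a hypothetical in-memory stand-in for a network controller.
type Controller struct {
	sandboxes map[string]*Sandbox // keyed by container ID
}

// NewController returns an empty controller.
func NewController() *Controller {
	return &Controller{sandboxes: make(map[string]*Sandbox)}
}

// cleanupStale deletes any sandbox left behind for the container
// (for example after the daemon was killed) and returns the endpoints
// that were released as a result.
func (c *Controller) cleanupStale(containerID string) []string {
	sb, ok := c.sandboxes[containerID]
	if !ok {
		return nil
	}
	released := sb.Endpoints
	delete(c.sandboxes, containerID)
	return released
}

// allocate creates a fresh sandbox and fails if one already exists,
// which is what a leftover sandbox would otherwise trigger.
func (c *Controller) allocate(containerID, endpoint string) error {
	if _, exists := c.sandboxes[containerID]; exists {
		return errors.New("sandbox already exists for container " + containerID)
	}
	c.sandboxes[containerID] = &Sandbox{
		ContainerID: containerID,
		Endpoints:   []string{endpoint},
	}
	return nil
}

// StartContainerNetwork cleans up any unclean sandbox first and only
// then allocates network resources for the restarting container.
func (c *Controller) StartContainerNetwork(containerID, endpoint string) error {
	if released := c.cleanupStale(containerID); len(released) > 0 {
		fmt.Printf("released stale endpoints for %s: %v\n", containerID, released)
	}
	return c.allocate(containerID, endpoint)
}
```

In this sketch, calling `allocate` directly while a stale sandbox is present returns an error, whereas `StartContainerNetwork` succeeds because the leftover state is removed before allocation.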
@mavenugo TestLinksHostsFilesInject is failing on userns, can you double-check?
@tiborvass I don't think this is related to this PR. I will try to retrigger userns for now, but I will also try to reproduce it to see what's going on.
@tiborvass Seems like a flaky one. It finished fine this time. PTAL
Maybe it fixes #15857 as well? I'm pretty tired of removing hundreds of namespaces and veth pairs :)
@LK4D4 We do a cleanup of leaked network resources (caused by an ungraceful restart) during daemon start. This was addressed a few days back in another PR.
LGTM
Would there be a way to write a test for that? Like killing a daemon and such 😅
@vdemeester Indeed, and we already have a similar test for mounts.
@vdemeester @LK4D4 Yes, we could. The trouble is the rest of the entourage needed to test it e2e: a KV store, a valid remote network driver that makes use of this KV store, and a container that uses that driver to attach to the network. All these tests are in libnetwork though, where we use BATS, like https://github.com/docker/libnetwork/blob/master/test/integration/dnet/overlay-consul.bats . We have plans to port all of them to docker/docker.
@LK4D4 @vdemeester As much as I want to have a test too, this one apparently was very hard to reproduce reliably. I'm fine with this PR as-is.
Confirmed this fixes the behavior.
Thanks @cpuguy83
@tiborvass Yeah, no problemo, was just thinking out loud 😝
fixes #17413
As described in #17413, it's a tricky case to reproduce, and writing an integration test (IT) for it is a bit more involved.
We rely on the e2e libnetwork IT for any overlay-related cases (and for some of the ungraceful restart scenarios).