Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixing a case of dangling endpoint during ungraceful daemon restart #17514

Merged
merged 2 commits into from
Oct 30, 2015

Conversation

mavenugo
Copy link
Contributor

fixes #17413

As described in #17413 , its a tricky reproduction case & writing an IT for this case is a bit more involved.
We rely on the e2e libnetwork IT for any overlay related cases (also some of the ungraceful restart scenarios).

Signed-off-by: Madhu Venugopal <madhu@docker.com>
When a container restarts after a ungraceful daemon restart, first
cleanup any unclean sandbox before trying to allocate network resources.

Signed-off-by: Madhu Venugopal <madhu@docker.com>
@tiborvass
Copy link
Contributor

@mavenugo TestLinksHostsFilesInject is failing on userns, can you doublecheck?

@mavenugo
Copy link
Contributor Author

@tiborvass I dont think this is related to this PR. I will try to retrigger the userns for now. But I will try to reproduce it to see whats going on.

@mavenugo
Copy link
Contributor Author

@tiborvass seems like a flaky one. it finished fine this time. PTAL

@LK4D4
Copy link
Contributor

LK4D4 commented Oct 30, 2015

Maybe it fixes #15857 as well? I'm pretty tired to remove hundreds of namespaces and veth pairs :)

@mavenugo
Copy link
Contributor Author

@LK4D4 we do a cleanup of leaked network resources (caused by ungraceful restart) during daemon start. this was addressed few days back in another PR.
This PR addresses a special case of ungraceful daemon restart as described in the commit message.

@tiborvass
Copy link
Contributor

LGTM

@vdemeester
Copy link
Member

Would there be a way to write a test on that ? Like killing a daemon & such 😅

@LK4D4
Copy link
Contributor

LK4D4 commented Oct 30, 2015

@vdemeester Indeed, and we already have similar test for mounts.

@mavenugo
Copy link
Contributor Author

@vdemeester @LK4D4 yes we could. But the trouble here is the rest of the entourage for testing it e2e. like a KV-Store and a valid remote network driver with the valid driver making use of this KV-Store and the container using that driver for attaching to the network. All these tests are in libnetwork though, where we use BATS like https://github.com/docker/libnetwork/blob/master/test/integration/dnet/overlay-consul.bats . We have plans of porting all of them to docker/docker.

@tiborvass
Copy link
Contributor

@LK4D4 @vdemeester as much as I want to have a test too, this one apparently was very hard to reproduce reliably. I'm fine with this PR as-is.

@cpuguy83
Copy link
Member

Confirmed this fixes the behavior.
LGTM

cpuguy83 added a commit that referenced this pull request Oct 30, 2015
Fixing a case of dangling endpoint during ungraceful daemon restart
@cpuguy83 cpuguy83 merged commit 17a8fbe into moby:master Oct 30, 2015
@tiborvass
Copy link
Contributor

Thanks @cpuguy83

@vdemeester
Copy link
Member

@tiborvass yeah no problemo, was just thinking out loud 😝

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Container attached to an overlay network does not restart after reboot.
6 participants