Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Per-cloud instance caps #616

Merged
merged 3 commits into from
Feb 13, 2018
Merged

Conversation

pjdarton
Copy link
Member

@pjdarton pjdarton commented Feb 9, 2018

  1. BugFix: if the plugin is talking to a very busy docker daemon, it can get more than 1meg of data back from the ListContainersCmd and throw an exception, which stops it from starting further containers.
    This code changes the parameters we pass to ListContainers so that we only ask about containers that we started ourselves, thus excluding any other containers from the returned results.
  2. BugFix: if the plugin has multiple clouds mentioning the same image, it was possible for it to get confused when counting the number of containers being constructed. This change ensures that each cloud has its own counters so there's no longer pollution between them.
  3. BugFix: the code responsible for tracking the count of containers in progress was able to loose track. We now use try/finally code to ensure it cannot get out of step even if exceptions happen.
  4. Behavioral change: cloud-level instance caps now only count instances started by "our" Jenkins and no longer count containers running within docker that weren't started by our Jenkins.
  5. Performance improvement: If you don't set a template container instance cap, we don't ask docker to count them. If you don't set a cloud-level container instance cap, we don't ask docker to count them. If you don't set either cap then we don't hammer the "ListContainers" API at all, resulting in improved speed and stability of the Jenkins cloud provisioning process overall.
  6. Logging changes: DockerCloud's "Will provision ..." log messages now provide more information, showing current counts and caps, when those caps are being used.

The docker-java ListContainersCmdExec class used by the
NettyDockerCmdExecFactory calls upon Jersey functionality to implement
the filter.  This results in a classloading exception at runtime as
we're using Netty, and not Jersey.
This error is repeated in a number of xxxCmdExec classes, but this
commit only fixes it for the ListContainers command as that's the only
one we need fixed right now.
Previously, the plugin called the docker API (twice), asking for details
of all running docker containers.  In situations where it's talking to a
busy (shared) docker swarm, this sometimes returned over a megabyte of
data causing the API call to fail (the docker-java library can only
handle a maximum of 1meg of returned data).
With this change, we now only call the docker API to ask about our own
containers (using a label filter), greatly reducing the amount of data
returned in shared situations, and we avoid calling the docker API at
all if we do not have instance limits set.
i.e. smaller docker calls, less often.
This does now mean that per-cloud instance limits only count the
containers run by "our Jenkins instance" on that cloud, and no longer
counts containers run by other Jenkins instances on the same docker.
Added unit-test for the reworked code that counts the number of
containers that are in the process of being spun up.
@pjdarton pjdarton changed the title NO READY YET: Per cloud instance caps Per-cloud instance caps Feb 9, 2018
@pjdarton
Copy link
Member Author

@ndeloof I've been running this for a few days now, it seems to work quite well.
I would recommend that it be merged (along with #617) and we do a release.

I've also come to the conclusion that the docker-java library's support of Netty is buggy - it's apparent that Netty isn't a priority for docker-java and that it isn't tested prior to release - every time I try to use functionality we've not used before, I find yet another bug. I suspect that we (the docker-plugin) are the only user of its Netty functionality and I therefore question whether or not it's worthwhile raising bug reports on it.

  • You mentioned before that you'd like to swap to another docker library: I'm coming to the conclusion that this is a good idea!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants