batches: add a Docker scheduler #790

Open
@LawnGnome

Description

From What happens when good Dockers go bad?:

We might not be able to predict the future resource usage of a container, but we can observe the usage of the containers that we’ve already started. We could use that to dynamically adjust the maximum number of parallel jobs down if it appears that we’re trending towards a memory exhaustion scenario. (Or adjust them up if there’s lots of idle CPU and free memory!)

Implementing a full-blown scheduler is probably overkill, but there are some very basic heuristics we could start with here. The main drawback is that we'd probably have to slow the spawning of the initial set of containers in order to measure what happens (it's unhelpful to start a thundering herd that exhausts memory before you can react to it), so we'd probably only want to do this when there are a significant number of workspaces and steps.
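To make the ramp-up idea concrete, here's a minimal sketch of a paced spawn loop. All of the names (`ramp_up`, `free_fraction`, `min_free`) are hypothetical, not from this codebase; the memory probe and spawn callback are injected so the pacing logic stays testable without Docker:

```python
import time


def ramp_up(spawn, free_fraction, total_jobs, interval_s=2.0,
            min_free=0.25, sleep=time.sleep):
    """Spawn containers one at a time instead of all at once.

    Before each spawn, sample the host's free-memory fraction; if it has
    dropped below min_free, stop ramping so we never start the thundering
    herd that exhausts memory before we can observe it.

    Returns the number of containers actually started.
    """
    started = 0
    for _ in range(total_jobs):
        if free_fraction() < min_free:
            break  # trending toward exhaustion; stop the ramp here
        spawn()
        started += 1
        sleep(interval_s)  # pause so the new container's usage is observable
    return started
```

In a real implementation `free_fraction` would come from the host (e.g. `/proc/meminfo`) or aggregated container stats; injecting it here is just to keep the sketch self-contained.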

If we do want to invest after actioning the earlier options, I think I'd want to start with something super simple: monitor `docker stats`, spawn a container every couple of seconds, and adjust the limit down only, based on the average memory usage of each container. I don't really want to reinvent a full-blown auto-scaler here.
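The "adjust down only" heuristic could look something like the sketch below. This is purely illustrative, assuming average per-container memory (as reported by `docker stats`) and free host memory as inputs; the function name, the `headroom` parameter, and the signature are all made up for this example:

```python
def max_parallel_jobs(avg_container_mem_bytes, available_mem_bytes,
                      current_limit, headroom=0.8):
    """Return a new parallelism cap, never raising it above current_limit.

    headroom reserves a fraction of available memory so that a spike in
    one container doesn't immediately exhaust the host.
    """
    if avg_container_mem_bytes <= 0:
        return current_limit  # no observations yet; don't adjust
    affordable = int(available_mem_bytes * headroom // avg_container_mem_bytes)
    # "adjust down only": never exceed the configured limit, always keep >= 1
    return max(1, min(current_limit, affordable))
```

For example, with containers averaging 512 MiB, 8 GiB of free memory, and a configured limit of 16, this caps parallelism at 12; when observations suggest the host can afford more than the configured limit, the limit simply stays where it is rather than scaling up.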
