Set the priority of the agent to its instance integer #539

Merged
merged 1 commit into from
Feb 28, 2019

tduffield
Contributor

@tduffield tduffield commented Feb 26, 2019

This will ensure that we optimize for leveraging the instances that are available to us, rather than unintentionally packing all jobs onto a single instance.

Fixes #540

Signed-off-by: Tom Duffield <tom@chef.io>
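
To illustrate the intended effect, here is a minimal sketch in Python. It is a toy model, not Buildkite's dispatcher: it assumes a job goes to the idle agent with the highest priority value, that ties fall back to list order, and that each agent's priority equals its per-instance number. The names `Agent`, `make_fleet`, `assign`, and the instance IDs are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Agent:
    instance: str       # host the agent runs on (hypothetical IDs)
    number: int         # per-instance agent number, 1..agents_per_instance
    priority: int = 0   # assumption: higher values are offered work first
    busy: bool = False


def make_fleet(instances, agents_per_instance, priority_from_number):
    """Build the agent list; optionally set priority to the agent number."""
    fleet = []
    for inst in instances:
        for n in range(1, agents_per_instance + 1):
            fleet.append(Agent(inst, n, n if priority_from_number else 0))
    return fleet


def assign(fleet, job_count):
    """Give each job to the idle agent with the highest priority.

    Ties are broken by fleet order, which stands in for whatever the
    real scheduler does when priorities are equal (an assumption).
    """
    placements = []
    for _ in range(job_count):
        idle = [a for a in fleet if not a.busy]
        if not idle:
            break
        chosen = max(idle, key=lambda a: a.priority)  # first maximal element
        chosen.busy = True
        placements.append(chosen.instance)
    return placements


instances = ["i-a", "i-b", "i-c"]
packed = assign(make_fleet(instances, 3, priority_from_number=False), 3)
spread = assign(make_fleet(instances, 3, priority_from_number=True), 3)
print(packed)  # ['i-a', 'i-a', 'i-a']
print(spread)  # ['i-a', 'i-b', 'i-c']
```

In this model, equal priorities let three jobs land on one instance, while numbering the priorities sends them to the top-numbered agent on three different instances.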

@tduffield
Contributor Author

@lox we're running into a significant amount of grief in our pipelines that this setting would alleviate. How do we feel about this? Would love to be able to get this merged if possible.

@lox
Contributor

lox commented Feb 28, 2019

Sorry for the delay, I should ACK these initially and let you know I'm thinking them through. Basically, what this accomplishes is a "spread" scheduling mode. I've been wondering a) whether this should just be done server-side and b) what impact this has on autoscaling where you have multiple agents per node. It is harder to scale instances in when jobs aren't packed onto a node.

That said, after thinking those things through, I'm inclined to just ship this and see how it goes. Thanks @tduffield!
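
The scale-in concern can be sketched with a toy comparison in Python. This is only an illustration under stated assumptions: an instance is treated as a scale-in candidate when none of its agents is running a job, and `place` and `idle_instances` are hypothetical helpers, not part of Buildkite or the stack.

```python
def place(instances, agents_per_instance, jobs, mode):
    """Return the instance chosen for each job under 'pack' or 'spread'."""
    if jobs > len(instances) * agents_per_instance:
        raise ValueError("more jobs than agents")
    placements = []
    for j in range(jobs):
        if mode == "pack":
            # Fill every agent on one instance before using the next one.
            placements.append(instances[j // agents_per_instance])
        else:
            # Spread: rotate through the instances, one job at a time.
            placements.append(instances[j % len(instances)])
    return placements


def idle_instances(instances, placements):
    """Instances running no jobs, i.e. candidates for scale-in."""
    used = set(placements)
    return [i for i in instances if i not in used]


instances = ["i-a", "i-b", "i-c", "i-d"]
packed_idle = idle_instances(instances, place(instances, 2, 4, "pack"))
spread_idle = idle_instances(instances, place(instances, 2, 4, "spread"))
print(packed_idle)  # ['i-c', 'i-d']
print(spread_idle)  # []
```

With four jobs on four two-agent instances, packing leaves two instances empty, while spreading leaves none, which is the trade-off raised above.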

@lox
Contributor

lox commented Feb 28, 2019

Btw, I'd love to hear a bit more about what problems you are seeing. Presumably jobs are being scheduled on one host when there are plenty of spare instances?

@lox lox merged commit f8bed43 into buildkite:master Feb 28, 2019
@tduffield
Contributor Author

tduffield commented Mar 1, 2019

@lox that's the gist of it, yes. Our issue is textbook resource contention. Some steps take very few resources. Others take much more. I don't know the mechanics of the scheduling, but we always seemed to have all the resource-heavy jobs scheduled on a single instance, maybe because they were defined back-to-back in our pipeline definition.

In an ideal world, we could define our resource requirements and Buildkite would take them into account when scheduling jobs. However, the priority setting was our best bet to at least try to spread out the load.
