Rework concurrency #123

Open. Wants to merge 2 commits into base: main

@nevillelyh commented Jun 26, 2025

  • Add concurrency.{max,current} to /health-check endpoint
    • For regular models, max is from cog.yaml and current is len(pending) of the sole runner
    • For procedure mode, max == maxRunners and current is the sum of len(pending) across all runners
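A minimal sketch of how the two fields could be derived, assuming a simplified `Runner` with a `pending` list. The `Runner` type, the `health_concurrency` helper, and the returned dict shape are hypothetical illustrations, not the project's actual code or response schema:

```python
from dataclasses import dataclass, field


@dataclass
class Runner:
    """Hypothetical stand-in for a runner; not the project's actual type."""
    max_concurrency: int                          # concurrency.max from the runner's cog.yaml
    pending: list = field(default_factory=list)   # in-flight prediction IDs


def health_concurrency(runners, procedure_mode, max_runners):
    """Build the concurrency.{max,current} pair described above."""
    if procedure_mode:
        # Procedure mode: max is maxRunners, current sums len(pending) over all runners.
        return {"max": max_runners, "current": sum(len(r.pending) for r in runners)}
    # Regular model: there is exactly one runner, and max comes from its cog.yaml.
    (runner,) = runners
    return {"max": runner.max_concurrency, "current": len(runner.pending)}
```

For a regular model with `concurrency.max = 2` and one in-flight prediction, this sketch reports `max = 2` and `current = 1`.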

For procedure mode:

  • Set maxRunners = # CPU x 4; this is also the Cog server's max concurrency
  • We respect concurrency.max in each runner's cog.yaml, which is 1 for non-async runners
  • We will spin up multiple copies of the same runner once its concurrency.max has been reached, e.g. 4 copies of an async runner with concurrency.max = 2, for a total of 8 concurrent predictions
  • However, we will reject new predictions when # pending == maxRunners, to prevent context switching, even if a runner declares a higher concurrency.max. E.g. with # CPU = 1, maxRunners = 4, so a single runner takes at most 4 predictions even if its concurrency.max = 16
  • This way, when server concurrency.current < concurrency.max, we're guaranteed to have an evictable runner slot.
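The procedure-mode rules above can be sketched as a small admission routine. This is an illustration under stated assumptions, not the PR's implementation: `Scheduler`, `submit`, and the evict-an-idle-runner step are hypothetical names and behavior chosen to show why the server-wide cap leaves an evictable slot (if fewer than maxRunners predictions are pending across maxRunners runners, at least one runner has nothing pending).

```python
import os
from dataclasses import dataclass, field


@dataclass
class Runner:
    """Hypothetical runner copy; name identifies which procedure it serves."""
    name: str
    max_concurrency: int                          # concurrency.max from cog.yaml (1 if non-async)
    pending: list = field(default_factory=list)


class Scheduler:
    """Illustrative admission logic for procedure mode (assumed, not actual code)."""

    def __init__(self, cpu_count=None):
        cpus = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
        self.max_runners = cpus * 4               # also the server-wide max concurrency
        self.runners = []

    def current(self):
        return sum(len(r.pending) for r in self.runners)

    def submit(self, name, max_concurrency, prediction_id):
        # Server-wide cap: reject once # pending == maxRunners.
        if self.current() >= self.max_runners:
            return None
        # Prefer an existing copy of this runner that still has spare capacity.
        for r in self.runners:
            if r.name == name and len(r.pending) < r.max_concurrency:
                r.pending.append(prediction_id)
                return r
        # Every copy is full (or none exists): start another copy.
        if len(self.runners) >= self.max_runners:
            # current < max_runners here, so some runner must be idle.
            idle = next(r for r in self.runners if not r.pending)
            self.runners.remove(idle)
        r = Runner(name, max_concurrency, [prediction_id])
        self.runners.append(r)
        return r
```

With `cpu_count=2` (maxRunners = 8), eight submissions for a runner with `concurrency.max = 2` produce 4 copies, and a ninth is rejected. With `cpu_count=1`, a single runner with `concurrency.max = 16` accepts 4 predictions and the fifth is rejected.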

@nevillelyh requested a review from a team as a code owner, June 26, 2025 20:41
@nevillelyh force-pushed the neville/concurrency branch 3 times, most recently from e8a6c53 to 8ff80b9, June 27, 2025 15:39
@nevillelyh force-pushed the neville/concurrency branch from 8ff80b9 to db0d861, June 27, 2025 20:04