Add maximumCapacity to taskRunner #17107

Merged
georgew5656 merged 3 commits into apache:master on Oct 7, 2024

georgew5656
Contributor

Move the logic for calculating maximumCapacity (total capacity based on max workers from autoscaling) into the task runners.

Description

The /totalWorkerCapacity endpoint can return inconsistent values when MM-less ingestion is enabled and the overlord dynamic.autoscaler config is set. This happens because TaskQueryTool.getTotalWorkerCapacity gets totalCapacity from the overlord's task runner but reads maximumCapacity directly from the dynamic config.

I think it makes sense to expose a getMaximumCapacity method on TaskRunner that defaults to -1 (the default value of maximumCapacity) and is overridden by the task runners that can compute it.
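
A minimal sketch of that idea, assuming a simplified runner interface. All type and method names other than getTotalCapacity and getMaximumCapacity are hypothetical stand-ins for illustration, not the Druid classes touched by this PR:

```java
// Hypothetical, simplified stand-ins; names and shapes are assumptions, not Druid source.
interface CapacityAwareRunner
{
  /** Capacity currently provided by the runner's workers. */
  int getTotalCapacity();

  /** Capacity if autoscaling reached its worker limit; -1 when that is unknown. */
  default int getMaximumCapacity()
  {
    return -1;
  }
}

/** A runner with no notion of a ceiling: it inherits the -1 default. */
final class StaticRunnerSketch implements CapacityAwareRunner
{
  private final int totalCapacity;

  StaticRunnerSketch(int totalCapacity)
  {
    this.totalCapacity = totalCapacity;
  }

  @Override
  public int getTotalCapacity()
  {
    return totalCapacity;
  }
}

/** A runner that knows its own ceiling and overrides the default. */
final class BoundedRunnerSketch implements CapacityAwareRunner
{
  private final int totalCapacity;
  private final int maximumCapacity;

  BoundedRunnerSketch(int totalCapacity, int maximumCapacity)
  {
    this.totalCapacity = totalCapacity;
    this.maximumCapacity = maximumCapacity;
  }

  @Override
  public int getTotalCapacity()
  {
    return totalCapacity;
  }

  @Override
  public int getMaximumCapacity()
  {
    return maximumCapacity;
  }
}

/** Reads both values from the same runner, so they cannot come from different sources. */
final class CapacityQuerySketch
{
  static int[] totalAndMaximum(CapacityAwareRunner runner)
  {
    return new int[]{runner.getTotalCapacity(), runner.getMaximumCapacity()};
  }
}
```

The point of the sketch is that the caller asks one object for both numbers, instead of mixing a runner-derived total with a config-derived maximum.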

Fixed the bug ...

Renamed the class ...

Added a forbidden-apis entry ...

To move the getMaximumCapacity logic into each TaskRunner, I had to duplicate it in HttpRemoteTaskRunner and RemoteTaskRunner, since both compute it the same way (by checking the overlord dynamic configuration). I think this is acceptable so that each task runner is responsible for its own max capacity value.
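
A rough sketch of what such a per-runner calculation might look like. It assumes that maximum capacity is the autoscaler's max worker count multiplied by a per-worker capacity, and that -1 is returned whenever either input is missing. The config and runner classes below are hypothetical stand-ins, not Druid's real types:

```java
import java.util.Optional;

// Hypothetical stand-in for an autoscaler setting; not Druid's real class.
final class SketchAutoScaler
{
  private final int maxNumWorkers;

  SketchAutoScaler(int maxNumWorkers)
  {
    this.maxNumWorkers = maxNumWorkers;
  }

  int getMaxNumWorkers()
  {
    return maxNumWorkers;
  }
}

// Hypothetical stand-in for the overlord dynamic config.
final class SketchDynamicConfig
{
  private final SketchAutoScaler autoScaler; // null when no autoscaler is configured

  SketchDynamicConfig(SketchAutoScaler autoScaler)
  {
    this.autoScaler = autoScaler;
  }

  Optional<SketchAutoScaler> getAutoScaler()
  {
    return Optional.ofNullable(autoScaler);
  }
}

// Hypothetical runner that derives its own ceiling from the dynamic config.
final class SketchRemoteRunner
{
  private final SketchDynamicConfig dynamicConfig; // null when no dynamic config has been set
  private final int capacityPerWorker;

  SketchRemoteRunner(SketchDynamicConfig dynamicConfig, int capacityPerWorker)
  {
    this.dynamicConfig = dynamicConfig;
    this.capacityPerWorker = capacityPerWorker;
  }

  /**
   * Returns -1 when no dynamic config is set, when the config has no autoscaler,
   * or when the per-worker capacity is not positive. Otherwise returns
   * max workers * capacity per worker (an assumed formula).
   */
  int getMaximumCapacity()
  {
    if (dynamicConfig == null || capacityPerWorker <= 0) {
      return -1;
    }
    return dynamicConfig.getAutoScaler()
                        .map(scaler -> scaler.getMaxNumWorkers() * capacityPerWorker)
                        .orElse(-1);
  }
}
```

Documenting the -1 cases on the method, as in the Javadoc above, keeps callers from treating -1 as a real capacity.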

Release note

  • TaskRunner now exposes a getMaximumCapacity method.

Key changed/added classes in this PR
  • TaskRunner
  • HttpRemoteTaskRunner/RemoteTaskRunner
  • KubernetesAndWorkerTaskRunner/KubernetesTaskRunner
  • TaskQueryTool

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

Contributor

@kfaraz kfaraz left a comment

Overall looks good, left some minor suggestions.

@@ -1648,6 +1649,30 @@ public int getTotalCapacity()
    return getWorkers().stream().mapToInt(workerInfo -> workerInfo.getWorker().getCapacity()).sum();
  }

  @Override
  public int getMaximumCapacity()
Contributor

Maybe add a javadoc here listing the cases when this method returns -1.

Contributor

I wonder if this method body should be factored out into code shared between RemoteTaskRunner and HttpRemoteTaskRunner.

Contributor Author

Were you thinking of introducing a new class to do this?
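
For illustration only, one shape such a shared class could take is a small static helper that both runners delegate to. The thread does not settle on a design, so the class names, the signature, and the multiplication formula below are all assumptions:

```java
// Hypothetical shared helper; name, signature, and formula are assumptions for illustration.
final class MaximumCapacitySketch
{
  static final int UNKNOWN = -1;

  private MaximumCapacitySketch()
  {
    // static utility, not instantiable
  }

  /**
   * @param maxNumWorkers     autoscaler worker limit, or null when no autoscaler is configured
   * @param capacityPerWorker task slots per worker; non-positive means unknown
   * @return maxNumWorkers * capacityPerWorker, or -1 when either input is unavailable
   */
  static int compute(Integer maxNumWorkers, int capacityPerWorker)
  {
    if (maxNumWorkers == null || maxNumWorkers < 0 || capacityPerWorker <= 0) {
      return UNKNOWN;
    }
    return maxNumWorkers * capacityPerWorker;
  }
}

// Two hypothetical runners sharing the helper instead of duplicating the method body.
final class ZkRunnerSketch
{
  private final Integer maxNumWorkers;
  private final int capacityPerWorker;

  ZkRunnerSketch(Integer maxNumWorkers, int capacityPerWorker)
  {
    this.maxNumWorkers = maxNumWorkers;
    this.capacityPerWorker = capacityPerWorker;
  }

  int getMaximumCapacity()
  {
    return MaximumCapacitySketch.compute(maxNumWorkers, capacityPerWorker);
  }
}

final class HttpRunnerSketch
{
  private final Integer maxNumWorkers;
  private final int capacityPerWorker;

  HttpRunnerSketch(Integer maxNumWorkers, int capacityPerWorker)
  {
    this.maxNumWorkers = maxNumWorkers;
    this.capacityPerWorker = capacityPerWorker;
  }

  int getMaximumCapacity()
  {
    return MaximumCapacitySketch.compute(maxNumWorkers, capacityPerWorker);
  }
}
```

The trade-off is the one raised in the description: a helper removes the duplication, while keeping the body in each runner keeps every runner fully responsible for its own value.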

@georgew5656 georgew5656 requested a review from kfaraz October 1, 2024 14:48
@georgew5656 georgew5656 merged commit 5d7c7a8 into apache:master Oct 7, 2024
90 checks passed
@adarshsanjeev adarshsanjeev added this to the 32.0.0 milestone Jan 16, 2025
4 participants