Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Let the controller getTableInstance() call to return the list of live brokers of a table. #7556

Merged
merged 11 commits into from
Oct 29, 2021

Conversation

chenboat
Copy link
Contributor

Description

Fix the issue described in #7491. Today the controller getTableInstance api gets the brokers serving a table based on the table's broker tenant config. It fetches the brokers list based only on the tenant info. Some of the brokers in the returned list may not in fact serve the table if they are just added to the tenant. This PR fixes the issue by fetching the brokers from the broker external view.

Upgrade Notes

Does this PR prevent a zero down-time upgrade? (Assume upgrade order: Controller, Broker, Server, Minion)

  • Yes (Please label as backward-incompat, and complete the section below on Release Notes)

Does this PR fix a zero-downtime upgrade introduced earlier?

  • Yes (Please label this as backward-incompat, and complete the section below on Release Notes)

Does this PR otherwise need attention when creating release notes? Things to consider:

  • New configuration options
  • Deprecation of configurations
  • Signature changes to public methods/interfaces
  • New plugins added or old plugins removed
  • Yes (Please label this PR as release-notes and complete the section on Release Notes)

Release Notes

Documentation

@chenboat chenboat requested a review from Jackie-Jiang October 11, 2021 19:21
@codecov-commenter
Copy link

codecov-commenter commented Oct 11, 2021

Codecov Report

Merging #7556 (f13b213) into master (9ef1f49) will decrease coverage by 2.27%.
The diff coverage is 60.00%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master    #7556      +/-   ##
============================================
- Coverage     72.67%   70.39%   -2.28%     
- Complexity     3413     4021     +608     
============================================
  Files          1524     1578      +54     
  Lines         75702    80219    +4517     
  Branches      11039    11908     +869     
============================================
+ Hits          55017    56473    +1456     
- Misses        16995    19878    +2883     
- Partials       3690     3868     +178     
Flag Coverage Δ
integration1 ?
integration2 27.73% <0.00%> (-1.41%) ⬇️
unittests1 68.61% <ø> (-1.28%) ⬇️
unittests2 14.54% <60.00%> (-0.73%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
.../controller/api/resources/PinotTableInstances.java 0.00% <0.00%> (ø)
...ntroller/helix/core/PinotHelixResourceManager.java 63.84% <64.28%> (-1.08%) ⬇️
...pinot/minion/exception/TaskCancelledException.java 0.00% <0.00%> (-100.00%) ⬇️
...nverttorawindex/ConvertToRawIndexTaskExecutor.java 0.00% <0.00%> (-100.00%) ⬇️
...plugin/segmentuploader/SegmentUploaderDefault.java 0.00% <0.00%> (-87.10%) ⬇️
...ore/startree/executor/StarTreeGroupByExecutor.java 0.00% <0.00%> (-86.67%) ⬇️
.../transform/function/MapValueTransformFunction.java 0.00% <0.00%> (-85.30%) ⬇️
...ot/common/messages/RoutingTableRebuildMessage.java 0.00% <0.00%> (-81.82%) ⬇️
...e/pinot/common/minion/MergeRollupTaskMetadata.java 0.00% <0.00%> (-78.27%) ⬇️
...ache/pinot/common/lineage/SegmentLineageUtils.java 22.22% <0.00%> (-77.78%) ⬇️
... and 336 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9ef1f49...f13b213. Read the comment docs.

@chenboat chenboat requested a review from yupeng9 October 11, 2021 21:40
Copy link
Contributor

@mqliang mqliang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor comments. LGTM otherwise

return getTableToBrokersMapFromExternalView(true, false);
}

private Map<String, List<String>> getTableToBrokersMapFromExternalView(boolean tableWithType, boolean useUrlFormat) {
Map<String, Set<String>> brokerUrlsMap = new HashMap<>();
try {
byte[] brokerResourceNodeData = _zkClient.readData("/EXTERNALVIEW/brokerResource", true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can reuse the BROKER_EXTERNAL_VIEW_PATH constant to replace String "/EXTERNALVIEW/brokerResource" here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

public Map<String, List<String>> getTableToBrokersMap() {
return getTableToBrokersMapFromExternalView(false, true);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nil: newline

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

public List<String> getLiveBrokersForTable(String tableNameWithType) {
ExternalViewReader externalViewReader = new ExternalViewReader(new ZkClient(_helixZkURL));
Map<String, List<String>> tableToBrokersMap = externalViewReader.getTableWithTypeToRawBrokerInstanceIdsMap();
return tableToBrokersMap == null ? null : tableToBrokersMap.get(tableNameWithType);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shall we return empty map other than null?

Copy link
Contributor Author

@chenboat chenboat Oct 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Revised to return empty map.

@chenboat chenboat requested a review from yupeng9 October 12, 2021 18:31
Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not modify the existing API and change its behavior. Users might rely on it for other purpose other than routing the queries.
I'd suggest adding another API to get the live brokers for a given table

@ApiResponse(code = 500, message = "Internal server error")})
public String getTableBrokers(
@ApiParam(value = "Table name without type", required = true) @PathParam("tableName") String tableName) {
ObjectNode ret = JsonUtils.newObjectNode();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not reuse the response format for this new API. Would suggest just returning an array of live brokers.

  • If the input is a table name with type, returns the live brokers for the table, or 404 if table is not found
  • If the input is a raw table name, if both offline and realtime table exist, return the brokers live for both tables (we should only route query to the broker with both tables online); if only one table exist, return the live brokers for the single table; if no table is found, return 404

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should make this API to return as close to what the zookeeper stores as possible. Today the zk stores one typed table entries. It is better to returns the typed table entries as well. In this way this API can serve multiple purposes than just querying routing. For example, one may even use it to detect if there is any problem with hybrid table's broker assignment. We can leave it to the client to do what they want to do with the returned results -- they can do intersection if they want to.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, 99% of the usage for this API would be to find the brokers to route the query, and returning brokers for each individual table and let the user do the intersection is not easy to use, and can confuse the user. If we do want to get the brokers for the individual table, we can put the table name with type. This way, the API is easy to use, and it can also achieve the debugging purpose.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactor to return the list of live brokers given a tablename with type.

@chenboat chenboat requested a review from Jackie-Jiang October 26, 2021 23:38
Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM otherwise

@chenboat chenboat merged commit fbb9e8e into apache:master Oct 29, 2021
@chenboat chenboat deleted the table_broker_fix branch November 1, 2021 17:29
kriti-sc pushed a commit to kriti-sc/incubator-pinot that referenced this pull request Dec 12, 2021
… brokers of a table. (apache#7556)

* Let getTableInstance() call to return the list of live brokers.

* Fix lint errors.

* Fix lint errors.

* Fix unit tests

* Revise based on comments.

* Fix based on comments.

* Add a new controller end point.

* Fix linters.

* Revise based on comments

* Revise based on comments

* Revised based on comments.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants