Conversation
Codecov Report
@@ Coverage Diff @@
## master #9662 +/- ##
=============================================
- Coverage 70.42% 25.12% -45.31%
+ Complexity 5662 44 -5618
=============================================
Files 1991 1979 -12
Lines 107251 106924 -327
Branches 16302 16269 -33
=============================================
- Hits 75533 26861 -48672
- Misses 26444 77351 +50907
+ Partials 5274 2712 -2562
Flags with carried forward coverage won't be shown. Click here to find out more.
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
| if (idealState == null) { | ||
| throw new IllegalStateException("ideal state is empty"); | ||
| } | ||
| if (externalView != null && isExternalViewConverged(tableNameWithType, externalView.getRecord().getMapFields(), |
There was a problem hiding this comment.
Do we want to consider the situation where external view never converges with what ongoing rebalancer expects or external view ends up having segments in ERROR state ?
Looks like we will continue returning IN_PROGRESS in such scenarios and that could be misleading ? Probably also depends on what is the call to action for the user/admin who uses this API...
There was a problem hiding this comment.
HI @siddharthteotia, sorry I updated the above code before reading this comment.Currently, the external view returns false if it is not converged yet and logs an error if the external view ends up having segments in error .I believe it is best to return the status FAILED instead of logging. Also, would it be ok to return NO_OP if the external view is still null when the api is called .cc @Jackie-Jiang
|
|
||
| if (isExternalViewConverged(tableNameWithType, externalView.getRecord().getMapFields(), | ||
| idealState.getRecord().getMapFields(), true)) { | ||
| return RebalanceResult.Status.DONE; |
There was a problem hiding this comment.
I am not sure if it makes sense / helpful enough to sort of implement a stateless API to get status of rebalance task -- which was triggered separately through some other API and is being done asynchronously on potentially a different controller that getStatus() API call comes on
Relying on the fact that ExternalView is not the same as IdealState to indicate Rebalance is IN_PROGRESS could be misleading.
Segment assignment changes ideal state as soon as segment lands on controller and up until server successfully downloads the segment etc, ExternalView does not get updated. But this API will return rebalance is in IN_PROGRESS even though rebalance may have already finished or could have errored out etc.
So not sure if this is going to be super helpful / accurate.
There was a problem hiding this comment.
HI, @siddharthteotia Yes it will be quite misleading. Checking for IN_PROGRESS seems tricky here.
The possible way I can think for now through testing I have done on TableRebalancerClusterStatelessTest is to check on the external view from being null to being allocated segments assignments.
For DONE status, I will run the rebalance in dry mode and retrieve the new assignments from the result and use is checking if the extervalViewConverged . Kindly let me know you thoughts on it . cc @Jackie-Jiang
Jackie-Jiang
left a comment
There was a problem hiding this comment.
The API added in this PR basically checks if the table's EV matches IS. This itself cannot represent the status of rebalance because rebalance can be done in multiple steps, and in the middle EV can match IS. But this is also a quite useful endpoint for status tracking purpose.
I'd suggest changing the API to return whether the table has EV converged, and if not, list down the segments that do not have the same state in EV and IS.
|
@Jackie-Jiang Please review with the new update |
| @ApiParam(value = "Name of the table", required = true) @PathParam("tableName") String tableName, | ||
| @ApiParam(value = "OFFLINE|REALTIME", required = true) @QueryParam("type") String tableTypeStr) { | ||
| try { | ||
| String tableNameWithType = constructTableNameWithType(tableName, tableTypeStr); |
There was a problem hiding this comment.
(code format) Please reformat the changes with Pinot Style
| } | ||
|
|
||
| @GET | ||
| @Path("/tables/{tableName}/rebalance/status") |
There was a problem hiding this comment.
Let's make the API more specific, and it might be better to put it under the segment rest API: /segments/{tableName}/externalViewMismatch and returns a map from the mismatch segment to its ideal state and external view
|
|
||
| public Map<String, MapDifference.ValueDifference<Map<String, String>>> rebalanceTableStatus(String tableNameWithType) | ||
| throws TableNotFoundException { | ||
| TableConfig tableConfig = getTableConfig(tableNameWithType); |
There was a problem hiding this comment.
No need to read the table config. We should check both IS and EV exist
| } | ||
| } | ||
|
|
||
| public Map<String, MapDifference.ValueDifference<Map<String, String>>> rebalanceStatus(String tableNameWithType, |
There was a problem hiding this comment.
Don't put this into the TableRebalancer. We can directly have it in PinotHelixResourceManager
| try { | ||
| tableType = TableType.valueOf(tableTypeStr.toUpperCase()); | ||
| } catch (Exception e) { | ||
|
|
There was a problem hiding this comment.
(nit) Please revert some unnecessary empty line changes in this PR
|
Hi @Jackie-Jiang , Please review with the new update |
Jackie-Jiang
left a comment
There was a problem hiding this comment.
The overall direction looks good
| @ApiOperation(value = "Table segment mismatch after rebalance", notes = "Get segments without the same state in EV " | ||
| + "and IS ") |
There was a problem hiding this comment.
(minor) Let's not mention rebalance here
| @ApiOperation(value = "Table segment mismatch after rebalance", notes = "Get segments without the same state in EV " | |
| + "and IS ") | |
| @ApiOperation(value = "Get segments with mismatched state in EV and IS", | |
| notes = "Get segments without the same state in EV and IS ") |
| try { | ||
| TableType tableType = Constants.validateTableType(tableTypeStr); | ||
|
|
||
| if (tableType == null) { | ||
| throw new ControllerApplicationException(LOGGER, "Table type must not be null", Status.BAD_REQUEST); | ||
| } | ||
| String tableNameWithType = TableNameBuilder.forType(TableType.valueOf(tableTypeStr)).tableNameWithType(tableName); | ||
|
|
||
| if (!_pinotHelixResourceManager.hasTable(tableNameWithType)) { | ||
| throw new TableNotFoundException(String.format("Table=%s not found", tableName)); | ||
| } | ||
|
|
There was a problem hiding this comment.
We already have util to perform the same checks. You may refer to listSegmentLineage()
TableType tableType = Constants.validateTableType(tableTypeStr);
if (tableType == null) {
throw new ControllerApplicationException(LOGGER, "Table type should either be offline or realtime",
Response.Status.BAD_REQUEST);
}
String tableNameWithType =
ResourceUtils.getExistingTableNamesWithType(_pinotHelixResourceManager, tableName, tableType, LOGGER).get(0);
|
|
||
| _helixResourceManager.deleteOfflineTable(TIERED_TABLE_NAME); | ||
| } | ||
| @Test |
There was a problem hiding this comment.
Suggest not adding the rebalance status check test in this PR. This PR is focusing on the EV mismatch, and we can use it to build the rebalance status check in a separate PR and add the test there.
| try { | ||
| tableType = TableType.valueOf(tableTypeStr.toUpperCase()); | ||
| } catch (Exception e) { | ||
|
|
There was a problem hiding this comment.
(minor) Revert this change
| String rawTableName = TableNameBuilder.extractRawTableName(tableNameWithType); | ||
| boolean hasPreConfiguredInstancePartitions = TableConfigUtils.hasPreConfiguredInstancePartitions(tableConfig, | ||
| instancePartitionsType); | ||
| boolean hasPreConfiguredInstancePartitions = |
There was a problem hiding this comment.
(minor) Revert the changes in this file since it is not related
| ControllerTest.sendPostRequest(createTableUrl, testTableConfig.toJsonString()); | ||
|
|
||
| // Rebalance table | ||
| ControllerTest.sendPostRequest( |
There was a problem hiding this comment.
There is no segment in this table, so both rebalance and segment mismatch will be no-op. Let's try to generate a non-empty IS and EV with some diff, and verify the behavior
| Map<String, MapDifference.ValueDifference<Map<String, String>>> view = | ||
| _pinotHelixResourceManager.getExternalViewSegementMismatch(tableNameWithType); | ||
| ObjectNode data = JsonUtils.newObjectNode(); | ||
| data.put("segments", view.toString()); |
There was a problem hiding this comment.
I feel we might want the response to be the following format (segmentName -> EV/IS -> different instanceState):
{
"segment0": {
"IdealState": {
"server0": "ONLINE",
"server1": "ONLINE"
},
"externalView": {
"server1": "ERROR",
"server2": "ONLINE"
}
},
"segment1": {
"IdealState": {
"server1": "ONLINE"
},
"externalView": {
"server1": "ERROR"
}
}
}
|
Hi @Jackie-Jiang , an update has been provided for this pr. Kindly review |
|
Decided to move forward with #10359 |
Description:
This PR adds an api for checking table rebalance status
This PR will solve the following issue --#9558
cc @Jackie-Jiang