Enhancing Table Size API to return table size per replica#10812
Enhancing Table Size API to return table size per replica#10812mayankshriv merged 4 commits intoapache:masterfrom
Conversation
2. Code Clean up & removing ununsed param "detailed" in REST API
Codecov Report
@@ Coverage Diff @@
## master #10812 +/- ##
=============================================
- Coverage 70.26% 24.01% -46.26%
+ Complexity 6517 49 -6468
=============================================
Files 2164 2154 -10
Lines 116312 116255 -57
Branches 17590 17606 +16
=============================================
- Hits 81731 27918 -53813
- Misses 28869 85392 +56523
+ Partials 5712 2945 -2767
Flags with carried forward coverage won't be shown. Click here to find out more.
... and 1633 files with indirect coverage changes 📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
mayankshriv
left a comment
There was a problem hiding this comment.
Not required in the PR, but I feel, enhancing the api to also report the compressed size in deepstore would be quite useful.
| } | ||
| } | ||
|
|
||
| // Set the top level sizes to DEFAULT_SIZE_WHEN_MISSING_OR_ERROR when all segments are error |
There was a problem hiding this comment.
could you please help me understand the reason for this decision?
There was a problem hiding this comment.
To explicitly say in response that segments are in error ( and its not 0 sized ). However, the much more correct way for the caller to identify is to look into "missingSegment" property.
With the existing implementation, this would return as "-2" which kind of diverges from the default error value we use , so I move this section at the end
There was a problem hiding this comment.
Can you please help me understand why the condition is "when all segments are error". What is the major difference between "when all segments are error" and "when some segments are in error state"?
Say that a table has 1k segments, the current logic set the top level size to -1 when all 1k segments are in error state, but will not set the level size to -1 when only 1 segment is in good state. What is the major difference here?
There was a problem hiding this comment.
Assume the case for hybrid Table with all the segments missing .
The current implementation would have value as "-2" ( This is due to both realtime & offline block adding "-1" ). This is what I have tried to explain in my previous comment
| } | ||
|
|
||
| // Set the top level sizes to DEFAULT_SIZE_WHEN_MISSING_OR_ERROR when all segments are error | ||
| if ((hasRealtimeTableConfig && hasOfflineTableConfig && isMissingAllRealtimeSegments && isMissingAllOfflineSegments) |
There was a problem hiding this comment.
- we can remove "(hasRealtimeTableConfig && hasOfflineTableConfig && isMissingAllRealtimeSegments && isMissingAllOfflineSegments)" if we want to keep this logic
- could you please help me understand why we want to set the the top level sizes to DEFAULT_SIZE_WHEN_MISSING_OR_ERROR when either the offline or realtime table has problems?
There was a problem hiding this comment.
- We would require this condition , else it would fail for hybrid table case when All realtime segments are good, while all offline segments are missing & vice versa case as well .
- Explained the reason here : https://github.com/apache/pinot/pull/10812/files#r1213281580 . It would be set "-1" when all existing segments of the table are missing
There was a problem hiding this comment.
- Thanks for the Truth Table way. It helped to catch my error in a boolean algebraic fashion. Fixed it
- replied above
pinot-controller/src/main/java/org/apache/pinot/controller/util/TableSizeReader.java
Outdated
Show resolved
Hide resolved
Sure @mayankshriv . Yes , this would be useful. Created a issue for this #10828 |
zhtaoxiang
left a comment
There was a problem hiding this comment.
It looks good to me now. thanks for enhancing it.
|
This PR breaks the |

PR contains following changes :
Related Discussion : https://apache-pinot.slack.com/archives/C011C9JHN7R/p1684396114803769
cc : @mayankshriv