Skip to content

Allow moveToFinalLocation in METADATA push based on config#8823

Merged
npawar merged 7 commits intoapache:masterfrom
npawar:fix_metadata_2
Jun 15, 2022
Merged

Allow moveToFinalLocation in METADATA push based on config#8823
npawar merged 7 commits intoapache:masterfrom
npawar:fix_metadata_2

Conversation

@npawar
Copy link
Contributor

@npawar npawar commented Jun 2, 2022

METADATA push didn't allow the option of moveSegmentToFinalLocation. This meant that if someone had generated segments in a location that was not the deep store, there was absolutely no way to move those segments into deep store without manual scripting.
Chatting with @xiangfu0 , I understood that this was done on purpose, in order to keep METADATA push very light weight and not involve copying from outputDir to dataDir in the push. But this is a very valid scenario where users would want to generate segments in a location other than deep store, and still want to use METADATA push (reasons could be multiple, as listed in #7328).

While there have been more elaborate solutions proposed in the issue above (like periodic background tasks), this PR simply enables one to allow METADATA push to do moveToFinalLocation. And while this would increase the data push time, it is a better solution than asking users to switch to SegmentTarPush or SegmentUriPush, especially if the issue was disparate permissions between outputDir ad dataDir (one of the reasons mentioned in the issue).

Backward compatible behavior or this being false by default is maintained.

Added an integration test.

@npawar npawar added release-notes Referenced by PRs that need attention when compiling the next release notes Configuration Config changes (addition/deletion/change in behavior) labels Jun 2, 2022
@npawar npawar marked this pull request as ready for review June 3, 2022 02:35
@codecov-commenter
Copy link

codecov-commenter commented Jun 3, 2022

Codecov Report

Merging #8823 (106c0e7) into master (f3bde9f) will decrease coverage by 3.17%.
The diff coverage is 0.00%.

@@             Coverage Diff              @@
##             master    #8823      +/-   ##
============================================
- Coverage     69.60%   66.43%   -3.18%     
+ Complexity     4997     4669     -328     
============================================
  Files          1806     1355     -451     
  Lines         94202    68423   -25779     
  Branches      14050    10677    -3373     
============================================
- Hits          65571    45456   -20115     
+ Misses        24072    19738    -4334     
+ Partials       4559     3229    -1330     
Flag Coverage Δ
integration1 ?
integration2 ?
unittests1 66.43% <0.00%> (+0.05%) ⬆️
unittests2 ?

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...e/pinot/common/utils/FileUploadDownloadClient.java 18.18% <ø> (-40.91%) ⬇️
...he/pinot/segment/local/utils/SegmentPushUtils.java 12.50% <0.00%> (-0.21%) ⬇️
...he/pinot/spi/ingestion/batch/spec/PushJobSpec.java 52.00% <0.00%> (-7.10%) ⬇️
...va/org/apache/pinot/core/routing/RoutingTable.java 0.00% <0.00%> (-100.00%) ⬇️
...va/org/apache/pinot/common/config/NettyConfig.java 0.00% <0.00%> (-100.00%) ⬇️
...a/org/apache/pinot/common/metrics/MinionMeter.java 0.00% <0.00%> (-100.00%) ⬇️
...g/apache/pinot/common/metrics/ControllerMeter.java 0.00% <0.00%> (-100.00%) ⬇️
.../apache/pinot/common/metrics/BrokerQueryPhase.java 0.00% <0.00%> (-100.00%) ⬇️
.../apache/pinot/common/metrics/MinionQueryPhase.java 0.00% <0.00%> (-100.00%) ⬇️
...apache/pinot/common/helix/ExtraInstanceConfig.java 0.00% <0.00%> (-100.00%) ⬇️
... and 712 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f3bde9f...106c0e7. Read the comment docs.

@npawar npawar requested review from Jackie-Jiang and xiangfu0 June 4, 2022 23:24
Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Only non-blocking comments

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest not introducing upload type into this module. When it is configured to copy the segment to the deep store, we can check if sourceDownloadURIStr is provided, and copy accordingly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest not introducing upload type into this module. When it is configured to copy the segment to the deep store, we can check if sourceDownloadURIStr is provided, and copy accordingly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed all other comments. For this one, I kinda prefer keeping the uploadType, so it is very clear when reading. Don't want to rely on sourceDownloadURIStr as it can be non-null even in URI push case, but we want to use segment file in that case.

Copy link
Contributor

@walterddr walterddr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good to me. i only have some minor questions. please kindly take a look

@npawar npawar merged commit 378bdec into apache:master Jun 15, 2022
@npawar npawar deleted the fix_metadata_2 branch June 15, 2022 21:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Configuration Config changes (addition/deletion/change in behavior) release-notes Referenced by PRs that need attention when compiling the next release notes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants