SegmentDeletionManager assumes segment is directly under table prefix in deep store #14122

Open
dd-willgan opened this issue Sep 30, 2024 · 3 comments

Comments

@dd-willgan
Contributor

Hi Pinot team, my company recently found that expired segments were not being deleted from the deep store. The cause is that Pinot (in SegmentDeletionManager) assumes each segment sits directly under the deep store directory for the given table, i.e. <dataDir>/<rawTableName>/<segment>. In our case the segments were uploaded to subdirectories within the table directory, e.g. <dataDir>/<rawTableName>/<partition>/<segment>. Would it be possible to try deleting the URI stored in the segment ZK metadata as a fallback?
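To make the mismatch concrete, here is a minimal Java sketch of the two layouts. It is not Pinot's actual code: the class and method names are hypothetical, and the bucket, table, partition, and segment names are made up for illustration.

```java
import java.net.URI;

// Hypothetical illustration only; not the actual SegmentDeletionManager code.
final class DeepStorePathSketch {

    // Location a segment is assumed to have: <dataDir>/<rawTableName>/<segment>.
    static URI assumedSegmentUri(String dataDir, String rawTableName, String segmentName) {
        // Drop a trailing slash so the joined path has no empty component.
        String base = dataDir.endsWith("/") ? dataDir.substring(0, dataDir.length() - 1) : dataDir;
        return URI.create(base + "/" + rawTableName + "/" + segmentName);
    }

    public static void main(String[] args) {
        URI assumed = assumedSegmentUri("s3://bucket/pinot", "myTable", "myTable_seg_0");
        // Where a push into a per-partition subdirectory would leave the file (made-up example).
        URI stored = URI.create("s3://bucket/pinot/myTable/partition_3/myTable_seg_0");
        System.out.println("assumed: " + assumed);
        System.out.println("stored:  " + stored);
        System.out.println("same location: " + assumed.equals(stored));
    }
}
```

Since the two URIs differ, a delete issued against the assumed location would leave the file under the partition subdirectory in place.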

@Jackie-Jiang
Contributor

Trying to get more context here. Do you use metadata push to upload segments?
I think the underlying implication here is that if the data is purposely put in a separate directory, Pinot doesn't delete it in case the user wants to keep it around. But I guess we could introduce a config for Pinot to not delete the file in deep store (false by default).

@dd-willgan
Contributor Author

Hey @Jackie-Jiang, yes, we use SegmentMetadataPushJobRunner. I see. I would be okay with adding a flag to control this behavior, maybe something like controller.segment.delete.useStoredUri
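A rough sketch of how such a flag could gate the choice of deletion target, as one possible reading of the idea. Everything here is an assumption: the flag is only the name suggested above and is not an existing Pinot config, and the class, method, and use of java.util.Properties are hypothetical stand-ins for the controller's real configuration and ZK metadata objects.

```java
import java.net.URI;
import java.util.Properties;

// Hypothetical sketch; none of these names are existing Pinot APIs.
final class DeletionTargetSketch {

    // Flag name as suggested in this thread; it is a proposal, not a real config key.
    static final String USE_STORED_URI_KEY = "controller.segment.delete.useStoredUri";

    // Picks the URI to delete: the one recorded in segment ZK metadata when the
    // flag is enabled and a URI is available, otherwise the assumed default location.
    static URI deletionTarget(Properties conf, URI assumedUri, URI storedUri) {
        // Defaults to false so behavior is unchanged unless the flag is set.
        boolean useStoredUri = Boolean.parseBoolean(conf.getProperty(USE_STORED_URI_KEY, "false"));
        if (useStoredUri && storedUri != null) {
            return storedUri;
        }
        return assumedUri;
    }
}
```

Defaulting the flag to false keeps the current behavior for anyone who deliberately stores files outside the table directory and wants them left alone.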

@Jackie-Jiang
Contributor

cc @swaminathanmanish

2 participants