Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix setTtl workflow and ttlbucket bugs #16933

Merged
merged 17 commits into from
Mar 13, 2023
Merged

Conversation

lucyge2022
Copy link
Contributor

What changes are proposed in this pull request?

1)setttl cmd on a big directory unnecessary setattr recursively, while dir ttl already supercedes children ttl. there's absolutely no reason to save ttl on children node individually.
2)ttlbucket unnecessarily saves Inode obj on heap unboundedly, where it could and should only save inodeid and load inode from inodestore to double check expiry during inodettlchecker processing.
3)fix race condition on ttlbucketlists from ttlchecker and insertion from foreground.
4)fix case where any expiration of inodes failed during ttlchecker it will never be retried again.

Why are the changes needed?

Reduce heavy GC when setttl on a huge folder / bug fixes as indicated above for ttl

Does this PR introduce any user facing changes?

yes. So now setTtl command on a dir will not set ttl attributes for all its children.

@alluxio-bot
Copy link
Contributor

Automated checks report:

  • Commits associated with Github account: PASS
  • PR title follows the conventions: FAIL
    • The title of the PR does not pass all the checks. Please fix the following issues:
      • First word of title ("Ttl") is not an imperative verb. Please use one of the valid words

Some checks failed. Please fix the reported issues and reply 'alluxio-bot, check this please' to re-run checks.

@lucyge2022 lucyge2022 changed the title Ttl bugfix Fix setTtl workflow and ttlbucket bugs Feb 23, 2023
@alluxio-bot
Copy link
Contributor

Automated checks report:

  • Commits associated with Github account: PASS
  • PR title follows the conventions: PASS

All checks passed!

and inode should load on inodettlchecker processing and be
double checked on expiry.
2. setttl cmd should not set recursively on a directory, causing
massive unnecessary setattr on children while dir ttl already
supercede children ttl.
3. any inode failed during inodettlchecker process should add back
to ttlbuckets for retry
4. Master.TTLInodes metrics can just be aggregated at time of gauging
instead of atomic counter as it is not accurate during concurrent
modification to ttlbuckets.
5. Fix race condition in TtlBucketList.insert(inode) where it might
add to a bucket already got removed by ttlchecker (ALLUXIO-2821)
6. ttlbuckets should be polled first in each ttlchecker round to
avoid concurrent modification while processing
7. add test cases for respecting dir ttl over children ttl.
Copy link
Contributor

@jiacheliu3 jiacheliu3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added two side notes. I actually didn;t look into the details, but if you think my review is relevant then I can take a much deeper look in a bit. Thanks!

@lucyge2022
Copy link
Contributor Author

added two side notes. I actually didn;t look into the details, but if you think my review is relevant then I can take a much deeper look in a bit. Thanks!

absolutely, feel free to take a deeper look! As I would really like to be reminded of anything regression from other components in master / in general.

Copy link
Contributor

@jiacheliu3 jiacheliu3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a bunch of comments, PTAL. Thanks! Most of them are minor except a few.

Copy link
Contributor

@jiacheliu3 jiacheliu3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks a lot for the work!

@yuzhu
Copy link
Contributor

yuzhu commented Mar 13, 2023

alluxio-bot, merge this please

@alluxio-bot alluxio-bot merged commit 7313853 into Alluxio:master Mar 13, 2023
@Xenorith Xenorith added the area-master Alluxio master label Mar 22, 2023
jiacheliu3 pushed a commit to jiacheliu3/alluxio that referenced this pull request May 17, 2023
### What changes are proposed in this pull request?

1)setttl cmd on a big directory unnecessary setattr recursively, while
dir ttl already supercedes children ttl. there's absolutely no reason to
save ttl on children node individually.
2)ttlbucket unnecessarily saves Inode obj on heap unboundedly, where it
could and should only save inodeid and load inode from inodestore to
double check expiry during inodettlchecker processing.
3)fix race condition on ttlbucketlists from ttlchecker and insertion
from foreground.
4)fix case where any expiration of inodes failed during ttlchecker it
will never be retried again.

### Why are the changes needed?

Reduce heavy GC when setttl on a huge folder / bug fixes as indicated
above for ttl

### Does this PR introduce any user facing changes?

yes. So now setTtl command on a dir will not set ttl attributes for all
its children.

pr-link: Alluxio#16933
change-id: cid-36ec524b2178220886cccb7a82de119bdcc8203f
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API Change Changes covering public API area-master Alluxio master
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants