Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allowing empty segments with no offset advancing #12703

Open
jadami10 opened this issue Mar 22, 2024 · 4 comments
Open

Allowing empty segments with no offset advancing #12703

jadami10 opened this issue Mar 22, 2024 · 4 comments

Comments

@jadami10
Copy link
Contributor

jadami10 commented Mar 22, 2024

This is a followup from #8929, but in the case of 0 data being consumed. We've since found a poor interaction between Pinot and our s3 lifecycling.

  • we have a few partitions that have 0 data all the time (not filtered out, literally 0 events)
  • Pinot keeps that consuming segment open indefinitely (in this case we noticed it had been 1 year)
  • Pinot also keeps the last completed segment for each partition (in this case, it's from 2023)
  • We wanted to use s3 lifecycling to delete all data < 30 days old (our tables had 10 day retention) rather than just rely on Pinot's mechanisms, but this sent segments into error state since they had been around for 1 year

Is there any reason we can't seal a segment where the offset hasn't advanced? In this case, we would have had N segments for this partition all with 0 records and the same start/end offset.

cc @Jackie-Jiang @priyen-stripe

@Jackie-Jiang
Copy link
Contributor

I think we don't seal them right now because we don't support empty segment before. Since we can support empty segment now, we should be able to seal them. We want to revisit the timestamp used for empty segment (using current time should work) so that retention manager can remove them properly.

@tibrewalpratik17
Copy link
Contributor

tibrewalpratik17 commented Jun 7, 2024

Hey @Jackie-Jiang any updates on this?

Is this as straight-forward as removing these lines?

if (!_hasMessagesFetched) {
_segmentLogger.info("No events came in, extending time by {} hours", TIME_EXTENSION_ON_EMPTY_SEGMENT_HOURS);
_consumeEndTime += TimeUnit.HOURS.toMillis(TIME_EXTENSION_ON_EMPTY_SEGMENT_HOURS);
return false;
}

Saw this in one of our tables today. The table existed for more than a year and we enabled snapshot on it recently. Now since the last consuming segment (which was like months old) never received any message, the snapshot flow didn't run and that entire partition was knocked out of upsert-compaction flow. [Update] I have tried to fix this issue for partial-upsert tables here - #13285 (comment)

@Jackie-Jiang
Copy link
Contributor

Is this as straight-forward as removing these lines?

I think so.

The drawback is that we might end up with a lot of segments that are all empty, but I guess we can make retention manager keep cleaning up empty segments (it might already do) as long as they are not the last completed segment.

@jadami10
Copy link
Contributor Author

One other issue we've seen (QA only so far) from our multi-kafka cluster setup

  • table is created and creates as segment for partition A on cluster 1
  • partition A has no data
  • cluster 1 is later removed for this topic, so Pinot should no longer get this partition
  • Pinot keeps trying to consume because the segment is open indefinitely and spews endless errors

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants