Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[blobstore] Filter out expired blobs from query #692

Merged
merged 2 commits into from
Oct 2, 2024

Conversation

ian-shim
Copy link
Contributor

@ian-shim ian-shim commented Aug 9, 2024

Why are these changes needed?

Updates GetBlobMetadataByStatusXXX methods so that it doesn't fetch any expired blobs (determined by filtering on RequestedAt > Now - TTL).

Checks

  • I've made sure the lint is passing in this PR.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, in that case, please comment that they are not relevant.
  • Testing Strategy
    • Unit tests
    • Integration tests
    • This PR is not tested :(

@ian-shim ian-shim marked this pull request as ready for review August 9, 2024 05:47
@ian-shim
Copy link
Contributor Author

This works as long as we don't change the TTL, but this would break if TTL changes - which is defined onchain.
I think the better solution is to either create an index based on Expiry attribute or delete the expired blobs. Any thoughts @dmanc ?

@dmanc
Copy link
Contributor

dmanc commented Aug 13, 2024

This works as long as we don't change the TTL, but this would break if TTL changes - which is defined onchain. I think the better solution is to either create an index based on Expiry attribute or delete the expired blobs. Any thoughts @dmanc ?

Why would we change the TTL? Also I don't like anything to do with assuming the metadata store doesn't have expired blobs. Blobs usually aren't deleted immediately when they get expired on DynamoDB for example, it goes through them at some rate.

"DynamoDB automatically deletes expired items within a few days of their expiration time, without consuming write throughput."

@ian-shim
Copy link
Contributor Author

Why would we change the TTL?

Unclear for now, but we may want to extend the TTL in the future. It's an assumption we have to keep in mind, so I'd rather handle it the right way.

Also I don't like anything to do with assuming the metadata store doesn't have expired blobs.

I think thats fair. Should we create an index on Expiry then?

@dmanc
Copy link
Contributor

dmanc commented Aug 13, 2024

Why would we change the TTL?

Unclear for now, but we may want to extend the TTL in the future. It's an assumption we have to keep in mind, so I'd rather handle it the right way.

Also I don't like anything to do with assuming the metadata store doesn't have expired blobs.

I think thats fair. Should we create an index on Expiry then?

Yeah let's create an index

Why would we change the TTL?

Unclear for now, but we may want to extend the TTL in the future. It's an assumption we have to keep in mind, so I'd rather handle it the right way.

Also I don't like anything to do with assuming the metadata store doesn't have expired blobs.

I think thats fair. Should we create an index on Expiry then?

Yeah I can add it tomorrow.

@dmanc
Copy link
Contributor

dmanc commented Sep 9, 2024

Deployed the index, should be unblocked now

Copy link
Contributor

@jianoaix jianoaix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There seems an issue where Dispersing blobs that are expired will remain stuck in Dispersing after this change

@@ -117,10 +124,14 @@ func (s *BlobMetadataStore) GetBlobMetadataByStatus(ctx context.Context, status
// Because this function scans the entire index, it should only be used for status with a limited number of items.
// It should only be used to filter "Processing" status. To support other status, a streaming version should be implemented.
func (s *BlobMetadataStore) GetBlobMetadataByStatusCount(ctx context.Context, status disperser.BlobStatus) (int32, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function name is a bit odd. It should probably be GetBlobCountByStatus or GetBlobMetadatCountByStatus.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah updated to GetBlobMetadatCountByStatus

":status": &types.AttributeValueMemberN{
Value: strconv.Itoa(int(status)),
},
":requestedAt": &types.AttributeValueMemberN{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

liveBlobBoundary or something? More readable than two requestedAts

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We now have an index on Expiry. Updated it to filter by expiry field instead

@ian-shim ian-shim force-pushed the dont-recover-expired-blobs branch from 655d703 to 99d3d85 Compare September 10, 2024 16:08
@ian-shim
Copy link
Contributor Author

There seems an issue where Dispersing blobs that are expired will remain stuck in Dispersing after this change

That's true. Is it a concern?

@ian-shim ian-shim force-pushed the dont-recover-expired-blobs branch 6 times, most recently from f29ba48 to 0622f30 Compare September 11, 2024 17:05
@jianoaix
Copy link
Contributor

There seems an issue where Dispersing blobs that are expired will remain stuck in Dispersing after this change

That's true. Is it a concern?

Not sure if there is unexpected access (and use) of these "zombie" blobs before they get removed, if so that can cause issue

@ian-shim ian-shim force-pushed the dont-recover-expired-blobs branch 2 times, most recently from e6d9a2e to 4bc1e4b Compare September 12, 2024 16:45
@ian-shim ian-shim force-pushed the dont-recover-expired-blobs branch from 4bc1e4b to d45c446 Compare September 30, 2024 05:48
@ian-shim
Copy link
Contributor Author

There seems an issue where Dispersing blobs that are expired will remain stuck in Dispersing after this change

That's true. Is it a concern?

Not sure if there is unexpected access (and use) of these "zombie" blobs before they get removed, if so that can cause issue

Added a check on Expiry and moves to Failed if it has passed. Wdyt?

@jianoaix
Copy link
Contributor

There seems an issue where Dispersing blobs that are expired will remain stuck in Dispersing after this change

That's true. Is it a concern?

Not sure if there is unexpected access (and use) of these "zombie" blobs before they get removed, if so that can cause issue

Added a check on Expiry and moves to Failed if it has passed. Wdyt?

Sg, thanks

@ian-shim ian-shim force-pushed the dont-recover-expired-blobs branch from d45c446 to 3e289ac Compare October 1, 2024 19:50
@ian-shim ian-shim merged commit 35e0b6b into Layr-Labs:master Oct 2, 2024
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants