Estuary Cleaner #420

Open
alvin-reyes opened this issue Aug 27, 2022 · 0 comments
Labels: infra-devops (A task related to infrastructure and/or devops), New Feature (Issues that we will work on with people or ourselves)

alvin-reyes commented Aug 27, 2022

Request

There are a lot of stale records in the Estuary database, and we need a way to clean them up.

Proposal

Create a script that cleans up the Estuary database using the following criteria:

1 - get all ContentIDs with an Unwalkable CID
2 - get all CIDs that cannot be provided
3 - get all CIDs that failed with "context deadline exceeded"
4 - get all CIDs that failed with "blockstore: block not found"
5 - create a failure_log table to log all failed CIDs. We can use this information to investigate each CID.
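
A minimal sketch of how recorded error messages could be mapped onto the criteria above. The labels, the `CRITERIA` table, and `classify_failure` are hypothetical names used for illustration, and the substrings only mirror the wording in this issue; the messages Estuary actually stores may differ.

```python
from typing import Optional

# Hypothetical mapping from a criterion label to the substring searched for
# in a recorded error message. The substrings follow the wording of this
# issue and are an assumption, not a list taken from the Estuary code base.
CRITERIA = {
    "unwalkable_cid": "unwalkable",
    "cannot_provide": "cannot be provided",
    "deadline_exceeded": "deadline exceeded",
    "block_not_found": "blockstore: block not found",
}


def classify_failure(error_message: str) -> Optional[str]:
    """Return the first matching criterion label, or None if nothing matches."""
    lowered = error_message.lower()
    for label, needle in CRITERIA.items():
        if needle in lowered:
            return label
    return None
```

Matching on substrings keeps the sketch tolerant of prefixes or wrapped errors around the core message.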

For each of the criteria above, write the content of the matching records to a file, then delete those records.
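
The log-then-delete step could look roughly like the sketch below. It uses Python's built-in `sqlite3` purely so the example is self-contained; the `contents` table, its `id` and `cid` columns, the `failure_log` columns, and the function name `purge_stale` are all assumptions rather than the real Estuary schema.

```python
import json
import sqlite3
from datetime import datetime, timezone


def purge_stale(conn: sqlite3.Connection, stale, log_path: str) -> int:
    """Log each stale row to a file and to failure_log, then delete it.

    `stale` is a list of (content_id, reason) pairs. The `contents` table and
    every column name here are hypothetical stand-ins for the real schema.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS failure_log ("
        "content_id INTEGER, cid TEXT, reason TEXT, logged_at TEXT)"
    )
    removed = 0
    # `conn` as a context manager commits on success and rolls back on error.
    with open(log_path, "a", encoding="utf-8") as log_file, conn:
        for content_id, reason in stale:
            row = conn.execute(
                "SELECT id, cid FROM contents WHERE id = ?", (content_id,)
            ).fetchone()
            if row is None:
                continue  # already gone, nothing to log or delete
            logged_at = datetime.now(timezone.utc).isoformat()
            entry = {"content_id": row[0], "cid": row[1],
                     "reason": reason, "logged_at": logged_at}
            # Write the file entry first so no row is deleted without a trace.
            log_file.write(json.dumps(entry) + "\n")
            conn.execute(
                "INSERT INTO failure_log (content_id, cid, reason, logged_at) "
                "VALUES (?, ?, ?, ?)",
                (row[0], row[1], reason, logged_at),
            )
            conn.execute("DELETE FROM contents WHERE id = ?", (content_id,))
            removed += 1
    return removed
```

The function returns the number of rows removed, which a caller could report after each run. Note that the file is not rolled back if the database transaction fails, so the file may list rows that still exist.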

The script should run periodically (weekly) so that stale data is purged and not re-processed unnecessarily.
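
If the script is triggered more often than weekly (for example by a generic scheduler), a small guard like the one below could enforce the weekly cadence. This is only a sketch: `RUN_INTERVAL` and `is_run_due` are hypothetical names, and where the last-run timestamp is stored is left open.

```python
from datetime import datetime, timedelta
from typing import Optional

RUN_INTERVAL = timedelta(weeks=1)  # weekly, as proposed in this issue


def is_run_due(last_run: Optional[datetime], now: datetime) -> bool:
    """Return True if the cleaner has never run or a full interval has elapsed."""
    if last_run is None:
        return True
    return now - last_run >= RUN_INTERVAL
```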

@alvin-reyes alvin-reyes added the New Feature Issues that we will work on with people or ourselves label Aug 27, 2022
@alvin-reyes alvin-reyes self-assigned this Aug 27, 2022
@alvin-reyes alvin-reyes added the infra-devops A task related to infrastructure and/or devops label Aug 27, 2022