Garbage collection policy for OS releases #99

Open
bgilbert opened this issue Dec 14, 2018 · 15 comments

@bgilbert
Contributor

bgilbert commented Dec 14, 2018

Each Fedora CoreOS release will produce several artifacts:

  • An ostree
  • Image artifacts for each platform
  • Cloud images, on those platforms where we upload image artifacts ourselves

It will also depend on artifacts produced elsewhere:

  • Binary packages
  • debuginfo packages (not included in an image, but needed for debugging)

We should have a clear garbage collection policy for each of these.

  1. Will koji delete packages included in previous FCOS releases?
  2. How long are ostrees retained for old releases?
  3. How long are old image artifacts retained?
  4. How long are old cloud images retained?

The answers may have consequences for release metadata (#98). I'm in favor of retaining everything forever, but there's obviously a storage cost.

Container Linux

The Container Linux GC policy is slightly inconsistent:

  • Binpkgs, image artifacts, and update payloads live forever
  • AWS and Azure images live forever
  • GCE images are eventually garbage-collected
@bgilbert bgilbert added meeting topics for meetings kind/design labels Dec 14, 2018
@jlebon
Member

jlebon commented Dec 14, 2018

@dustymabe Mind adding the Fedora Atomic Host status quo re. current GC policy for completeness? For OSTrees specifically, AFAIK currently they're all kept.

@bgilbert bgilbert removed the meeting topics for meetings label Dec 19, 2018
@dustymabe
Member

dustymabe commented Dec 19, 2018

> @dustymabe Mind adding the Fedora Atomic Host status quo re. current GC policy for completeness? For OSTrees specifically, AFAIK currently they're all kept.

In general, we currently keep all OSTrees. A while ago I worked on a process that would GC dev branches in our OSTree repo (which I think we should still do), but I haven't had time to revive that PR recently.

@dustymabe
Member

Discussed in the meeting today. We agree we should define a policy but we'd like to sync with Fedora releng before we finalize anything. Here is the current proposal we've come up with in the meeting:

  1. we will not make an effort to keep "development" artifacts for any extended period of time
  2. we will keep non "development" artifacts for X years (or Y months) - time based, length of time TBD
  3. OPTIONAL we may choose to have an artifacts server where we keep older relevant artifacts (this would be somewhere that doesn't eat up space in Fedora Infra/Releng)

We will engage with Fedora releng to see what the current policy is for Fedora Server/Workstation/Cloud and see how this differs from our proposal above.

@mohanboddu

mohanboddu commented Dec 19, 2018

> 1. we will not make an effort to keep "development" artifacts for any extended period of time

Nightly composes are currently removed after 2 weeks.

> 2. we will keep non "development" artifacts for X years (or Y months) - time based, length of time TBD

The current policy is to keep these until the release is EOL'd, which is about 13 months.

> 3. OPTIONAL we may choose to have an artifacts server where we keep older relevant artifacts (this would be somewhere that doesn't eat up space in Fedora Infra/Releng)

Currently we move EOL'd releases to /pub/archive/, which only a few mirrors carry.

NOTE: This differs for ostrees: they are kept forever and are not mirrored.

@bgilbert
Contributor Author

bgilbert commented Jan 3, 2019

@mohanboddu Thanks for the info. Does the above apply to both release-day artifacts and update RPMs, or are update RPMs currently garbage-collected sooner (e.g. when they're superseded by newer updates)?

FCOS releases will include update RPMs from Fedora repos (and will want access to others that aren't in the compose, e.g. debuginfo), and we should decide whether those RPMs will be preserved alongside the releases.

@bgilbert
Contributor Author

> The Container Linux GC policy is slightly inconsistent:
>
>   • GCE images are eventually garbage-collected

According to @crawford, GCE previously imposed a limit of 100 images per project. He thinks that is no longer the case, and the quotas and limits page makes no mention of it.

@jlebon jlebon added the meeting topics for meetings label Mar 4, 2020
@jlebon
Member

jlebon commented Mar 4, 2020

I think as a first approximation, we can at least settle on a GC policy for non-production streams. Let's say... 60 days?

The tricky part will be untagging things from the pool. I think we'll need to have some code that scans all the lockfiles we still care about and prunes away everything else.
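
The lockfile scan described above is essentially a mark-and-sweep pass. Here is a minimal sketch of that idea, not the actual tooling. It assumes each lockfile has already been parsed into a dict shaped like `{"packages": {name: {"evra": ...}}}` and that the pool's tagged packages are available as plain `name-evra` strings; both the shape and the function names are assumptions for illustration.

```python
from typing import Iterable, Mapping, Set


def referenced_packages(lockfiles: Iterable[Mapping]) -> Set[str]:
    """Mark phase: collect every package named by any lockfile we still care about."""
    keep = set()
    for lock in lockfiles:
        # Assumed lockfile shape: {"packages": {"bash": {"evra": "5.0-1.fc31.x86_64"}}}
        for name, entry in lock.get("packages", {}).items():
            keep.add(f"{name}-{entry['evra']}")
    return keep


def prunable(tagged: Iterable[str], lockfiles: Iterable[Mapping]) -> Set[str]:
    """Sweep phase: anything tagged in the pool but unreferenced is a candidate to untag."""
    return set(tagged) - referenced_packages(lockfiles)
```

The result is only a candidate list; the actual untagging would be a separate step against koji.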

@jlebon
Member

jlebon commented Mar 12, 2020

We discussed this in the community meeting yesterday. We are going to start with pruning non-production builds older than 60 days. Things to prune: cosa build artifacts (though we will keep meta.json), AMIs, OSTrees (this will be handled by the OSTree pruner), coreos-pool packages.
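
To make the rule concrete, here is a rough sketch of the selection step only: non-production builds older than 60 days, with meta.json kept. The `Build` record, the stream set, and the artifact names are hypothetical and are not the cosa data model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, List

# Assumption: these are the streams treated as "production".
PRODUCTION_STREAMS = {"stable", "testing", "next"}
MAX_AGE = timedelta(days=60)


@dataclass
class Build:
    build_id: str
    stream: str
    created: datetime
    artifacts: List[str]


def artifacts_to_prune(builds: Iterable[Build], now: datetime) -> List[str]:
    """Return artifact paths of old non-production builds, always keeping meta.json."""
    doomed = []
    for build in builds:
        if build.stream in PRODUCTION_STREAMS:
            continue  # production builds are out of scope for this policy
        if now - build.created <= MAX_AGE:
            continue  # still inside the 60-day window
        doomed.extend(
            f"{build.build_id}/{name}"
            for name in build.artifacts
            if name != "meta.json"
        )
    return doomed
```

AMIs, OSTrees, and coreos-pool packages would need their own handling; this only covers the build-artifact part.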

@dustymabe dustymabe removed the meeting topics for meetings label Mar 18, 2020
@jlebon
Member

jlebon commented May 28, 2020

> Things to prune: ..., coreos-pool packages.

See coreos/fedora-coreos-config#432, which describes an upper bound. I think our policy should be more aggressive than that, though.

@dustymabe
Member

dustymabe commented Mar 8, 2022

We need to create a new tool to prune these things for non-production builds:

  • cosa build artifacts (s3 buckets)
  • AWS AMIs
  • GCP Images
  • FCOS containers in quay.io
  • anything else?

The fedora-ostree-pruner should be able to take care of the OSTree commits in the OSTree repo, so we don't need a tool to handle that piece.

We already GC the RPMs in the coreos-pool koji tag once every 6 months or so as part of our Fedora major version rebase, so they don't need any special handling right now.

@dustymabe dustymabe added the jira for syncing to jira label Jun 6, 2022
@gursewak1997
Member

We already have a pruning script, which is currently used by RHCOS.
I propose modifying that script and running it (with the relevant args) as a new fedora-coreos-pipeline job that would run, say, once a week. The script would prune the artifacts, AMIs, and GCP images of the relevant non-production builds that were added/built more than x days/months ago.

cc: @jlebon @vrutkovs

@jlebon
Member

jlebon commented Jul 22, 2022

> We already have a pruning script, which is currently used by RHCOS. I propose modifying that script and running it (with the relevant args) as a new fedora-coreos-pipeline job that would run, say, once a week. The script would prune the artifacts, AMIs, and GCP images of the relevant non-production builds that were added/built more than x days/months ago.

Heh, I actually forgot we had cosa remote-prune. Yes, that sounds great to me at a high level. When we roll it out, we can start by running it in dry-run mode only, to sanity-check what it does.
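
For illustration, the dry-run idea could look something like the sketch below. This is not how cosa remote-prune is implemented; `prune` and the `delete` callback are hypothetical names, and the point is only that the planned deletions are computed and reported the same way whether or not anything is actually removed.

```python
from typing import Callable, Iterable, List


def prune(targets: Iterable[str],
          delete: Callable[[str], None],
          dry_run: bool = True) -> List[str]:
    """Report (and optionally perform) deletions; defaults to the safe dry-run mode."""
    planned = sorted(targets)
    for target in planned:
        if dry_run:
            # Only log what would happen, so the plan can be reviewed first.
            print(f"[dry-run] would delete {target}")
        else:
            delete(target)
    return planned
```

Defaulting `dry_run` to `True` means a destructive run has to be requested explicitly.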

@dustymabe
Member

@jlebon
Member

jlebon commented Sep 21, 2023

On the cosa side, there's an issue tracking adding support for more cloud pruning in cosa remote-prune: coreos/coreos-assembler#889. I think if we invest in this (we should), we need to get in the habit of adding prune support at the same time we add support for uploading to a cloud in cosa.

@dustymabe dustymabe added the meeting topics for meetings label Feb 14, 2024
@jlebon jlebon removed the meeting topics for meetings label Apr 10, 2024
@gursewak1997
Member

We’ve implemented garbage collection for cloud uploads (AWS AMIs, snapshots, and GCP images) and S3 images. It can keep required images, and it can also prune entire builds and their related resources from S3 directories. Currently, pruning is run manually via the GC job.
The plan is to automate this as part of the release process once we’re confident it runs smoothly.
PR for automation: coreos/fedora-coreos-pipeline#1019.


5 participants