Disk space low on jenkins controller's jobs directory
#3285
GetNode seems to call the buildenv/jenkins/getDependency script (for which we have a separate job). I have modified it to only keep 10 days of history, but I suspect it could be completely deleted; I'll defer to Sophia's response.
@sxa definitely not needed at this time; it can be deleted.
Space on this file system ran out during the January release cycle, impacting the mac builds. Ref:
re: #3285 (comment) - this may also explain the issues seen during the release where TRSS was slow to get and display info from this Jenkins instance
There are 651 individual jobs which have
Will do a breakdown of how much space is in use by each of them so they can be reviewed appropriately. As an example, the first one on the list was https://ci.adoptium.net/job/Test_openjdk11_j9_sanity.openjdk_aarch64_linux_xl/35/ from 2020, which I feel is no longer required (it's taking up 3Gb on its own). The following shows how much space the pipeline jobs take up when being retained. Note that we have a number of these being retained on the
This only includes the top-level pipelines, not the individual build jobs which are
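For reference, a minimal sketch of the kind of per-job breakdown described above, run on the controller. The paths are assumptions and this is not the exact command used in the thread:

# Disk usage of each job's builds directory, including jobs nested inside folders
# such as build-scripts/jobs/*, sorted largest first (sizes in KB).
cd "${JENKINS_HOME:-/var/lib/jenkins}/jobs" || exit 1
find . -type d -name builds -prune -print | while read -r b; do
  printf '%s\t%s\n' "$(du -sk "$b" | cut -f1)" "${b%/builds}"
done | sort -rn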
https://ci.adoptium.net/job/dependency_pipeline/ does not have "Discard old builds" set, so all 1072 builds are storing artifacts. @sophia-guo can we just keep the last successful build?
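A hedged aside: one way to spot other jobs with no "Discard old builds" policy at all is to look for a LogRotator entry in each job's config.xml. The on-disk layout below is an assumption about the standard Jenkins format, not something verified against this controller:

# List top-level jobs whose config.xml contains no LogRotator-based build discarder.
# Jobs nested inside folders (e.g. build-scripts) are not descended into here.
cd "${JENKINS_HOME:-/var/lib/jenkins}/jobs" || exit 1
for cfg in */config.xml; do
  grep -q "LogRotator" "$cfg" || echo "no discard policy: ${cfg%/config.xml}"
done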
I'd suggest we delete the out-of-support pipeline builds:
I'd also suggest we could delete the old build jobs from here: https://ci.adoptium.net/job/build-scripts/job/jobs/
Removing the following, which appear to be hangovers from 2021 and are "workspace"-style directories of transient information which doesn't need to be there:
I'm creating a backup to keep for a week before deleting them, but this is about 80Gb in total.
Running this script shows disk space usage increased by c.100Gb in a single day; I suspect this is a good example of our burn rate.
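The script referred to above isn't reproduced in the thread; purely as an illustration, a daily snapshot-and-diff along these lines would give a burn-rate figure. The snapshot directory and schedule are made up for the sketch:

# Take a daily per-job usage snapshot and compare it with the previous day's.
# Intended to be run from cron on the controller; uses GNU date for "yesterday".
SNAPDIR=/var/tmp/jenkins-du                       # hypothetical snapshot location
TODAY="$SNAPDIR/du-$(date +%F).txt"
YESTERDAY="$SNAPDIR/du-$(date -d yesterday +%F).txt"
mkdir -p "$SNAPDIR"
du -sk "${JENKINS_HOME:-/var/lib/jenkins}/jobs"/* > "$TODAY"
# Print only the lines that changed since yesterday, if a previous snapshot exists.
[ -f "$YESTERDAY" ] && diff "$YESTERDAY" "$TODAY" | grep '^[<>]'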
To be clear, the disk usage didn't really increase by 100Gb/day - that's the size of new things that have been generated. The Jenkins "discard old builds" option means that older ones will be getting deleted alongside these new ones.
👍🏻 on this - we've got an issue with the eclipse-mirror job at the moment, and once that is resolved and I've mirrored the latest 18 and 19 releases I'll look at clearing those out from Jenkins.
I've pushed out the jdk-18.0.2.1+1 and jdk-19.0.2+7 builds to the Eclipse mirror, so we have those retained properly. Deep dive into openjdk18-pipeline, which is using 29Gb in total:
openjdk18-pipeline (29Gb)
openjdk19-pipeline (32Gb)
openjdk20-pipeline (19Gb)
openjdk17-pipeline (42Gb)
Earlier pipelines exist, but not with artifacts.
openjdk11-pipeline (206Gb)
Generated from:
du -sk build-scripts/jobs/openjdk11-pipeline/builds/* | awk '$1>1000000' | while read A B; do NUM=`basename $B`; echo $NUM \| $A \| `grep "" build-scripts/jobs/openjdk11-pipeline/builds/$NUM/build.xml | sed -e "s/^[^>]*>//" -e "s/<.*$//"`; done | sort -n
Earlier pipelines exist, but not with artifacts.
openjdk8-pipeline (122Gb)
release-openjdk8-pipeline (36Gb)
release-openjdk11-pipeline (80Gb)
release-openjdk17-pipeline (77Gb)
release-openjdk19-pipeline (5.6Gb)
release-openjdk20-pipeline (26Gb)
release-openjdk21-pipeline (19Gb)
openjdk8-pipeline cleared. I have only removed the binary artifacts (msi, zip, tar.gz) with the following command:
Notes:
The amount of space used for
This is the log of deleted artifacts (includes the original and
For subsequent runs, this would be the correct command to use, adjusted to replace
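The command itself isn't reproduced above, so the following is only a sketch of the general approach - find the archived binary artifacts, review the list, and only then delete. The pipeline path and archive layout are assumptions:

# Dry run: list binary artifacts (msi, zip, tar.gz) stored under a pipeline's builds directory.
PIPELINE="${JENKINS_HOME:-/var/lib/jenkins}/jobs/build-scripts/jobs/openjdk8-pipeline"   # assumed path
find "$PIPELINE/builds" -type f \( -name '*.msi' -o -name '*.zip' -o -name '*.tar.gz' \) -print
# After reviewing the output (and excluding anything that must be kept, e.g. "keep forever"
# release builds), append -delete to the find command to actually remove the files.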
Likely as a result of the recent problems with low disk space on this file system, some files are truncated, e.g.
Pruned the following:
Those have given us 396Gb free on the file system.
That means we're now up to 550Gb free. I'm going to stop now, although none of the release-openjdkXX-pipeline ones have been cleared. Space in use by each pipeline is now:
I've moved some of my openjdk21-pipeline retained jobs out of the way, which has brought it up to 584Gb, so that pipeline is only taking the space for the recently retained builds now, plus build 216, which was an early-access s390x one - now 17Gb in total, so consistent with the others. We have 589Gb free now. I'm going to leave the release ones just now, as I feel it's a bit safer to manually "un-keep" those. Noting that yesterday we also cleared out (I think) over 1000
Here are the current biggest space hog locations after performing the above cleanup:
While there could be more cleanup work here (maybe some of the individual platform builds have things retained unnecessarily), I'll consider this complete for now. I'll raise a separate issue for dependency_pipeline (noting that we have an issue for sorting out aspects of build.getDependency, although the space usage of that seems to have been mitigated).
Disk was running at around 99% space used yesterday (it's a 2Tb file system), as alerted by Nagios.
Top 20 disk hogs are as follows:
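A minimal sketch of one way to produce such a list, assuming it is run from the controller's jobs directory; the real report may have been generated differently:

# Top 20 largest job directories, largest first, sizes converted from KB to GB.
cd "${JENKINS_HOME:-/var/lib/jenkins}/jobs" || exit 1
du -sk -- */ | sort -rn | head -20 | awk '{printf "%7.1f GB  %s\n", $1/1024/1024, $2}'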
I'm not sure what GetNode is for, but it seems to have artifacts for jtreg etc., similar to dependency_pipeline. It also hasn't run successfully for a while - the last good run seems to have been 18 months ago. @sophia-guo Since GetNode seems to have been most recently edited by you: do you know what it's for, is it still required, and does it need to keep so many artifacts?
@steelhead31 Is your Linux packaging job still needed?
@andrew-m-leonard I think we need to consider what to do with all the release pipelines. At the moment we keep them all locked, but that's going to continually increase the disk space requirements here. We've now switched to the release-openjdkXX-pipeline jobs, but some of the older releases from before the release- jobs came about are "locked" jobs in the openjdkXX-pipeline ones. We should probably also look at whether there are things unnecessarily locked in build-scripts/jobs.
The openjdk_build_docker_multiarch job is keeping the container images for old releases in the adoptopenjdk dockerhub account up to date, but it's keeping the logs indefinitely at about 40Mb/day, and it's at close to 1000 runs, hence the 36Gb of space in use. We should likely stop it retaining more than, say, a month's worth of logs to reclaim that space.
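The proper fix is the job's own "Discard old builds" setting; purely to size the problem, a read-only check along these lines would show how much of that history is more than a month old. The job path and the use of directory mtimes are assumptions:

# Sum the size of openjdk_build_docker_multiarch build records older than ~30 days.
# Read-only: retention itself should be handled by the job's build discarder.
JOB="${JENKINS_HOME:-/var/lib/jenkins}/jobs/openjdk_build_docker_multiarch/builds"   # assumed location
find "$JOB" -maxdepth 1 -mindepth 1 -type d -mtime +30 -exec du -sk {} + |
  awk '{s+=$1} END {printf "%.1f GB in builds older than 30 days\n", s/1024/1024}'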