-
Notifications
You must be signed in to change notification settings - Fork 818
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add scripts for backups #377
Conversation
After this gets reviewed + merged, I plan on adding Prow jobs in k8s/test-infra repo to make use of these scripts at regular intervals (the more frequent, the better). I could also port the "record_consistency.sh" logic into the promoter itself it seems like a conflict of interest, because the whole point of this PR is to add secondary checks to make sure the promoter is behaving properly. |
@thockin This is much simpler than the earlier PR (1 script instead of 3). PTAL |
This is blocked by #376 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be under infra/gcp ?
We have a turtle-at-the-bottom problem. We shift the focus to gcrane now - how can we build trust in that? I'd suggest we lock to a specific SHA256.
infra/backup_tools/README.md
Outdated
# Backup tools | ||
|
||
The tools in this folder are designed to backup the | ||
{asia,eu,us}.gcr.io/k8s-artifacts-prod GCRs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If all 3 are supposed to be identical, do we need backups of all 3? Or is it just easier that way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only did it for symmetry (knee-jerk reaction).
Philosophically if I choose 1 region to back up out of the 3, then that 1 region becomes special and I don't like this.
infra/backup_tools/backup.sh
Outdated
|
||
# USAGE NOTES | ||
# | ||
# Backs up prod registries. This is a thin orchestrator as all the heavy lifting |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
comment implies copy_with_date.sh
and record_consistency.sh
are separate files, but no such file exists.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, will fix.
infra/backup_tools/backup.sh
Outdated
|
||
# Copy each region to its backup. | ||
for prod_repo in "${prod_repos[@]}"; do | ||
copy_with_date "${prod_repo}" "${prod_repo}-backup" "${sa_key_path}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NB I changed "-backup" to "-bak" in latest push of the ensure script (because we overflowed allowable project names :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK
infra/backup_tools/backup.sh
Outdated
docker run \ | ||
-v "${sa_key_path}":/auth.json \ | ||
--env GOOGLE_APPLICATION_CREDENTIALS=/auth.json \ | ||
gcr.io/go-containerregistry/gcrane \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need a particular tag? Assuming :latest is kind of scary...
What protects this backup tool from being compromised?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need a particular tag? Assuming :latest is kind of scary...
Yes, I should have referenced a sha256 digest here. Will change to not use :latest
.
What protects this backup tool from being compromised?
Good question. For now I think the sha256 is the only guarantee. If we need to update we would have to run some e2e test that ensures that the new version doesn't change behavior. Short of that, at least manual testing of some sort.
Also, if pinning to a SHA, comment why :) |
c4ab926
to
fd0bb3e
Compare
I removed the unnecessary call to gcloud (activating creds) because they are only used inside the gcrane container. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
/lgtm
/approve
# folders more easily traversable from a human perspective. | ||
timestamp="$(date -u +"%Y/%m/%d/%H")" | ||
|
||
source_gcr_repo="${1}" # "us.gcr.io/k8s-artifacts-prod" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you should probably check $#
and abort if the args are the wrong number, ok for followup
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
This is a rough cut of some tooling for basic backups from prod to a backup. The `backup.sh` script will run from a periodic Prow job every hour. This script expects the "k8s-artifacts-prod-backup" GCR to exist, as well as a service account that has write access to this backup location.
/assign @cblecker |
|
||
# Perform backup by copying all images recursively over. | ||
docker run \ | ||
-v "${sa_key_path}":/auth.json \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will it be a good idea to use buildkit here @listx
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't heard of that. Care to elaborate? :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please ignore me. all looks good.
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: cblecker, thockin The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
This is a rough cut of some tooling for basic backups from prod to a
backup.
/assign @thockin @justinsb