Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(deploy): allow the container to raise in MIGs #6893

Merged
merged 3 commits into from
Jun 9, 2023
Merged

Conversation

gustavovalverde
Copy link
Member

@gustavovalverde gustavovalverde commented Jun 9, 2023

Motivation

The Managed Instance Groups in GCP were not raising the container as those were expecting the device name and the correct RW/RO mode to match between the created disk and the disk we're mounting in the container.

This also fixes monitoring, and uses the same caching directory as in CI, as the Container Optimized Image does not allow mounting in non-standard linux paths for security reasons.

Fixes: #5644
Fixes: #6861

Specifications

See: https://cloud.google.com/sdk/gcloud/reference/compute/instance-templates/create-with-container#--container-mount-disk

Solution

  • Add the missing --boot-disk-size 300GB to the MIGs template, just as we have on CI and Single instance deploy
  • Allow user output and container stdin and tty to standardize with other approaches
  • Enable Google Monitoring for single and long-live instances (MIGs)
  • Use ZEBRA_CACHED_STATE_DIR with /var/cache/zebrad-cache as we do on CI, as the cos-cloud doesn't behave correctly with non-standard paths.

Review

Anyone from DevOps

Reviewer Checklist

  • Will the PR name make sense to users?
    • Does it need extra CHANGELOG info? (new features, breaking changes, large changes)
  • Are the PR labels correct?
  • Does the code do what the ticket and PR says?
    • Does it change concurrent code, unsafe code, or consensus rules?
  • How do you know it works? Does it have tests?

Follow Up Work

We should make further changes in a following PR:

  • For releases, it should sync from scratch (1). Currently it's 2-4 weeks between releases, so a 3 day mainnet sync is ok.

  • For main branch pushes, it should have a cached state (2 or 3, whatever is easier or more reliable). We merge new PRs multiple times a day, so a cached state is needed.

@gustavovalverde gustavovalverde requested a review from a team as a code owner June 9, 2023 04:00
@gustavovalverde gustavovalverde requested review from dconnolly and removed request for a team June 9, 2023 04:00
@github-actions github-actions bot added C-bug Category: This is a bug C-trivial Category: A trivial change that is not worth mentioning in the CHANGELOG labels Jun 9, 2023
@gustavovalverde gustavovalverde added A-infrastructure Area: Infrastructure changes A-devops Area: Pipelines, CI/CD and Dockerfiles P-Critical 🚑 labels Jun 9, 2023
@teor2345
Copy link
Contributor

teor2345 commented Jun 9, 2023

  • For releases, it should sync from scratch (1). Currently it's 2-4 weeks between releases, so a 3 day mainnet sync is ok.

These further changes could be optional. For testing, it's enough to just run a scheduled full sync in our test environment. To support the network, using a cached state will be better because the instances are live faster.

This no longer requires the env variable to be defined in other places, unless we're changing the default configuration
teor2345
teor2345 previously approved these changes Jun 9, 2023
Copy link
Contributor

@dconnolly dconnolly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving; interesting if mode=rw matters, docs say it is the default when unspecified for both --create-disk and --container-mount-disk

mergify bot added a commit that referenced this pull request Jun 9, 2023
@mergify mergify bot merged commit 17d36ff into main Jun 9, 2023
@mergify mergify bot deleted the fix-migs-logging branch June 9, 2023 19:16
dconnolly pushed a commit that referenced this pull request Jun 12, 2023
* fix(deploy): allow the container to raise in MIGs

* fix(docker): add the `ZEBRA_CACHED_STATE_DIR` as a default `ENV`

This no longer requires the env variable to be defined in other places, unless we're changing the default configuration
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-devops Area: Pipelines, CI/CD and Dockerfiles A-infrastructure Area: Infrastructure changes C-bug Category: This is a bug C-trivial Category: A trivial change that is not worth mentioning in the CHANGELOG
Projects
None yet
3 participants