DLPX-74859 Stack fails to come up for GCP engines because delphix-platform service failed #275

Merged (1 commit) on Apr 1, 2021
@@ -17,9 +17,11 @@

 ---
 #
-# Apply any changes made to the override instance config file. This
+# Apply any changes made to the instance config file. This
 # can only be done when we're on a running gcp instance.
 #
-- command: /usr/bin/google_instance_setup
+- systemd:
+    name: google-guest-agent
+    state: restarted
   listen: "gcp config changed"
   when: ansible_virtualization_type != "systemd-nspawn" and not ansible_is_chroot

@@ -301,13 +301,18 @@
   when: platform == "azure"

 #
-# Customize the GCP linux environment.
-#
-# Update the override file for the GCP instance. This file gets
-# applied dynamically by running google_instance_setup script.
+# Customize the GCP linux environment. We remove any old template
+# and configuration files if they exist.
 #
+- file:
+    path: "{{ item }}"
+    state: absent
+  with_items:
+    - /etc/default/instance_configs.cfg.template
+    - /etc/default/instance_configs.cfg
+
 - blockinfile:
-    path: /etc/default/instance_configs.cfg.template
Contributor

Do we need to undo any configuration that might have previously been applied to the .template file? For example, should we remove the file, or remove the block we may have added in an earlier version?

Contributor Author

I had a meeting with Google to discuss how the new packages changed behavior, and they mentioned that the new packages still maintain the template file, even though this is not documented. I decided it would be best to remove both the template file and the cfg file and recreate the cfg file as part of our Ansible logic.

+    path: /etc/default/instance_configs.cfg
     create: yes
     block: |
       #
@@ -328,14 +333,6 @@
     - platform == "gcp"
   notify: "gcp config changed"

-#
-# Make sure that the account daemon is always disabled. The override file
-# above should prevent this and this is designed to catch any corner cases.
-#
-- command: systemctl disable google-accounts-daemon.service
Contributor

Do we need to undo any configuration that might have previously been applied? E.g. re-enable the service (assuming its default state is to be enabled)?

Contributor Author

We don't need to do anything as part of upgrade. The packaging removes the old services but leaves them masked:

root@gwilson-gcp:/var/tmp# systemctl list-unit-files \*google\*
UNIT FILE                        STATE
google-accounts-daemon.service   masked
google-clock-skew-daemon.service masked
google-guest-agent.service       enabled
google-instance-setup.service    masked
google-network-daemon.service    masked
google-oslogin-cache.service     static
google-shutdown-scripts.service  enabled
google-startup-scripts.service   enabled
google-oslogin-cache.timer       enabled

If someone ever unmasks the service then it just gets removed.
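
For readers who want to confirm the state shown in the listing above on their own engine, the check could be scripted as an Ansible task along these lines. This is a minimal sketch and not part of this change: the registered variable name `gcp_accounts_daemon_state` is hypothetical, and the sketch assumes that `systemctl is-enabled` prints `masked` and exits non-zero for a masked unit, which is why the task does not treat the return code as a failure.

```yaml
# Hypothetical check, not part of this PR: report the unit-file state of the
# legacy accounts daemon without changing anything on the host.
- command: systemctl is-enabled google-accounts-daemon.service
  register: gcp_accounts_daemon_state   # hypothetical variable name
  changed_when: false                   # read-only query, never reports a change
  failed_when: false                    # is-enabled exits non-zero for "masked"
  when:
    - platform == "gcp"

# Print whatever state systemctl reported (for example "masked").
- debug:
    msg: "google-accounts-daemon.service is {{ gcp_accounts_daemon_state.stdout }}"
  when:
    - platform == "gcp"
```

Because the query is read-only, a check like this could run on every play without triggering the "gcp config changed" handler.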

-  when:
-    - platform == "gcp"
-
 #
 # We want the ssh service to start as early as possible during boot up,
 # so we explicitly remove all dependencies of the service here. We've