Fix race condition in reboot flag test
PR projectatomic#368 added a check for a boot_id to confirm reboots in the reboot
role.  This caused a race condition in the rpm_ostree_install and
rpm_ostree_uninstall roles when using the reboot flag.

The reboot role has an option to skip performing the reboot itself and
only check that the system goes down and comes back up.  The
rpm_ostree_install and rpm_ostree_uninstall roles leveraged this option
to fire their respective commands with the -r flag.  The -r flag causes
a reboot to occur when the command executes, so the command has to be
run asynchronously (it won't return once the system goes down, which
would cause Ansible to fail).  When rpm_ostree_install or
rpm_ostree_uninstall is called with the -r flag, it then calls the
reboot role.  If the reboot role can grab the boot_id before the
reboot occurs, it will execute successfully.  If the system goes down
before the reboot role can grab the boot_id, it will fail.
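
As a rough illustration of the ordering problem described above (a
condensed, hypothetical sketch, not the actual role source; the task
names are made up for the example):

```yaml
# Hypothetical condensed view of the racy ordering; not the real role code.
- name: Layer package with the reboot flag (fire and forget)
  command: rpm-ostree install {{ g_pkg1 | quote }} -r
  async: 60
  poll: 0          # Ansible moves on immediately; the host reboots on its own schedule

# Race: the boot_id is only captured AFTER the reboot has been triggered.
# If the host is already going down, this task cannot run and the play fails.
- name: Get original bootid (what the reboot role does on entry)
  command: cat /proc/sys/kernel/random/boot_id
  register: orig_bootid
```

Capturing the boot_id before the asynchronous command is fired removes
the window in which the host can disappear first.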

This PR modifies the reboot_flag test to not use the reboot role.  It
copies most of the logic from the reboot role into the test itself.
Since this was a corner case for the -r flag, I felt like it was an
appropriate exception to not re-use the role.
mike-nguyen authored and Micah Abbott committed May 22, 2018
1 parent 3d83c27 commit 32e9af1
Showing 1 changed file with 67 additions and 10 deletions.
77 changes: 67 additions & 10 deletions tests/pkg-layering/reboot_flag.yml
@@ -1,24 +1,81 @@
 ---
 # set ft=ansible
 #
+- set_fact:
+    real_ansible_host: "{{ ansible_host }}"
+    timeout: "{{ cli_reboot_timeout | default('300') }}"

-- import_role:
-    name: rpm_ostree_install
-  vars:
-    roi_packages: "{{ g_pkg1 }}"
-    roi_reboot: true
+# Have to account for both because Fedora STR uses the old version of these
+# inventory values for some reason.
+- when: ansible_port is defined
+  set_fact:
+    real_ansible_port: "{{ ansible_port }}"
+
+- when: ansible_ssh_port is defined
+  set_fact:
+    real_ansible_port: "{{ ansible_ssh_port }}"
+
+- name: Get original bootid
+  command: cat /proc/sys/kernel/random/boot_id
+  register: orig_bootid
+
+- name: Package layer {{ g_pkg1 }} with reboot flag
+  command: rpm-ostree install {{ g_pkg1 | quote }} -r
+  async: 60
+  poll: 0
+  ignore_errors: true
+
+- name: wait for hosts to come back up
+  local_action:
+    wait_for host={{ real_ansible_host }}
+    port={{ real_ansible_port | default('22') }}
+    state=started
+    delay=30
+    timeout={{ timeout }}
+    search_regex="OpenSSH"
+  become: false
+
+# I'm not sure the retries are even necessary, but I'm keeping them in
+- name: Wait until bootid changes
+  command: cat /proc/sys/kernel/random/boot_id
+  register: new_bootid
+  until: new_bootid.stdout != orig_bootid.stdout
+  retries: 6
+  delay: 10

 - import_role:
     name: rpm_ostree_install_verify
   vars:
     roiv_package_name: "{{ g_pkg1 }}"
     roiv_binary_name: "{{ g_pkg1 }}"

-- import_role:
-    name: rpm_ostree_uninstall
-  vars:
-    rou_packages: "{{ g_pkg1 }}"
-    rou_reboot: true
+- name: Get original bootid
+  command: cat /proc/sys/kernel/random/boot_id
+  register: orig_bootid
+
+- name: Remove {{ g_pkg1 }} with reboot flag
+  command: rpm-ostree uninstall {{ g_pkg1 | quote }} -r
+  async: 60
+  poll: 0
+  ignore_errors: true
+
+- name: wait for hosts to come back up
+  local_action:
+    wait_for host={{ real_ansible_host }}
+    port={{ real_ansible_port | default('22') }}
+    state=started
+    delay=30
+    timeout={{ timeout }}
+    search_regex="OpenSSH"
+  become: false
+
+# I'm not sure the retries are even necessary, but I'm keeping them in
+- name: Wait until bootid changes
+  command: cat /proc/sys/kernel/random/boot_id
+  register: new_bootid
+  until: new_bootid.stdout != orig_bootid.stdout
+  retries: 6
+  delay: 10

 - import_role:
     name: rpm_ostree_uninstall_verify