boot_counter not decrementing? #135

Closed
vkrizan opened this issue Mar 13, 2024 · 14 comments · Fixed by #146
Labels
enhancement (New feature or request), jira (flow issues to jira)

Comments

@vkrizan

vkrizan commented Mar 13, 2024

Hello,

I'm failing to understand how greenboot prevents boot loops and how it decrements the boot_counter on each failure.
The only place I've found where the variable is decremented is in the static GRUB config:

decrement boot_counter

This seems to depend on bootupd. However, with the RPM packages in the regular Fedora (dist-git) and CentOS repositories, neither the dependency on bootupd nor the greenboot.cfg GRUB file can be found.

Requires: bootupd

install -D -t %{buildroot}%{_prefix}/lib/bootupd/grub2-static/configs.d grub2/greenboot.cfg

The greenboot-grub2-set-counter script is only called once without a parameter...

How should this work outside of Fedora IoT? Or what am I missing?

Thank you.

@say-paul
Member

say-paul commented Mar 13, 2024

What version of greenboot are you using?

> This seems to depend on bootupd. However, with the RPM packages in the regular Fedora (dist-git) and CentOS repositories, neither the dependency on bootupd nor the greenboot.cfg GRUB file can be found.

Greenboot needs a new release, as version 0.15.4 does not have the bootupd changes.

@vkrizan
Author

vkrizan commented Mar 13, 2024

I checked what 0.15.4 includes, and the GRUB config seems to be part of it: https://github.com/fedora-iot/greenboot/releases/tag/v0.15.4

Even before PR #129, the project advertised and/or had the max boots limit, which confused me a bit more. What is the dependency on bootupd for?

@say-paul
Member

Previously, image builder used to install grub2/greenboot.conf; PR #129 moved that out of image builder and into greenboot itself. So my understanding is that a new release will solve the issue. I haven't tested with the patch yet, so I can run some tests and confirm.

@vkrizan
Author

vkrizan commented Mar 13, 2024

Thank you.

I've tried manually adding grub2/greenboot.conf to the right path and installing bootupd, but with no success. I'm not sure what else needs to be configured.

Do you happen to have a link to where image builder added that? Was that only for the Fedora IoT images?

Is my understanding correct that without bootupd the boot counting is not done (and bootupd is a dependency)? Was the counting (decrementing) moved to a GRUB config for convenience or some other advantage, rather than doing it in a shell script within the systemd targets?

@say-paul
Member

> Do you happen to have the link to where image builder added that? Was that only for the Fedora IoT images?

https://github.com/osbuild/osbuild/blob/b29aa5e6517e017f545de54819aa845fb026fd1e/stages/org.osbuild.grub2#L321
The stage then gets hooked up in the image builder pipeline, where greenboot is added as a default package.

@say-paul
Member

> Is my understanding correct that without bootupd the boot counting is not done (and bootupd is a dependency)? Was the counting (decrementing) moved to a grub config for convenience or some advantage, rather than counting it in the systemd targets within a shell script?

Historically, greenboot relied on GRUB to decrement boot_counter. bootupd came later, and I have not tested with it yet.
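
For readers following along, here is a rough shell model of what "GRUB decrements boot_counter" could look like across boot attempts. It is only a sketch: the function name `next_boot_counter` is hypothetical, and the conditions (count down only while boot_success is not 1 and the counter is positive) are assumptions for illustration, not a copy of greenboot.cfg. The thread itself only confirms that the static GRUB config contains `decrement boot_counter`.

```shell
# Hypothetical model of one boot attempt; NOT the contents of greenboot.cfg.
# Prints the boot_counter value after the attempt.
#   $1: current boot_counter (empty string means unset)
#   $2: current boot_success
next_boot_counter() {
    counter="$1"
    success="$2"
    if [ -z "$counter" ] || [ "$success" = "1" ]; then
        # Assumption: nothing is counted down when the counter is unset
        # or the previous boot was marked successful.
        printf '%s\n' "$counter"
    elif [ "$counter" -gt 0 ]; then
        # Assumption: each unsuccessful attempt uses up one retry.
        printf '%s\n' "$((counter - 1))"
    else
        # Retries exhausted: a fallback entry would be selected at this point.
        printf '%s\n' "$counter"
    fi
}
```

For example, `next_boot_counter 2 0` prints 1, while `next_boot_counter 2 1` leaves the value at 2.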

@vkrizan
Author

vkrizan commented Apr 2, 2024

Still no luck with the boot_counter variable. I've installed the current code base (332e5e3) into a container image and added a check that runs exit 1 to /etc/greenboot/check/required.d/. I'm not sure if greenboot.cfg is even loaded and run by GRUB.

Here's my Containerfile:

FROM quay.io/centos-bootc/centos-bootc-cloud:stream9

RUN rpm-ostree install \
    https://download.copr.fedorainfracloud.org/results/vkrizan/greenboot/fedora-40-x86_64/07246945-greenboot/greenboot-0.15.4-1.fc40.x86_64.rpm \
    https://download.copr.fedorainfracloud.org/results/vkrizan/greenboot/fedora-40-x86_64/07246945-greenboot/greenboot-default-health-checks-0.15.4-1.fc40.x86_64.rpm \
    && systemctl enable greenboot-grub2-set-counter \
        greenboot-grub2-set-success.service greenboot-healthcheck.service \
        greenboot-loading-message.service greenboot-rpm-ostree-grub2-check-fallback.service \
        redboot-auto-reboot.service redboot-task-runner.service redboot.target \
    && ostree container commit

# Add the bad check: grub2-editenv list && exit 1
COPY --chmod=755 bad_check.sh /etc/greenboot/check/required.d/

Note that I'm using bootc switch to switch between the good image (without bad_check.sh) and the bad image.
I've also set the counter (grub2-editenv - set boot_counter=2) and reset the boot success flag (grub2-editenv - set boot_success=0) before rebooting into the bad image.
After manually selecting the second boot entry and inspecting the journal of the previous failed boots, I can clearly see that the variables are not changing.
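
To make the before/after comparison less manual, a small helper along these lines could pull a single variable out of the `grub2-editenv list` output. This is a minimal sketch: the function name `grubenv_get` is hypothetical, and it assumes the output is one `name=value` pair per line.

```shell
# Hypothetical helper: print the value of one variable from
# `grub2-editenv list`-style input (name=value lines) read on stdin.
#   $1: variable name, e.g. boot_counter
grubenv_get() {
    # Split on "=", match the name exactly, and print everything after
    # the first "=" so values that contain "=" are kept intact.
    awk -F= -v key="$1" '$1 == key { print substr($0, length(key) + 2) }'
}
```

Usage on the booted system would be `grub2-editenv list | grubenv_get boot_counter`, run once before rebooting and once from the check script, so both values end up in the journal.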

@cgwalters would you happen to know what bootupd wizardry I am missing?

For clarity:

$ rpm -q bootupd
bootupd-202401222113.0.2.17.20.gc687978-1.el9.x86_64
$ rpm -ql greenboot | grep grub2-static
/usr/lib/bootupd/grub2-static/configs.d/greenboot.cfg

EDIT: The same applies to the base image quay.io/centos-bootc/fedora-bootc-cloud:eln

@cgwalters
Contributor

> RUN rpm-ostree install

(Unrelated, but is there any reason for this rather than RUN dnf install?)

> I'm not sure if the greenboot.cfg is even loaded and run by Grub.

Hmm, to verify, look at the final configuration in /boot/grub2/grub.cfg and see if it's being pulled in.
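
One way to script that check is sketched below. The function name `grub_cfg_mentions` is hypothetical, and searching for the string boot_counter is an assumption based on the `decrement boot_counter` line quoted at the top of this issue; a different marker may be needed if the snippet changes.

```shell
# Hypothetical helper: succeed if the generated GRUB config contains a pattern.
#   $1: fixed string to search for, e.g. boot_counter
#   $2: config path (defaults to /boot/grub2/grub.cfg)
grub_cfg_mentions() {
    pattern="$1"
    cfg="${2:-/boot/grub2/grub.cfg}"
    # -F: treat the pattern as a literal string; -q: exit status only.
    grep -F -q -- "$pattern" "$cfg"
}
```

For example: `grub_cfg_mentions boot_counter && echo "greenboot snippet present" || echo "greenboot snippet missing"`.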

@say-paul
Member

say-paul commented Apr 2, 2024

@vkrizan can you check the systemd status of all greenboot services and see if any error is reported there?
I think #136 needs to be resolved first.

@vkrizan
Author

vkrizan commented Apr 3, 2024

> Hmm, to verify look at the final configuration in /boot/grub2/grub.cfg and see if it's being pulled in.

I do not see it being included; see https://pastebin.com/JLVsy5hw. I do not know when the GRUB config is generated, and bootupd does not have much documentation.

> can you check systemd status of all greenboot services and see if any error is reported there,
> I think #136 needs to be resolved first.

The greenboot services are all green when using the good image: https://pastebin.com/7JiQSZN5

/boot is mounted rw:

$ mount | grep /boot
/dev/vda3 on /boot type ext4 (rw,relatime,seclabel)

Unless the systemd units have different conditions, this should not be an issue. And regardless of that, the boot_counter modifications are done by GRUB.

> (Unrelated but any reason why this versus RUN dnf install ?)

My mistake: as I saw the ostree commit being used, I stuck to ostree commands (I guess the commit is not needed then). Anyhow, I've changed it to use dnf, but with the recommended ostree container commit it fails with error: Found content in var, even after dnf clean all. It probably needs more cleanup, or no ostree commit.

@vkrizan
Author

vkrizan commented Apr 3, 2024

Is the bootupd configuration expected to be already injected by the initial image that the system is first booted from? Could that be the expectation that is broken here, with subsequent bootc update/switch having no impact on it? Note that I started with fedora-boot-cloud.qcow2 using bootc-playground.

@cgwalters
Contributor

Yes indeed, that's the reason; right now the bootloader state is not updated by bootc upgrade/switch.

Updating it is what bootupctl update does; however, even that does not currently update the static GRUB configs.

If you haven't I'd recommend trying https://gitlab.com/bootc-org/podman-bootc-cli which streamlines creating VMs directly from a container, without starting from an existing disk image.

@vkrizan
Author

vkrizan commented Apr 3, 2024

Thank you. I'll try that one out.

Is there a way to force bootupctl to update the static GRUB configs, or is manually editing grub.cfg the only option at the moment?

@vkrizan
Author

vkrizan commented Apr 4, 2024

@cgwalters podman-bootc run <imagename> helped with the bootloader: greenboot.cfg is included. However, the greenboot systemd units were all disabled, despite being enabled in the container image.

Once the greenboot units were enabled, the rollback (currently using ostree) went as expected. Hence I can conclude that the boot_counter was being decremented.
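
For anyone hitting the same thing, a quick way to spot disabled units is sketched below. The function name `report_not_enabled` is hypothetical; the unit names in the usage line are the ones from my Containerfile above.

```shell
# Hypothetical helper: print every given unit that is not reported as "enabled".
report_not_enabled() {
    for unit in "$@"; do
        # `systemctl is-enabled` prints the state and exits non-zero for
        # disabled or missing units, so the exit status is ignored here.
        state=$(systemctl is-enabled "$unit" 2>/dev/null) || true
        if [ "$state" != "enabled" ]; then
            printf '%s: %s\n' "$unit" "${state:-unknown}"
        fi
    done
}
```

Usage: `report_not_enabled greenboot-healthcheck.service greenboot-grub2-set-counter.service greenboot-grub2-set-success.service redboot-auto-reboot.service`. No output means all listed units are enabled.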

cc @say-paul

say-paul added the jira (flow issues to jira) and enhancement (New feature or request) labels on Jul 31, 2024