Correctly identify the booting system with sentinel files #2226

Closed
Tracked by #1791
vipsharm opened this issue Feb 8, 2024 · 15 comments
Labels
bug Something isn't working regression uki

Comments

@vipsharm
Collaborator

vipsharm commented Feb 8, 2024

In Kairos we used to have sentinel files that indicate whether we are booting into, for instance, active, passive, recovery or livecd. This is no longer respected when booting with UKI files.

Boot mode shows as unknown both in /run/cos and in the output of the kairos-agent state get boot command.

Kairos Version: v3.0.0-alpha2

[kairos@ ~]$ ls -ltr /run/cos
total 12
-rwxr-xr-x 1 root root   1 Feb  8 00:29 unknown
-rwxr-xr-x 1 root root   1 Feb  8 00:29 uki_boot_mode
-rw-r--r-- 1 root root 534 Feb  8 00:29 cos-layout.env
[kairos@ ~]$ 

[kairos@ ~]$ kairos-agent state get boot; echo
unknown
[kairos@ ~]$ 
@vipsharm vipsharm added the bug Something isn't working label Feb 8, 2024
@mudler mudler removed their assignment Feb 8, 2024
@mudler mudler mentioned this issue Feb 8, 2024
53 tasks
@jimmykarily jimmykarily self-assigned this Feb 8, 2024
@jimmykarily
Contributor

@jimmykarily
Contributor

The UKI booted system has this cmdline:

[image: cmdline of the UKI booted system]

and the detection code above doesn't match this cmdline against any of the expected strings, hence the "unknown".
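To illustrate the failure mode, here is a minimal sketch of substring-based detection. It is not the kairos-sdk code: the function name, the `COS_ACTIVE`/`COS_PASSIVE`/`COS_RECOVERY` markers and the sample cmdline are assumptions for illustration. Only `live:LABEL`, `COS_OEM` and `install-mode` are taken from this thread.

```go
package main

import (
	"fmt"
	"strings"
)

// detectBootFromCmdline is a hypothetical stand-in for a detectBoot-style
// function. The marker strings are assumed, not copied from kairos-sdk.
func detectBootFromCmdline(cmdline string) string {
	switch {
	case strings.Contains(cmdline, "COS_ACTIVE"):
		return "active_boot"
	case strings.Contains(cmdline, "COS_PASSIVE"):
		return "passive_boot"
	case strings.Contains(cmdline, "COS_RECOVERY"):
		return "recovery_boot"
	case strings.Contains(cmdline, "live:LABEL"):
		return "livecd_boot"
	default:
		// Nothing matched: this is the case a UKI cmdline falls into.
		return "unknown"
	}
}

func main() {
	// Hypothetical UKI cmdline that only carries COS_OEM and install-mode.
	uki := "console=ttyS0 rd.immucore.uki COS_OEM install-mode"
	fmt.Println(detectBootFromCmdline(uki)) // prints "unknown"
}
```

Because none of the expected markers appear in a cmdline like this, the default branch is the only one that can fire.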

@jimmykarily
Contributor

@Itxaka is it the detectBoot that needs to change so that it detects "COS_OEM + install-mode" as uki_install_mode or is it the cmdline that needs to change to have live:LABEL or one of the others for LiveCD?

@Itxaka
Member

Itxaka commented Feb 8, 2024

> @Itxaka is it the detectBoot that needs to change so that it detects "COS_OEM + install-mode" as uki_install_mode or is it the cmdline that needs to change to have live:LABEL or one of the others for LiveCD?

detectBoot should be the right one to change. Currently the boot state is set by immucore on boot using the sdk, but with extra logic to detect which mode we are in. That logic should be moved into the sdk:

  • detect that we are in uki mode: https://github.com/kairos-io/immucore/blob/master/pkg/mount/dag_steps.go#L386
  • detect if we are on live media (cdrom, netboot, usb) or booting from an installed system: https://github.com/kairos-io/immucore/blob/master/internal/utils/common.go#L244

Ideally the change would go there. Otherwise immucore can be changed to simply not write the unknown sentinel, which is what is currently happening. They were separated from the start so as not to break anything that might rely on that sentinel: https://github.com/kairos-io/immucore/blob/master/pkg/mount/dag_steps.go#L375

@mudler
Member

mudler commented Feb 8, 2024

Just for reference: we need to detect if we are either booting into active, passive, livecd or recovery. Basically we need to map the standard Kairos states of the system, but also in UKI mode.

@mudler mudler changed the title Boot mode shows as unknown in /run/cos after the install in UKI mode Correctly identify the booting system with sentinel files Feb 8, 2024
@jimmykarily
Contributor

> Just for reference: we need to detect if we are either booting into active, passive, livecd or recovery. Basically we need to map the standard Kairos states of the system, but also in UKI mode.

You are right. And it relates to this: #2051 (comment) , right?
In other words, we should not rely on the cmdline to tell which state we are booted in. For boot mode we can keep using the cmdline (all uki files will come with a cmdline that says "uki mode").

I wonder if this story depends on #2051 then.
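As a rough sketch of keeping the cmdline for the boot mode only, the check below looks for a UKI flag as a whole token. The flag name `rd.immucore.uki` is the one mentioned later in this thread. The function name and the sample cmdlines are hypothetical, and this is not the immucore implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// isUKIBoot reports whether the cmdline carries the rd.immucore.uki flag.
// Whole tokens are compared so that a longer key sharing the same prefix
// does not count as a match.
func isUKIBoot(cmdline string) bool {
	for _, tok := range strings.Fields(cmdline) {
		// Accept both the bare flag and a key=value form.
		key, _, _ := strings.Cut(tok, "=")
		if key == "rd.immucore.uki" {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(isUKIBoot("console=ttyS0 rd.immucore.uki install-mode")) // true
	fmt.Println(isUKIBoot("console=ttyS0 root=LABEL=COS_ACTIVE"))        // false
}
```

This only answers "are we in uki mode"; which state (active, passive, recovery, livecd) we booted into has to come from somewhere else.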

@mudler
Member

mudler commented Feb 8, 2024

> > Just for reference: we need to detect if we are either booting into active, passive, livecd or recovery. Basically we need to map the standard Kairos states of the system, but also in UKI mode.
>
> You are right. And it relates to this: #2051 (comment) , right? In other words, we should not rely on the cmdline to tell which state we are booted in. For boot mode we can keep using the cmdline (all uki files will come with a cmdline that says "uki mode").
>
> I wonder if this story depends on #2051 then.

correct, maybe we should actually extract the state identification from the reset (to not overload the card)

@Itxaka
Member

Itxaka commented Feb 8, 2024

According to the discussion in the channel:

  • maybe we should drop active.efi, passive.efi and recovery.efi files in there (no versions? what about extra cmdlines? does this become a matrix of 3 states + extra cmdlines???)
  • maybe we should parse the EfiVar LoaderEntrySelected, which shows the conf file used, and identify the running mode by that (v3.0.0_recovery.efi for example)
  • then modify the upgrade to support this (so move active.efi into passive.efi, open the upgrade container, identify which one is the active if there is more than one efi file (HOW???), then copy that into active.efi)
  • uki mode still can be and should be identified by the rd.immucore.uki cmdline I guess? Or we can go more complex and use the LoaderEntrySelected to know if we are booting in efi mode from an efi file
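The LoaderEntrySelected idea could look roughly like the sketch below. The naming convention (entry name ending in `active`, `passive` or `recovery` before the extension) is an assumption extrapolated from the `v3.0.0_recovery.efi` example, and the function name and returned strings are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// bootStateFromEntry guesses the boot state from the name of the selected
// loader entry. It assumes the state is the last part of the file name.
func bootStateFromEntry(entry string) string {
	name := strings.ToLower(strings.TrimSpace(entry))
	// Drop the extension, whether the entry is a .conf or an .efi file.
	name = strings.TrimSuffix(strings.TrimSuffix(name, ".conf"), ".efi")
	switch {
	case strings.HasSuffix(name, "recovery"):
		return "recovery_boot"
	case strings.HasSuffix(name, "passive"):
		return "passive_boot"
	case strings.HasSuffix(name, "active"):
		return "active_boot"
	default:
		return "unknown"
	}
}

func main() {
	fmt.Println(bootStateFromEntry("v3.0.0_recovery.efi")) // recovery_boot
	fmt.Println(bootStateFromEntry("active.conf"))         // active_boot
}
```

This does not address the open questions above (versions, extra cmdlines, picking the active file during upgrade); it only shows the name-to-state mapping.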

From my point of view it is clear that 2 sentinels are needed, one for active/passive/recovery and one for uki, as otherwise there is not enough info for the default config files to do the proper thing. There are things that we may run in active non-uki that we cannot run in active-uki, so it makes sense to have more than 1 sentinel, otherwise a lot of issues are gonna arise (hence the need for a uki sentinel in the first place).

Maybe we just set the liveCD when booting from media and a uki sentinel for any uki boot?
So you can have combinations:

  • livecd + uki = uki boot from install media
  • active/passive/recovery + uki = uki boot from hdd
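A small sketch of how the two sentinels could combine. `uki_boot_mode` is the file name seen in the /run/cos listing at the top of this issue; using the state strings (`livecd_boot`, `active_boot`, ...) as sentinel file names is an assumption, and both function names are hypothetical.

```go
package main

import "fmt"

// sentinelsFor returns the sentinel names to write for a boot: one for the
// state and, on UKI boots, an extra one for the boot mode.
func sentinelsFor(state string, uki bool) []string {
	names := []string{state}
	if uki {
		names = append(names, "uki_boot_mode")
	}
	return names
}

// describe turns the combination of both sentinels into the situations
// listed above.
func describe(state string, uki bool) string {
	prefix := ""
	if uki {
		prefix = "uki "
	}
	switch state {
	case "livecd_boot":
		return prefix + "boot from install media"
	case "active_boot", "passive_boot", "recovery_boot":
		return prefix + "boot from hdd"
	default:
		return "unknown"
	}
}

func main() {
	fmt.Println(sentinelsFor("livecd_boot", true), "=", describe("livecd_boot", true))
	fmt.Println(sentinelsFor("active_boot", false), "=", describe("active_boot", false))
}
```

Keeping the state sentinel unchanged means anything that already keys off it would not need to know about uki, while uki-specific logic can check the second sentinel.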

@mudler
Member

mudler commented Feb 9, 2024

> According to the discussion in the channel:
>
> * maybe we should drop active.efi, passive.efi and recovery.efi files in there (no versions? what about extra cmdlines? does this become a matrix of 3 states + extra cmdlines???)
> * maybe we should parse the EfiVar LoaderEntrySelected, which shows the conf file used, and identify the running mode by that (v3.0.0_recovery.efi for example)
> * then modify the upgrade to support this (so move active.efi into passive.efi, open the upgrade container, identify which one is the active if there is more than one efi file (HOW???), then copy that into active.efi)
> * uki mode still can be and should be identified by the rd.immucore.uki cmdline I guess? Or we can go more complex and use the LoaderEntrySelected to know if we are booting in efi mode from an efi file
>
> From my point of view it is clear that 2 sentinels are needed, one for active/passive/recovery and one for uki, as otherwise there is not enough info for the default config files to do the proper thing. There are things that we may run in active non-uki that we cannot run in active-uki, so it makes sense to have more than 1 sentinel, otherwise a lot of issues are gonna arise (hence the need for a uki sentinel in the first place).
>
> Maybe we just set the liveCD when booting from media and a uki sentinel for any uki boot? So you can have combinations:
>
> * livecd + uki = uki boot from install media
> * active/passive/recovery + uki = uki boot from hdd

From an operational standpoint the UKI sentinel is needed, but we really need the sentinels that a Kairos GRUB-based system has, as many of the cloud config stages depend on them.

That means having a sentinel file for each of the states (active state sentinel, recovery mode sentinel, live cd sentinel, and so on).

@Itxaka
Member

Itxaka commented Feb 9, 2024

> > According to the discussion in the channel:
> >
> > * maybe we should drop active.efi, passive.efi and recovery.efi files in there (no versions? what about extra cmdlines? does this become a matrix of 3 states + extra cmdlines???)
> > * maybe we should parse the EfiVar LoaderEntrySelected, which shows the conf file used, and identify the running mode by that (v3.0.0_recovery.efi for example)
> > * then modify the upgrade to support this (so move active.efi into passive.efi, open the upgrade container, identify which one is the active if there is more than one efi file (HOW???), then copy that into active.efi)
> > * uki mode still can be and should be identified by the rd.immucore.uki cmdline I guess? Or we can go more complex and use the LoaderEntrySelected to know if we are booting in efi mode from an efi file
> >
> > From my point of view it is clear that 2 sentinels are needed, one for active/passive/recovery and one for uki, as otherwise there is not enough info for the default config files to do the proper thing. There are things that we may run in active non-uki that we cannot run in active-uki, so it makes sense to have more than 1 sentinel, otherwise a lot of issues are gonna arise (hence the need for a uki sentinel in the first place).
> > Maybe we just set the liveCD when booting from media and a uki sentinel for any uki boot? So you can have combinations:
> >
> > * livecd + uki = uki boot from install media
> > * active/passive/recovery + uki = uki boot from hdd
>
> From an operational standpoint the UKI sentinel is needed, but we really need the sentinels that a Kairos GRUB-based system has, as many of the cloud config stages depend on them.
>
> That means having a sentinel file for each of the states (active state sentinel, recovery mode sentinel, live cd sentinel, and so on).

yes, that was my point, "normal sentinel" + uki sentinel so we know we are on uki and with that combination we can know if we are on uki livecd or uki installed system

@jimmykarily
Contributor

jimmykarily commented Feb 9, 2024

WIP branch for kairos-sdk: 2226-detect-boot-state

@jimmykarily
Contributor

I've found out (the hard way) that:

  • the /sys/firmware/efi/efivars/LoaderEntrySelected-4a67b082-0a4c-41cf-b6c7-440b29bb8c4f file has non-printable characters in it (which had to be cleaned before comparing to active.conf). Seems to be coming from here: https://github.com/systemd/systemd/blob/63091fa38cf39edb39cb7f2cecd48f90ce36940f/src/boot/efi/boot.c#L2448 but I don't know why.
  • the /sys/firmware/efi/efivars/LoaderDevicePartUUID-4a67b082-0a4c-41cf-b6c7-440b29bb8c4f is not there when immucore checks if we are on livecd boot but appears later (I don't know why yet).
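A possible explanation for the non-printable characters, not confirmed in this thread: efivarfs files start with a 4-byte attribute header, and systemd-boot stores its string variables as NUL-terminated UTF-16LE. Under that assumption, decoding could be sketched as below instead of stripping bytes. The function name and the synthetic input are hypothetical, and this is not the code from the WIP branch.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"unicode/utf16"
)

// decodeEfiVarString decodes the raw content of an efivarfs file that holds
// a string. Assumed layout: 4 attribute bytes, then UTF-16LE code units,
// optionally terminated by a NUL code unit.
func decodeEfiVarString(raw []byte) (string, error) {
	if len(raw) < 4 {
		return "", errors.New("efivar data shorter than the attribute header")
	}
	payload := raw[4:]
	if len(payload)%2 != 0 {
		return "", errors.New("payload is not a whole number of UTF-16 units")
	}
	units := make([]uint16, 0, len(payload)/2)
	for i := 0; i < len(payload); i += 2 {
		u := binary.LittleEndian.Uint16(payload[i:])
		if u == 0 {
			break // stop at the NUL terminator
		}
		units = append(units, u)
	}
	return string(utf16.Decode(units)), nil
}

func main() {
	// Synthetic data: attribute header, "active.conf" in UTF-16LE, NUL.
	raw := []byte{0x07, 0x00, 0x00, 0x00}
	for _, r := range "active.conf" {
		raw = append(raw, byte(r), 0x00)
	}
	raw = append(raw, 0x00, 0x00)

	s, err := decodeEfiVarString(raw)
	fmt.Printf("%q %v\n", s, err) // prints "active.conf" <nil>
}
```

On a real system the input would come from reading the LoaderEntrySelected file above; the result can then be compared directly against names such as active.conf.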

@jimmykarily
Contributor

With the various PRs integrated in one image, it now correctly identifies live mode:

[root@fedora ~]# kairos-agent state get boot; echo
livecd_boot

active mode:

[kairos@fedora ~]$ kairos-agent state get boot; echo
active_boot

passive mode:

[kairos@fedora ~]$ kairos-agent state get boot; echo
passive_boot

and recovery:

[kairos@cos-recovery ~]$ kairos-agent state get boot; echo
recovery_boot

@jimmykarily
Contributor

All PRs now merged. Keeping this ticket open, until we bump immucore and kairos-agent on this repository.

@jimmykarily
Contributor

jimmykarily commented Feb 20, 2024

All merged and working. Closing this.

I built an image with:

~/workspace/kairos/enki (main)*$ CGO_ENABLED=false go build -o build/enki   && docker run --rm --entrypoint /bin/enki -v $PWD/build/enki:/bin/enki  -v $PWD/build:/result -v $PWD/e2e/assets/keys/:/keys enki-image build-uki quay.io/kairos/fedora:38-core-amd64-generic-v3.0.0-alpha3 --output-dir /result --keys /keys --output-type iso --boot-branding "DimitrisOS"

4 participants