Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

05core: add coreos-network-initramfs-etc.service #1374

Closed
wants to merge 3 commits into from

Conversation

jlebon
Copy link
Member

@jlebon jlebon commented Dec 15, 2021

A papercut right now is that when using Tang, one must use kernel
arguments to specify network configuration because network is required
on every boot.

This breaks the UX we're trying to emphasize, which is towards
specifying networking settings using NM keyfiles once and having it
apply everywhere.

Let's fix this by automatically detecting this case and enabling
initramfs-etc tracking of NM keyfiles on first boot. This will allow
users to stay on the NM keyfile path and have it transparently work even
when using Tang.

The new service does this from the real root. Long-term, I'd like to
investigate doing it earlier in the initramfs (this is related to
coreos/rpm-ostree#3006).

Modifications to NM keyfiles in the real root are automatically synced
on every new deployment, or explicitly when using
rpm-ostree ex initramfs-etc --force-sync.

If we propagated the initramfs networking config to the real root, then
create the `/run/coreos/network-propagated` stamp file. This can be used
by real root code to determine whether the network configuration found
in `/etc/NetworkManager/system-connections` came from the initramfs or
not.
This can be used by real root code to determine if network configuration
files were copied from either `/boot` or `/etc` into the initrd `/run`.
A papercut right now is that when using Tang, one must use kernel
arguments to specify network configuration because network is required
on every boot.

This breaks the UX we're trying to emphasize, which is towards
specifying networking settings using NM keyfiles *once* and having it
apply everywhere.

Let's fix this by automatically detecting this case and enabling
`initramfs-etc` tracking of NM keyfiles on first boot. This will allow
users to stay on the NM keyfile path and have it transparently work even
when using Tang.

The new service does this from the real root. Long-term, I'd like to
investigate doing it earlier in the initramfs (this is related to
coreos/rpm-ostree#3006).

Modifications to NM keyfiles in the real root are automatically synced
on every new deployment, or explicitly when using
`rpm-ostree ex initramfs-etc --force-sync`.
@jlebon
Copy link
Member Author

jlebon commented Dec 15, 2021

Note this is softer than the proposal in coreos/fedora-coreos-tracker#886, which is asking for doing this even if the network configuration comes from kernel arguments. The problem with that though is that we need to be able to distinguish between first boot kernel arguments (i.e. the auto-forwarding magic in coreos-installer.service) and persistent kernel arguments (i.e. --append-karg). Because in the latter case, I think it'd be too confusing to have both kernel arguments and NM keyfiles in the initramfs.

So this is optimized for the NM keyfile path, which is what we want to emphasize.

Copy link
Member

@dustymabe dustymabe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've given this an initial run through. Obviously I want to think critically about any additions to our the networking "house of cards" we've built so give me some time to see if I think of anything else. For now here are some questions/comments.

Comment on lines +111 to +112
mkdir -p /run/coreos
touch /run/coreos/network-propagated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we keep this I think we should add the context from the commit message in a comment

Comment on lines +25 to +26
mkdir -p /run/coreos
echo "${src}" > /run/coreos/firstboot-network-src
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we keep this I think we should add the context from the commit message in a comment

Comment on lines +19 to +21
if [[ $(karg rd.neednet) == 1 ]]; then
rpm-ostree ex initramfs-etc --track /etc/NetworkManager/system-connections
fi
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like where this gets ugly and we need to be a lot more careful about enabling it. For example, what if someone has rd.neednet=1 and also ip=xyz? Now our initramfs /etc/NetworkManager/system-connections connections are battling against the ones created by nm-initrd-generator. Also, what if there are no files in /etc/NetworkManager/system-connections because the system just uses DHCP? Or what if someone haphazardly added --append-kargs=rd.neednet and isn't using tang?

Also I think rd.neednet and rd.neednet=1 are equivalent.

Sorry I just have a lot of open questions right now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the 2nd question is taken care of by ConditionDirectoryNotEmpty=/etc/NetworkManager/system-connections/.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the answer to the third question is "then they want networking in the initrd, and they'll get it".

For the first question, if they didn't supply firstboot NM keyfiles, the unit conditions will ensure we'll never get here. If they did, and they specified ip=, ...do we need to handle that case? Or can we just say "don't do that"?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example, what if someone has rd.neednet=1 and also ip=xyz? Now our initramfs /etc/NetworkManager/system-connections connections are battling against the ones created by nm-initrd-generator.

Yeah, indeed. But worth noting that any network kargs will not have taken effect on first boot at least because our current stance is that firstboot NM keyfiles win. The only use case I can think of where they'd use both network kargs and NM keyfiles is if they want the same configuration between the initrd and the real root on first boot, but not on subsequent boots... which would be weird. I guess one example is they add a NIC after install and they want to e.g. make sure that it stays off in the initrd? Anyway, high-level, yes if the kargs describe a configuration for devices that overlap with the /etc configuration, then there could be conflicts. I'm not sure if NM will let the /run one win, but I feel OK just documenting that you should stop tracking NM keyfiles if you fall in that category.

Or what if someone haphazardly added --append-kargs=rd.neednet and isn't using tang?

Essentially, what this is proposing is setting a new default behaviour (which purposely isn't specific to Tang): if the system needs networking in the initramfs, and you specified NM keyfiles, then we assume that you want that configuration to apply on not just the first boot, but every boot. It seems like a reasonable assumption (and IMO less surprising than not doing it). You can disable this default by switching only to kargs, or disabling the service (at install time), or using rpm-ostree initramfs-etc --untrack (for later boots).

Related to this, in the future I'd like to also add support to rpm-ostree initramfs-etc to track files that aren't necessarily in the same directory, so e.g. they could do --track /etc/NetworkManager/system-connections-initrd:/etc/NetworkManager/system-connections. And then there'd be even less need for network kargs.

Comment on lines +13 to +15
# and they come from `--copy-network` or `iso network embed`,
ConditionPathExists=/run/coreos/firstboot-network-src
ConditionPathExists=/run/coreos/network-propagated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels weird to require these. Does it matter where the files came from as long as they end up in /etc/NetworkManager/system-connections/?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AIUI: if they came from Ignition, then they weren't needed in the initrd during first boot, so they shouldn't be needed in the initrd in later boots. If they came from kargs, then the disposition of those kargs should govern: either they'll exist on every boot, or they won't and we shouldn't propagate their effects forward.

I guess the network-propagated condition is for the corner case where the user provided some NM configs, but they looked default-y so we didn't copy them forward? If so, let's mention that.

Comment on lines +13 to +15
# and they come from `--copy-network` or `iso network embed`,
ConditionPathExists=/run/coreos/firstboot-network-src
ConditionPathExists=/run/coreos/network-propagated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AIUI: if they came from Ignition, then they weren't needed in the initrd during first boot, so they shouldn't be needed in the initrd in later boots. If they came from kargs, then the disposition of those kargs should govern: either they'll exist on every boot, or they won't and we shouldn't propagate their effects forward.

I guess the network-propagated condition is for the corner case where the user provided some NM configs, but they looked default-y so we didn't copy them forward? If so, let's mention that.


karg() {
local name="$1" value="${2:-}"
local cmdline=( $(rpm-ostree kargs) )
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We paste this function all over the place, so it might be worth calling out that we're not reading /proc/cmdline here.

Comment on lines +19 to +21
if [[ $(karg rd.neednet) == 1 ]]; then
rpm-ostree ex initramfs-etc --track /etc/NetworkManager/system-connections
fi
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the answer to the third question is "then they want networking in the initrd, and they'll get it".

For the first question, if they didn't supply firstboot NM keyfiles, the unit conditions will ensure we'll never get here. If they did, and they specified ip=, ...do we need to handle that case? Or can we just say "don't do that"?

@jlebon
Copy link
Member Author

jlebon commented Dec 17, 2021

Spoke to @dustymabe about this. Some concerns were raised around the conditionals in place. Some things we discussed:

  • We should try to cover the magic kargs auto-forwarding case too. To do this, we could have coreos-installer-service add another firstboot karg to signal that forwarding occurred and we can detect that.
  • We shouldn't care where the NM keyfiles came from firstboot kargs or NM keyfiles or target Ignition config proper, so we should be able to drop the conditional on /run/coreos/network-propagated.
  • If users use both --copy-network and --append-karg and use Tang, then we'll turn on initramfs-etc, but the persistent kargs will also be there. It's kind of a weird scenario we don't expect users to use because the persistent kargs will have no effect on first boot. But if they do (could imagine this as a day-2 operation), they will be there on subsequent boots where they may conflict with the overlayed NM configs. We should try to see if there's an elegant way to address this (which ideally doesn't involve detecting network kargs again like we do in coreos-installer-service). This might also lead to less conditionals.

@jlebon
Copy link
Member Author

jlebon commented Sep 8, 2023

I'll close this for now. The suboptimal status quo is described in the still-open coreos/fedora-coreos-tracker#886 but there's a real cost too to adding yet more complexity to our networking code.

@jlebon jlebon closed this Sep 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants