Allow specifying a guest OS in podman machine init. #19939

Open. davidchisnall wants to merge 3 commits into main from machine-os.

davidchisnall

This is propagated through the VM initialisation but is currently unused. This is the first step towards properly supporting multiple guest types (FreeBSD, Windows, and so on in addition to Linux).

Disclaimer: I haven't written Go for about 12 years and this is my first change to Podman, so I am almost certainly doing things in a non-idiomatic way and in the wrong places. I'm raising this PR now so that I don't keep pushing in the wrong direction.

Does this PR introduce a user-facing change?

Yes. This is the first in a set of changes required to be able to run a FreeBSD VM using podman machine commands and use that to run containers. With this and the changes in this tree, I am able to use podman to manage a FreeBSD VM and run FreeBSD containers on a Mac. I would like to upstream these changes.

This is the first patch in the series that teaches the machine parts that a VM may be one of a set of different operating systems (at some point, presumably, people may want to support a Windows VM for Windows containers). For Linux, I've tried to separate out the distro. Currently, for example, there are a lot of FCOS assumptions in the systemd units that are injected via the ignition file and would break on a Debian-based distro, but the mount command should work with any Linux.

This first commit does almost nothing with the OS identifier, except raise an error if you specify something other than FCOS but don't provide an image (this provides an extension point for plugging in downloads for other VM images).
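The check described above could look roughly like the following. This is a minimal sketch, not the code from the patch: `GuestOS` and `CoreOS` are the names visible in the review snippets, while the `FreeBSD` constant, `ErrImageRequired`, and `validateInit` are hypothetical names introduced for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// GuestOS mirrors the type name used in the patch. The FreeBSD constant
// and everything below it are illustrative assumptions.
type GuestOS uint32

const (
	CoreOS GuestOS = iota
	GenericLinux
	FreeBSD
)

// ErrImageRequired is a hypothetical error value for this sketch.
var ErrImageRequired = errors.New("an image must be provided for guests other than Fedora CoreOS")

// validateInit (hypothetical helper) rejects a non-FCOS guest when no
// image path was supplied, because only FCOS has a default download.
func validateInit(guest GuestOS, imagePath string) error {
	if guest != CoreOS && imagePath == "" {
		return ErrImageRequired
	}
	return nil
}

func main() {
	fmt.Println(validateInit(CoreOS, ""))               // default FCOS image is fine
	fmt.Println(validateInit(FreeBSD, ""))              // rejected: no image given
	fmt.Println(validateInit(FreeBSD, "freebsd.qcow2")) // user-provided image
}
```

A later change that teaches Podman where to download images for another guest would only need to relax this one condition.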

@openshift-ci
Contributor

openshift-ci bot commented Sep 12, 2023

Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci bot added the do-not-merge/release-note-label-needed label Sep 12, 2023
@openshift-ci
Contributor

openshift-ci bot commented Sep 12, 2023

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: davidchisnall
Once this PR has been reviewed and has the lgtm label, please assign umohnani8 for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@davidchisnall davidchisnall force-pushed the machine-os branch 2 times, most recently from d38f945 to af0e494 on September 12, 2023 10:09
@Luap99
Member

Luap99 commented Sep 12, 2023

Thanks for the contribution but I am not sure if this is a direction we would like to go.
Podman machine is primarily a simple way to run podman containers on windows/macos by using a VM.
We are already working on multiple hypervisor support and this is a lot of work, supporting different guest OSes directly would just explode the support matrix way beyond what we can handle. Podman machine is not a general purpose VM manager.

There is nothing stopping you from running podman in your own VMs with any OS and any hypervisor you like. You can set up the remote connection manually with podman system connection.

Before doing any more work it would make sense to first get an agreement from us maintainers on whether we would even accept a change like that.

@baude @containers/podman-maintainers PTAL

@giuseppe
Member

IMO giving users the freedom to pick something else could be possible, as long as it is clear we don't support it

@davidchisnall
Author

> Thanks for the contribution but I am not sure if this is a direction we would like to go.

That's why I opened the PR early.

> Podman machine is primarily a simple way to run podman containers on windows/macos by using a VM.

That is, indeed, my use case. I wish to use Podman on macOS to run FreeBSD containers (this works with my patch set). Running Windows containers on a Linux host is a similar use case that would build on top of the same abstractions.

> We are already working on multiple hypervisor support and this is a lot of work, supporting different guest OSes directly would just explode the support matrix way beyond what we can handle

You currently support a user-provided VM image and make no guarantees that this will work. I am not proposing any stronger guarantees from Podman than are provided today.

> There is nothing stopping you from running podman in your own VMs with any OS and any hypervisor you like.

This is true, but there are some assumptions about how host filesystems are shared into the VM and so on that add complexity if the VM is managed externally. It's also hard to integrate with anything that needs to manage the lifecycle of that VM if it is not managed by podman.

const (
	CoreOS GuestOS = iota
	// Any unknown Linux
	GenericLinux

Contributor

It would be better for everyone if this was just called "Linux", like in uname and like everywhere else (like Lima)

Author

I'm not attached to any of the naming, but I wanted to decouple specific-Linux-distribution from unknown-may-not-work-but-let's-try-it-anyway-Linux-distribution. Happy to call this Linux (though, note, it's a distribution here, not a kernel, so uname output is not relevant, lsb_release output might be) if that's preferable. It's there so that people bringing their own Linux distro can fall through into generic Linux code paths where appropriate.

Contributor

Eventually I scrolled down to the Kernel constants. Oh well, then maybe it would still apply.

After all, if systemd can call the file /etc/os-release, then we can call a distribution an OS.

type GuestOS uint32

const (
	CoreOS GuestOS = iota

Contributor

CoreOS is a Linux distribution, not sure why it needs a separate "GuestOS" from others (like Fedora, etc)

Author

I'm not sure I understand this comment, please can you elaborate?

Contributor

Why are "CoreOS" and "GenericLinux" (and so on, "Fedora" or "CentOS") different guest operating systems?

But I think you explained it above, it is about the distribution and not only about the operating system.

Author

Yup, there are things that make assumptions about, for example, dnf or systemd, that are per OS (which, in the Linux world, typically means distro), that someone may want to disentangle at some point, so I wanted to leave that as an extension point. Happy to make it simply Linux / FreeBSD [/ Windows / Fuchsia / whatever] if this isn't something that people want.

Contributor

That would be my preference, the rest is just different images IMHO.
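If the identifier is reduced to Linux / FreeBSD as agreed in this thread, mapping a user-supplied name onto it might look like the sketch below. `ParseGuestOS`, the accepted spellings, and the choice to treat an empty string as Linux are assumptions made for illustration and are not taken from the patch.

```go
package main

import (
	"fmt"
	"strings"
)

// GuestOS identifies the guest operating system only; the distribution
// is left to whichever image the user supplies.
type GuestOS uint32

const (
	Linux GuestOS = iota
	FreeBSD
)

// String returns the canonical name of the guest OS.
func (g GuestOS) String() string {
	switch g {
	case Linux:
		return "Linux"
	case FreeBSD:
		return "FreeBSD"
	default:
		return fmt.Sprintf("GuestOS(%d)", uint32(g))
	}
}

// ParseGuestOS (hypothetical) maps a name to a GuestOS, ignoring case.
// Assumption: an empty name falls back to Linux.
func ParseGuestOS(name string) (GuestOS, error) {
	switch strings.ToLower(strings.TrimSpace(name)) {
	case "", "linux":
		return Linux, nil
	case "freebsd":
		return FreeBSD, nil
	default:
		return Linux, fmt.Errorf("unsupported guest OS %q", name)
	}
}

func main() {
	for _, name := range []string{"", "freebsd", "plan9"} {
		guest, err := ParseGuestOS(name)
		fmt.Println(guest, err)
	}
}
```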

@afbjorklund
Contributor

@davidchisnall : we added a similar feature in Lima

# OS: "Linux", "FreeBSD".
# 🟢 Builtin default: "Linux"
os: null

Here is the proof-of-concept, for running CoreOS or FreeBSD VMs

@afbjorklund

This comment was marked as off-topic.

@dfr
Contributor

dfr commented Sep 12, 2023

> Podman machine is primarily a simple way to run podman containers on windows/macos by using a VM.

> As far as I know, Podman does not even support other distributions - let alone other operating systems...

> https://en.wikiquote.org/wiki/Henry_Ford#/media/File:1910Ford-T.jpg

Podman does work on FreeBSD quite well. I wouldn't say that this is supported by the project but they have been very supportive of my FreeBSD changes over the last year.

Contributor

@dfr dfr left a comment

I think the lint failures are because this new flag is not mentioned in the man page.

@baude
Member

baude commented Sep 12, 2023

We do allow for you to specify an image via command line already. We also allow for injection of an ignition file as well.

as others have mentioned, the purpose of podman machine really is about running containers. your idea is novel however, im just not sure about how it impacts the long-term strategy of podman machine.

do you know what, in total, changes would be needed to support the freebsd idea?

@davidchisnall
Author

davidchisnall commented Sep 13, 2023

Again, to be clear, I am not using Podman machine for anything other than its intended use case: running containers that are managed by Podman.

The branch that I linked to has the full set of changes required to be able to have Podman run FreeBSD containers on macOS.

It contains three patches, of which this is the largest. One is required as a workaround for FreeBSD not properly handling the ACPI shutdown event from QEMU (this works on x86 so it’s probably a fairly small fix for AArch64) and should ideally be fixed there. In the meantime, I’d like to upstream the workaround (there is no way to cleanly shut down the managed VM without it) and revert it once that’s fixed in FreeBSD.

The other change is related to mounting the exported shares. Podman connects to the guest VM and issues mount commands using Linux-specific commands. I factored that out into Linux and FreeBSD variants. Without any of these changes, I can run FreeBSD containers on macOS with Podman, I just can’t mount shares from the host (so, for example, I can’t easily use them as dev containers, which is one of my personal use cases).
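The Linux / FreeBSD split of the mount step can be sketched as a per-guest table of mount parameters. This is an illustration only: it assumes the shares are exported over virtio 9p, which the thread does not state, and the filesystem type and options shown for FreeBSD are placeholders that depend on the 9p driver in the guest image. `mountSpec`, `mountSpecs`, and `mountCommand` are hypothetical names.

```go
package main

import "fmt"

type GuestOS uint32

const (
	Linux GuestOS = iota
	FreeBSD
)

// mountSpec holds the per-guest pieces of the mount command line.
type mountSpec struct {
	fsType  string
	options string
}

// mountSpecs is an ASSUMPTION for illustration. The FreeBSD entry in
// particular is a placeholder, not the command used in the patch.
var mountSpecs = map[GuestOS]mountSpec{
	Linux:   {fsType: "9p", options: "trans=virtio,version=9p2000.L"},
	FreeBSD: {fsType: "p9fs", options: "trans=virtio"},
}

// mountCommand returns the argument vector to run inside the guest in
// order to mount the share identified by tag at target.
func mountCommand(guest GuestOS, tag, target string) ([]string, error) {
	spec, ok := mountSpecs[guest]
	if !ok {
		return nil, fmt.Errorf("no mount support for guest OS %d", guest)
	}
	return []string{"mount", "-t", spec.fsType, "-o", spec.options, tag, target}, nil
}

func main() {
	for _, guest := range []GuestOS{Linux, FreeBSD} {
		args, err := mountCommand(guest, "vol0", "/Users/me")
		if err != nil {
			panic(err)
		}
		fmt.Println(args)
	}
}
```

Returning an argument vector rather than a single string leaves quoting of paths to whatever runs the command over SSH.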

Longer term, I’d like to make this work with other hypervisors. Once the interface is stable the FreeBSD project can start building the Podman VM images and at that point I would like to add support for automatically fetching them from upstream, rather than requiring users to build their own.

I believe there is some interest from others for running Windows containers in a Windows VM on Mac and Linux systems and so I aimed in this PR to provide abstractions that would enable this later. I would also like to build on this PR to add support for managing multiple guest VMs. The OCI runtime spec describes Linux, Windows, Solaris, and System/Z containers, with the FreeBSD spec under development. I would like Podman to be able to run containers for any of these systems if the vendors provide appropriate VM images for it to use.

@dfr
Contributor

dfr commented Sep 13, 2023

As @davidchisnall mentioned, this PR is a starting point for two other fairly small changes to enable running FreeBSD native podman inside a suitable VM. As the FreeBSD port of podman stabilises, adding this support to 'podman machine' should make it easier for people to try out the port.

@baude
Member

baude commented Sep 13, 2023

how would customizations be done on boot for freebsd?

@dfr
Contributor

dfr commented Sep 13, 2023

> how would customizations be done on boot for freebsd?

David is best placed to answer that, but I think it's via the ignition file, as for Linux?

@davidchisnall
Author

> how would customizations be done on boot for freebsd?

What customisations do you have in mind? I am currently building the FreeBSD VM with these scripts. They consume enough of the ignition file that Podman produces to provision the SSH keys. They could consume more, but most of the ignition file is either installing things that we preinstall (e.g. podman) or setting up systemd things (which don't make sense on platforms that do not use systemd).
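To illustrate what consuming "enough of the ignition file to provision the SSH keys" involves, here is a minimal sketch in Go. It is not the linked scripts: it only assumes the Ignition v3 layout in which keys live under `passwd.users[].sshAuthorizedKeys`, and `authorizedKeys` is a hypothetical helper name.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ignitionConfig models only the part of an Ignition config that this
// sketch reads; all other sections are ignored by encoding/json.
type ignitionConfig struct {
	Passwd struct {
		Users []struct {
			Name              string   `json:"name"`
			SSHAuthorizedKeys []string `json:"sshAuthorizedKeys"`
		} `json:"users"`
	} `json:"passwd"`
}

// authorizedKeys (hypothetical helper) returns the SSH keys listed for
// each user, keyed by user name. Users without keys are omitted.
func authorizedKeys(raw []byte) (map[string][]string, error) {
	var cfg ignitionConfig
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return nil, fmt.Errorf("parsing ignition config: %w", err)
	}
	keys := make(map[string][]string)
	for _, user := range cfg.Passwd.Users {
		if len(user.SSHAuthorizedKeys) > 0 {
			keys[user.Name] = append(keys[user.Name], user.SSHAuthorizedKeys...)
		}
	}
	return keys, nil
}

func main() {
	// A made-up config fragment, not one generated by Podman.
	sample := []byte(`{
		"ignition": {"version": "3.2.0"},
		"passwd": {"users": [
			{"name": "core", "sshAuthorizedKeys": ["ssh-ed25519 AAAA example"]}
		]}
	}`)
	keys, err := authorizedKeys(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(keys["core"])
}
```

A guest-side provisioning step would then write each user's keys to that user's authorized_keys file; the systemd unit sections of the same config can simply be skipped on a guest that does not use systemd.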

There's more work to be done there to reduce the size of the VM images (for example, the host system kernel doesn't need loads of device drivers for things that don't exist in the VMs and the userland doesn't need a C/C++ toolchain).

In the medium term, I would like the FreeBSD release engineering team to take over building these just as they do the VM images for various cloud platforms. At that point, I would like to teach Podman to fetch them from the authoritative source.

@afbjorklund
Contributor

afbjorklund commented Sep 13, 2023

I think there is some upstream work ongoing to have cloud-init work with the FreeBSD images

https://www.freebsd.org/status/report-2023-01-2023-03/#_freebsd_as_a_tier_1_cloud_init_platform

@Luap99
Member

Luap99 commented Sep 13, 2023

> What customisations do you have in mind?

How do you configure the timezone? The env vars must be copied from the host into the VM, the SSL certs must be copied as well, and don't forget to configure containers.conf, registries.conf, and maybe more config files.
Most of this is not systemd specific.

Offering proper support for this is not trivial for us; machine code is still under heavy development with other hypervisor work. And if we say we do not officially support it, then why should we merge it in the first place? Something like this will put even more burden on us maintainers: code needs to be refactored, moved, tested, etc.

I am still missing why this should happen inside podman machine vs another generic VM manager?
I think our development focus is primarily on podman the container manager and I see podman machine as an easy way to get a podman container development setup going on windows and macos. I am not necessarily convinced that customizing the OS is important for that role.

@dfr
Contributor

dfr commented Sep 14, 2023

One use-case I had in mind for this was making it possible to use podman desktop with FreeBSD podman which I would personally find quite useful. The total size of that change from this PR and two other changes in David's queue is less than 200 lines of code changed.

As to the maintenance burden for having FreeBSD code in the tree, I'm always happy to help out, including fixing any issues which may arise from other linux-focused work. My goal for this is to get as much of podman as possible to work well on FreeBSD while keeping friction for the main linux support as small as possible.

@davidchisnall
Author

Thanks @dfr. I have two main use cases in mind:

  • Using local dev containers for FreeBSD work. Docker Desktop makes this easy for Linux, Podman now does as well. With Docker, the VM-management part is in the proprietary part not the open moby component, so it's impossible to extend theirs to support FreeBSD.
  • Working on FreeBSD containers locally to deploy into the cloud.

If you s/FreeBSD/Linux/, these are existing use cases for Podman so I am really surprised at the negativity in this PR: my goals seem to be completely aligned with stated goals of Podman.

> How do you configure the timezone? The env vars must be copied from the host into the VM, the SSL certs must be copied as well, and don't forget to configure containers.conf, registries.conf, and maybe more config files.
> Most of this is not systemd specific.

If it's in the ignition file, we can extract it. 90% of my work here has been automating the creation of FreeBSD VM images that meet the requirements of the podman machine interface to minimise the changes needed in podman.

Before I started working on this, I posted a question on the GitHub discussions asking about the interface between the host and the VM and got no replies. I looked for a design document and didn't find one, so I read the code to try to understand what it was doing. I assumed that most of those things were provided over the podman socket (do host environment variables really leak into the container? That seems like a security hole.) but if there's anything that needs to be provisioned on the guest then we can extend the VM images to extract it and put it in the right place.

With these patches, I can do podman image pull, podman container run and podman build to fetch FreeBSD container images, run containers based on them, and build FreeBSD containers from a Containerfile or Dockerfile, on a Mac.

> Offering proper support for this is not trivial for us; machine code is still under heavy development with other hypervisor work. And if we say we do not officially support it, then why should we merge it in the first place? Something like this will put even more burden on us maintainers: code needs to be refactored, moved, tested, etc.

If it's merged, a Mac user can do brew install podman and get something that they can use. If it isn't merged, I either need to create a podman-not-linux-only Homebrew package and confuse people doing the installation, or they need to build from source. Their podman will then not be the one that Podman Desktop uses, and so on.

You have two people volunteering to help maintain this code, which for something this small should be more than adequate.

> I am still missing why this should happen inside podman machine vs another generic VM manager?

Why does managing a Linux VM happen inside podman machine? Exactly the same arguments will apply because the use cases are identical.

> I think our development focus is primarily on podman the container manager and I see podman machine as an easy way to get a podman container development setup going on windows and macos. I am not necessarily convinced that customizing the OS is important for that role.

Again, that is 100% aligned with my objectives. I am using this so that I can have Podman manage the lifecycle of a VM that runs containers. Unless someone writes a FreeBSD compat layer for Linux, a Linux VM cannot run FreeBSD containers (the converse is not true, a lot of Linux containers work quite happily on a FreeBSD VM with the Linux compat ABI).

@afbjorklund
Contributor

> Unless someone writes a FreeBSD compat layer for Linux, a Linux VM cannot run FreeBSD containers

For the especially interested: https://medium.com/nttlabs/linux-subsystem-for-freebsd-500b9a88fda4

@dfr
Contributor

dfr commented Sep 15, 2023

> > Unless someone writes a FreeBSD compat layer for Linux, a Linux VM cannot run FreeBSD containers
>
> For the especially interested: https://medium.com/nttlabs/linux-subsystem-for-freebsd-500b9a88fda4

That's impressive! I just wanted to note that the FreeBSD Linux emulation happens in the kernel, which reduces the overhead to almost zero. It's also been around for more than 20 years by now and works very well for things that don't depend on linux-only things like cgroups or systemd.

@afbjorklund
Contributor

Forgot to mention this feature during the "Podman 5.0" meeting, and it didn't seem to be high on the roadmap...

The team is busy preparing Fedora CoreOS support on macOS* and Windows*, so the OS part seems to be settled?

https://podman.io/community "Podman Community Cabal"

Currently Podman Desktop can't even run machine on Linux.


* for the Virtualization.framework and Hyper-V hypervisors, that is.

@davidchisnall
Author

> The team is busy preparing Fedora CoreOS support on macOS* and Windows*, so the OS part seems to be settled?

Nothing in this PR should conflict with that goal. The code paths for Fedora CoreOS on macOS and Windows are largely unaffected. The next commit in the series adds a small refactoring that moves some of the Linux-guest code into a separate function.

> Currently Podman Desktop can't even run machine on Linux.

Without this patch series, there's little reason for it to want to: it can't run any guests other than Linux. This is the start of supporting other guest operating systems, which could be used to run FreeBSD containers on Linux (or Windows containers on Linux if the user provides a suitably licensed Windows VM image).

With these patches, users on Macs now, and hopefully other systems later, get the ability to run FreeBSD containers on Mac, sharing files from the host. This seems like a useful feature for users. The cost of maintenance is very low: if you break it, tag me in an issue and I'll fix it. If I don't in a timely fashion, remove the feature.

@vrothberg
Member

> The cost of maintenance is very low: if you break it, tag me in an issue and I'll fix it.

I think that is hard to measure and subjective. An already stressed-out maintainer may feel substantially more stressed if they had to deal with FreeBSD issues, even if that only means tagging somebody else. It's a time investment in one way or another.

I desire to keep the support matrix of podman machine as small as possible. There are a number of "providers" on the supported "platforms" and adding "os" would suddenly double the (testing) matrix.

> If I don't in a timely fashion, remove the feature.

That can reflect badly on the project.

I sympathize with the desire to also run FreeBSD in a podman machine but especially podman machine is still under very heavy development, so the costs of integrating code that maintainers do not understand are high.

@davidchisnall
Author

In my experience, discouraging contributors is not a good way of increasing the pool of potential maintainers, but maybe you've had a different experience.

@giuseppe
Member

is there any way we can make it not a burden for maintainers?

Having the possibility to play with different features upstream is something positive for the project, especially as there are people interested in maintaining it and we already have a FreeBSD port. As long as we make it clear that it is not supported.

@baude
Member

baude commented Sep 25, 2023

@davidchisnall we do not want to discourage contributors and that is not the intention of anyone here, esp @vrothberg who is one of the more developer friendly folks we have ... i wonder if you would be willing to come talk to me or the team here so we can give this conversation the proper once-over it deserves.

@davidchisnall
Author

> i wonder if you would be willing to come talk to me or the team here so we can give this conversation the proper once-over it deserves.

I would be very happy to. What's your preferred mechanism? I'm in the UK, but free much of this week.

@dfr
Contributor

dfr commented Sep 26, 2023

Would this be something suitable for the Podman Community Cabal meeting?

@TomSweeneyRedHat
Member

We're always looking for topics for the cabal meetings. If you're willing to at least start the discussion, please feel free to add the topic and your name to the agenda: https://hackmd.io/gQCfskDuRLm7iOsWgH2yrg?both. The next meeting is Thursday, October 19, 2023, at 11:00 a.m. EDT (UTC-4)

@baude
Member

baude commented Sep 29, 2023

@TomSweeneyRedHat could this be discussed in the upcoming community meeting?

@TomSweeneyRedHat
Member

We could talk about it if we have time at the community meeting next week, but that one is generally slotted for demos. I'll add it to the tail end of the Agenda for Tuesday Oct 3, 2023 at 11:00 am EDT (UTC-4). Agenda and video call details here: https://hackmd.io/fc1zraYdS0-klJ2KJcfC7w

If we don't have time then, I'll for sure add it to the Cabal Meeting on the 19th of October.

@davidchisnall
Author

I can give a short demo of running FreeBSD containers on a Mac, if that would be of interest.

@rhatdan
Member

rhatdan commented Sep 29, 2023

SGTM

@emaste

emaste commented Oct 3, 2023

> One is required as a workaround for FreeBSD not properly handling the ACPI shutdown event from QEMU (this works on x86 so it’s probably a fairly small fix for AArch64) and should ideally be fixed there.

@davidchisnall do you know if there's a bug open for this?

@davidchisnall
Author

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=273756

@vrothberg
Member

Just watched the recording of yesterday's community meeting as I had to leave early. Thanks for joining, @davidchisnall and @dfr! The idea of creating a FreeBSD "provider" sounds good to me. It would create a stronger separation of concerns and likely reduce the risk of interference between the various providers and platforms.

@TomSweeneyRedHat
Member

And in case you are looking for the video, it's here on YouTube.

@afbjorklund
Contributor

> Just watched the recording of yesterday's community meeting

> The idea of creating a FreeBSD "provider" sounds good to me.

Forking the provider based on the OS sounds strange to me, but Podman Machine and Lima take different approaches...

The statement about a "CoreOS appliance" sums it up quite nicely, I think. Like the "Podman machine" vs. "Linux machine"

@packit-as-a-service

Ephemeral COPR build failed. @containers/packit-build please check.

@davidchisnall
Author

Based on the discussions in and after the call, I've made the following changes:

  • Removed the unused abstraction.
  • Fixed the missing docs.
  • Pushed all of the commits that are necessary for supporting FreeBSD guests with QEMU, so that the total amount of code disruption is visible.

As discussed after the end of the recording, adding a new provider feels like overkill because the Linux vs FreeBSD changes in the QEMU provider are tiny (30 new lines of code in the second commit, a few existing ones moved into a separate function, and five new ones in the third commit, which hopefully can go away once FreeBSD bug 273756 is fixed).

I'd expect the changes in other providers to be a similar size. I no longer have access to a Windows machine to test Hyper-V changes but I'd like to work on the AppleHV ones (I can't currently get that to work at all on my machine, with the Linux guest, but hopefully soon).

This is propagated through the VM initialisation but is currently
unused.  This is the first step towards properly supporting multiple
guest types (FreeBSD, Windows, and so on in addition to Linux).

Signed-off-by: David Chisnall <github@theravensnest.org>

Signed-off-by: David Chisnall <github@theravensnest.org>

FreeBSD ought to shut down when it receives the ACPI power-off event
from QEMU, but for some reason does not.  For now, work around this by
ssh'ing in and running the poweroff command explicitly.

Signed-off-by: David Chisnall <github@theravensnest.org>
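The workaround in the third commit message can be sketched as a stop routine that chooses between the ACPI event and an explicit command over SSH. This is a hedged illustration, not the code in the commit: `stopper`, `sendACPIPowerOff`, and `runSSH` are hypothetical names, and the two function fields stand in for however the QEMU provider actually delivers the event and opens an SSH session.

```go
package main

import "fmt"

type GuestOS uint32

const (
	Linux GuestOS = iota
	FreeBSD
)

// stopper bundles the two ways of shutting a guest down. Both fields
// are injected so the sketch stays independent of QEMU and SSH details.
type stopper struct {
	sendACPIPowerOff func() error
	runSSH           func(command string) error
}

// stop shuts the guest down. FreeBSD guests get an explicit poweroff
// over SSH because they currently ignore the ACPI power-off event.
func (s stopper) stop(guest GuestOS) error {
	if guest == FreeBSD {
		return s.runSSH("poweroff")
	}
	return s.sendACPIPowerOff()
}

func main() {
	s := stopper{
		sendACPIPowerOff: func() error {
			fmt.Println("ACPI power-off event sent")
			return nil
		},
		runSSH: func(command string) error {
			fmt.Println("ssh:", command)
			return nil
		},
	}
	_ = s.stop(Linux)
	_ = s.stop(FreeBSD)
}
```

Once the FreeBSD bug is fixed, deleting the FreeBSD branch in `stop` would restore a single code path.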
@dfr
Contributor

dfr commented Oct 8, 2023

It seems to me that we need two abstractions: one to cover the VM lifecycle and one to cover the OS running in the VM. The second is currently needed only to isolate the differences in FS mounts between CoreOS and FreeBSD.

These could be merged, I suppose, but we would still need a way to share the VM management, perhaps by having a qemu provider with no implementation of FS mount and a FreeBSD qemu (or CoreOS qemu) which delegates to qemu for VM stuff and adds OS support to complete the interface.

It might be cleaner to keep them separate which ends up being similar to David's original change but packaged up into a Go interface.
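A minimal sketch of the "keep them separate" option, assuming hypothetical `Hypervisor` and `Guest` interfaces combined by embedding. None of these names come from Podman, `fakeQEMU` is an in-memory stand-in rather than a real provider, and the filesystem types returned by the guests are placeholders.

```go
package main

import "fmt"

// Hypervisor covers the VM lifecycle only (hypothetical interface).
type Hypervisor interface {
	Start(name string) error
	Stop(name string) error
}

// Guest covers what differs per guest OS (hypothetical interface).
type Guest interface {
	OSName() string
	MountArgs(tag, target string) []string
}

// Machine combines one implementation of each through embedding, so a
// hypervisor never needs to know which OS runs inside the VM.
type Machine struct {
	Hypervisor
	Guest
}

// fakeQEMU is an in-memory stand-in for a real provider.
type fakeQEMU struct{ running map[string]bool }

func (q *fakeQEMU) Start(name string) error {
	q.running[name] = true
	return nil
}

func (q *fakeQEMU) Stop(name string) error {
	if !q.running[name] {
		return fmt.Errorf("machine %q is not running", name)
	}
	delete(q.running, name)
	return nil
}

type coreOSGuest struct{}

func (coreOSGuest) OSName() string { return "coreos" }
func (coreOSGuest) MountArgs(tag, target string) []string {
	return []string{"mount", "-t", "9p", tag, target}
}

type freeBSDGuest struct{}

func (freeBSDGuest) OSName() string { return "freebsd" }
func (freeBSDGuest) MountArgs(tag, target string) []string {
	// Placeholder filesystem type; depends on the guest's 9p driver.
	return []string{"mount", "-t", "p9fs", tag, target}
}

func main() {
	qemu := &fakeQEMU{running: map[string]bool{}}
	for _, guest := range []Guest{coreOSGuest{}, freeBSDGuest{}} {
		m := Machine{Hypervisor: qemu, Guest: guest}
		if err := m.Start(m.OSName()); err != nil {
			panic(err)
		}
		fmt.Println(m.OSName(), m.MountArgs("vol0", "/Users/me"))
	}
}
```

With this shape, adding a hypervisor or a guest OS means adding one implementation rather than one per combination.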

github-actions bot commented Nov 8, 2023

A friendly reminder that this PR had no activity for 30 days.

Labels: do-not-merge/release-note-label-needed, stale-pr

10 participants