Add a GitHub workflow running OpenBSD. #1405
Open
3405691582 wants to merge 1 commit into swiftlang:main from 3405691582:ci
+70 −0
Contributor
I've asked @shahmishal to take a look as his team handles CI and GHA stuff.
Background
CI solutions are a well-known mechanism for ensuring that code gets exercised or tested and that problems are detected as early as possible. Swift uses a Jenkins-based solution, and if additional nodes are to test additional configurations, this must be done by the community, requiring careful integration with the rest of the existing CI infrastructure, complex software setups, large dependencies, and, most importantly, costly cloud resources.
GitHub workflows are an alternative mechanism that enjoys some support on Swift satellite projects. Crucially, these can use software containers to provide clean and reproducible environments in which to run code targeting a specific userspace distribution for a given kernel. Concretely, this means that a container host running Ubuntu can test in a Debian environment, for example, since the Debian container shares the same kernel as the Ubuntu host. This becomes complicated when we want to test platforms that do not share the same kernel.
Cross-compilation is one approach, but it is incomplete: for example, the target may have platform-specific runtime requirements that would not be exercised when cross-compiling. The obvious solution to reach for is virtualization: creating a container that runs a virtual machine with our alternate kernel. If we have a container that runs a virtual machine for our target containing a Swift toolchain, can pass it the code that we have checked out, and can get back the results, then we can simply run that container as a GitHub workflow and achieve our goal.
There are some difficulties in this, naturally. We need an operating system image and a regular scheme to provide the VM with inputs and outputs. While there have been many advances in schemes such as `virtio` to ensure data between the host and guest is transferred efficiently, specifically `virtio-9p-pci`, `virtio-vsock`, or even `virtio-pmem`, these require support from within the guest. Disk devices enjoy more robust guest support.
While we can use Containerfiles to orchestrate the behavior of the Linux-based container running the virtual machine, we need a way to orchestrate the inner VM's behavior as well without requiring user input. The `cloud-init` instance initialization system, used with virtual machines running in cloud infrastructure, provides a natural solution for us here. The cloud-init system allows virtual machines to be initialized from an HTTP server or from local media (referred to as NoCloud). Running an HTTP server isolated to a container and scoped solely to a VM can be tedious to get right. Here, we use local media: if a virtual machine with cloud-init installed boots with a FAT or ISO9660 image that is labeled CIDATA and contains the two files `meta-data` and `user-data`, cloud-init will use the data within to set up the instance as described in those files.
The operating system running in the virtual machine should ideally have minimal dependencies, so that an accurate accounting of the additional dependencies required can be made. cloud-init, however, has several dependencies, chiefly on Python. pyinstaller can be used to prepackage those dependencies into a single standalone binary.
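As a sketch of the NoCloud arrangement described above (the file contents and names here are illustrative, not the ones shipped in this commit):

```sh
# Minimal NoCloud seed: meta-data and user-data files on an ISO9660
# volume labeled CIDATA. Contents are illustrative placeholders.
cat > meta-data <<'EOF'
instance-id: swift-ci-vm
local-hostname: openbsd-ci
EOF

cat > user-data <<'EOF'
#cloud-config
# Site-specific configuration (e.g. the rc.local script described
# below) goes here.
EOF

# xorriso builds the ISO9660 image; Rock Ridge extensions preserve
# file names that FAT could not represent.
xorriso -as mkisofs -output seed.iso -volid CIDATA -joliet -rock \
  meta-data user-data
```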
Implementation
To run Swift inside of a VM in a GitHub workflow, we need the following: an operating system disk image, a pyinstaller-prepared version of cloud-init, a toolchain, and the commands to run `swift build` or `swift test` against a Swift project. The toolchain and its dependencies could be installed by prepending extra package installation steps before running `swift` in the VM environment, but this requires time and network resources every time the VM executes. Ideally, the operating system image would already have everything minimally required preinstalled.
As OpenBSD does not support the newer virtio features that enhance passing data between the host and the guest VM, we need to use disk images. For input data, we have two options: providing the VM with a single disk that can be used for external inputs or additional scratch space, or supplying a read-only disk image with external inputs and a read-write image for scratch. The latter approach turns out to be more natural. The CIDATA volume must be either FAT or ISO9660, but FAT has many limitations, specifically on file names. If we want to share additional data on the volume, ISO9660 (with extensions) is more attractive, but it is read-only inside the VM.
Output data must be read by the Linux container. While OpenBSD and Linux both support e2fsprogs, which allows for minting ext2 volumes without mounting, extracting files from an ext2 image without mounting is more difficult, especially since containers cannot easily mount disks. Instead, we exploit tar archives: when the VM wants to transmit data back to the host, it writes the tape archive file directly to the virtualized disk drive. The host converts the disk drive image from qcow2 back to raw format and reads the disk image as an ordinary tape archive. Some care may be required to specify the correct disk image size to ensure proper format blocking.
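On the host side, the extraction step might look roughly like this (image and directory names are illustrative):

```sh
# The VM wrote a tar archive directly to the raw "tape" disk, so
# converting the qcow2 image back to raw yields an ordinary archive.
qemu-img convert -f qcow2 -O raw tape.qcow2 tape.raw
mkdir -p output
tar -xf tape.raw -C output
```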
We thus need to ensure that we know which disk inside the VM corresponds to which disk outside the VM, so disks are specified in the same predictable order: OS image, "scratch" disk, "tape" disk, then CIDATA.
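A qemu invocation following this ordering might look like the sketch below; the flags and file names are assumptions for illustration rather than the exact wrapper script in this commit.

```sh
# Attach disks in a fixed order so device names inside the VM line up
# with the images outside it: OS, scratch, tape, then CIDATA.
KVM_FLAG=""
[ -w /dev/kvm ] && KVM_FLAG="-enable-kvm"

qemu-system-x86_64 -m 4G -smp 2 -nographic $KVM_FLAG \
  -drive file=openbsd.qcow2,format=qcow2,if=virtio \
  -drive file=scratch.qcow2,format=qcow2,if=virtio \
  -drive file=tape.qcow2,format=qcow2,if=virtio \
  -drive file=seed.iso,format=raw,media=cdrom
```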
The steps we want the VM to take need to occur automatically when the operating system boots. For OpenBSD, rc.firsttime(8) is already reserved to start cloud-init, but we also have rc.local(8) available for site-specific commands. This script runs during boot as root, which leads to some quirks, but it makes the cloud-init configuration simple: our commands need only be specified as an ordinary script in rc.local as part of the `user-data` cloud-init configuration: one that performs our necessary tasks, writes any necessary output to tape, and then powers off the VM.
Since qemu will still exit successfully even if commands running in the VM fail, we write the exit code from within the VM to tape and have the container exit explicitly with that exit code, to properly communicate success or failure as needed.
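The rc.local script delivered via `user-data` might look roughly like this; the project path and the `/dev/rsd2c` device name (the raw device for the third virtio disk, i.e. the "tape" disk) are assumptions for illustration.

```sh
#!/bin/sh
# Sketch of an rc.local(8) script: run the tests, record the exit
# code, write everything to the tape disk, then power off.
cd /home/build/project
swift test > /tmp/build.log 2>&1
echo $? > /tmp/exit-code

# Write the results directly to the raw "tape" device as a tar archive.
tar -cf /dev/rsd2c -C /tmp build.log exit-code

shutdown -p now
```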
I have already constructed the initial OpenBSD disk images, a pyinstaller version of cloud-init, and a version of the Swift 6.2 toolchain, installed them alongside the toolchain's necessary dependencies, and prepared a disk image for use in the container in this commit, `openbsd-swift`. This container is Alpine Linux-based, chosen for low overhead, with qemu installed, along with qemu-img to create the additional scratch and tape disks and xorriso to create the ISO9660 image with the cloud-init files and any other read-only inputs. When the container is run, the qemu wrapper script runs these tools to enforce the above requirements, runs the VM (optionally with KVM acceleration) until it exits, and then extracts the output tape. Thankfully, Linux GitHub runners are reported to consistently support KVM, which means that there is limited performance impact.
The image is also configured with a volume mount point at `/usr/local/share/cidata` for potential use outside of this GitHub workflow; here, we initialize this volume from within the binary via environment variables, so that neither a volume nor additional files are necessary.
It is important to note that while this commit and approach are intended for OpenBSD, they are certainly not limited to it. This approach could be utilized for other platforms, and may be even more efficient if those platforms support alternate virtio features.
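Running the container by hand might look like the following; only the `openbsd-swift` image name and the environment-variable mechanism come from this commit, while the variable name and output path are hypothetical.

```sh
# Run the VM container, with KVM acceleration where available.
# CIDATA contents are passed via environment variables rather than a
# mounted volume; CI_USER_DATA is a hypothetical variable name.
podman run --rm \
  --device /dev/kvm \
  -e CI_USER_DATA="$(cat user-data)" \
  -v "$PWD/output:/output" \
  openbsd-swift
```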
Caveats
The toolchain used in this image is still a work in progress. As the toolchain is refined and eventually upstreamed, this container can be updated without needing the workflow to change. The base image, pyinstaller, and toolchain containers will be described elsewhere; the `Containerfile` to create the `openbsd-swift` image may eventually end up on swiftlang/swift-docker. This means that, for now, the container image is relatively opaque.
The workflow does not yet exist on swiftlang/github-workflows; we use swift-testing as a pilot for this workflow before making it available more widely there.
Part of the motivation for introducing this workflow is to detect platform support bugs faster. However, this does mean that platform support bugs may still be present, and blocking commits unnecessarily may be undesirable. To mitigate this, the workflow is configured to run only on demand, rather than triggering on every pull request. The workflow can be used to manually test a pull request by specifying the pull request's branch when the workflow is invoked.
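For example, invoking the on-demand workflow against a pull request branch with the GitHub CLI might look like this (the workflow file name is hypothetical):

```sh
# Trigger the on-demand OpenBSD workflow for a specific branch.
gh workflow run openbsd.yml --ref my-feature-branch
```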
Pull request
Add a GitHub workflow running OpenBSD.
Motivation:
See above.
Modifications:
See above.
Checklist: