Commit 34955a9

docs on creating copy-on-write filesystems
Signed-off-by: Tobias Pfandzelter <pfandzelter@campus.tu-berlin.de>
1 parent bf0cd5f commit 34955a9

1 file changed (+218, -0 lines changed)

docs/overlay-filesystem.md

Lines changed: 218 additions & 0 deletions
@@ -0,0 +1,218 @@
# Let Multiple Firecracker VMs Share a Root Filesystem with Copy-on-Write

An overlay (copy-on-write) filesystem lets multiple microVMs share a common read-only
filesystem on the host. Each microVM can still write changes to that filesystem
by using its own overlay. By default, files are read from the underlying root filesystem.
All changes are written to the overlay by copying the file and writing the modified
copy. If such a copy exists on the overlay, it takes precedence over whatever is
in the root filesystem.

As used by [`firecracker-containerd`](https://github.com/firecracker-microvm/firecracker-containerd),
this requires a root filesystem in `squashfs` format mounted as read-only and a write
layer, which can be either a temporary `tmpfs` in guest memory or a sparse file
formatted as `ext4` on the host. The latter method has the advantage that changes
can be persisted across microVM reboots if required.

Please note that this requires changes on the guest and is thus only possible
if you control the guest's init.

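The lookup rule described above can be illustrated without mounting anything.
The following is a minimal sketch that only imitates the precedence rule with
two plain directories. The function name `resolve` and the directory names are
made up for this example, and a real overlay filesystem does considerably more
(for example, it also tracks deletions).

```bash
# Illustration only: imitate the overlay lookup rule with plain directories.
# "resolve", "lower" and "upper" are hypothetical names for this sketch.

# Print the path a read of file $3 would be served from: the upper
# (writable) copy if it exists, otherwise the lower (read-only) one.
resolve() {
    lower="$1"
    upper="$2"
    file="$3"
    if [ -e "$upper/$file" ]; then
        echo "$upper/$file"
    elif [ -e "$lower/$file" ]; then
        echo "$lower/$file"
    else
        return 1
    fi
}

demo=$(mktemp -d)
mkdir -p "$demo/lower/etc" "$demo/upper/etc"
echo "shared" > "$demo/lower/etc/hostname"

# No copy in the upper directory yet: the read is served from lower.
resolve "$demo/lower" "$demo/upper" etc/hostname

# A "write" copies the file up and modifies the copy; lower stays untouched.
cp "$demo/lower/etc/hostname" "$demo/upper/etc/hostname"
echo "vm-1" > "$demo/upper/etc/hostname"

# The copy in the upper directory now takes precedence.
resolve "$demo/lower" "$demo/upper" etc/hostname
```

Each microVM would have its own upper directory, while all of them share the
same lower one.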
## Convert rootfs to squashfs

If you already have an existing `rootfs` file formatted as `ext4`, e.g., created
according to the [rootfs-and-kernel-setup](https://github.com/firecracker-microvm/firecracker/blob/main/docs/rootfs-and-kernel-setup.md)
documentation, you can simply mount it and create a new `squashfs` formatted filesystem
from that.

This requires `mksquashfs`, which is available as part of the `squashfs-tools`
package for your distribution.

1. Create a mount point

   ```bash
   mkdir /tmp/my-rootfs
   ```

1. Mount the existing rootfs (e.g., `rootfs.ext4`). If you don't have an existing
   rootfs, you can skip this step and copy your files directly into `/tmp/my-rootfs`.

   ```bash
   sudo mount rootfs.ext4 /tmp/my-rootfs
   ```

1. Create the directories needed for mounting the overlay filesystem. These mount points
   have to be created now, as the microVM will not be able to change anything on
   the read-only filesystem.

   ```bash
   sudo mkdir -p /tmp/my-rootfs/overlay/root \
       /tmp/my-rootfs/overlay/work \
       /tmp/my-rootfs/mnt \
       /tmp/my-rootfs/rom
   ```

1. Create the `overlay-init` script (adapted from [overlay-init of firecracker-containerd](https://github.com/firecracker-microvm/firecracker-containerd/blob/main/tools/image-builder/files_debootstrap/sbin/overlay-init)).
   The heredoc delimiter is quoted so that the shell does not expand the script's
   variables while writing the file, and the script is made executable so that the
   kernel can run it as init.

   ```bash
   cat > overlay-init <<'EOF'
   #!/bin/sh
   # Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved
   #
   # Licensed under the Apache License, Version 2.0 (the "License"). You may
   # not use this file except in compliance with the License. A copy of the
   # License is located at
   #
   # <http://aws.amazon.com/apache2.0/>
   #
   # or in the "license" file accompanying this file. This file is distributed
   # on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
   # express or implied. See the License for the specific language governing
   # permissions and limitations under the License

   # Parameters
   # 1. rw_root -- path where the read/write root is mounted
   # 2. work_dir -- path to the overlay workdir (must be on same filesystem as rw_root)
   # Overlay will be set up on /mnt, original root on /mnt/rom
   pivot() {
       local rw_root work_dir
       rw_root="$1"
       work_dir="$2"
       /bin/mount \
           -o noatime,lowerdir=/,upperdir=${rw_root},workdir=${work_dir} \
           -t overlay "overlayfs:${rw_root}" /mnt
       pivot_root /mnt /mnt/rom
   }

   # Overlay is configured under /overlay
   # Global variable $overlay_root is expected to be set to either
   # "ram", which configures a tmpfs as the rw overlay layer (this is
   # the default, if the variable is unset)
   # - or -
   # A block device name, relative to /dev, in which case it is assumed
   # to contain an ext4 filesystem suitable for use as a rw overlay
   # layer. e.g. "vdb"
   do_overlay() {
       local overlay_dir="/overlay"
       if [ "$overlay_root" = ram ] ||
           [ -z "$overlay_root" ]; then
           /bin/mount -t tmpfs -o noatime,mode=0755 tmpfs /overlay
       else
           /bin/mount -t ext4 "/dev/$overlay_root" /overlay
       fi
       mkdir -p /overlay/root /overlay/work
       pivot /overlay/root /overlay/work
   }

   # If we're given an overlay, ensure that it really exists. Panic if not
   if [ -n "$overlay_root" ] &&
       [ "$overlay_root" != ram ] &&
       [ ! -b "/dev/$overlay_root" ]; then
       echo -n "FATAL: "
       echo -n "Overlay root given as $overlay_root but "
       echo "/dev/$overlay_root does not exist"
       exit 1
   fi

   do_overlay

   # invoke the actual system init program and proceed with the boot
   # process
   exec /sbin/init "$@"
   EOF

   sudo cp overlay-init /tmp/my-rootfs/sbin/overlay-init
   sudo chmod +x /tmp/my-rootfs/sbin/overlay-init
   ```

1. Create a `squashfs` formatted filesystem

   ```bash
   sudo mksquashfs /tmp/my-rootfs rootfs.img -noappend
   ```

1. Unmount the old rootfs (if mounted in step 2).

   ```bash
   sudo umount /tmp/my-rootfs
   ```

The rootfs is now prepared: `rootfs.img` is the image to use as the root device.

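Because the image is read-only, a missing mount point or a non-executable init
script only shows up when the guest fails to boot. A small pre-flight check, run
before the `mksquashfs` step, can catch this earlier. This is a sketch: the
function name `check_rootfs_tree` is hypothetical, and the list of paths is
taken from the steps above.

```bash
# Hypothetical helper: verify that a rootfs tree contains everything the
# steps above create before it is packed into a squashfs image.
check_rootfs_tree() {
    root="$1"
    status=0
    for dir in overlay/root overlay/work mnt rom; do
        if [ ! -d "$root/$dir" ]; then
            echo "missing directory: $root/$dir" >&2
            status=1
        fi
    done
    if [ ! -x "$root/sbin/overlay-init" ]; then
        echo "missing or not executable: $root/sbin/overlay-init" >&2
        status=1
    fi
    return "$status"
}

# Example: check_rootfs_tree /tmp/my-rootfs && echo "rootfs tree looks complete"
```

The function reports every problem it finds instead of stopping at the first
one, so a single run lists everything that still has to be fixed.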
## Creating an ext4 Formatted Persistent Overlay

To allow microVMs to save persistent files that are available after a reboot, we
need to create an `ext4` image to use as an overlay. If data does not need to be
available again after a reboot, you can skip this step, as it is possible to use
an in-memory `tmpfs` as an overlay instead.

1. Create the image file. We will use a size of 1 GiB (1024 MiB), but this can
   be increased.

   ```bash
   dd if=/dev/zero of=overlay.ext4 conv=sparse bs=1M count=1024
   ```

   The file is created as a sparse file, so it only uses as much disk space as
   it currently needs. Note that this requires your host filesystem to support
   sparse files. The file size may still be reported as 1 GiB (the file's
   _apparent size_). The space it actually occupies on disk can be checked with
   the following command (which should report 0 right now):

   ```bash
   du -h overlay.ext4
   ```

   `du` can also be used to report the apparent size of a file (1 GiB in this
   example):

   ```bash
   du -h --apparent-size overlay.ext4
   ```

1. Create an `ext4` file system on the image file.

   ```bash
   mkfs.ext4 overlay.ext4
   ```

Done! The overlay is ready now. Note that you need to create **one filesystem per
microVM**.

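Since each microVM needs its own write layer, the two steps above are typically
repeated once per VM. The following is a minimal sketch, assuming that
`truncate` from GNU coreutils is available as an alternative way to create the
sparse file. The function name `create_overlays` and the naming scheme
`overlay-N.ext4` are illustrative choices, not part of Firecracker.

```bash
# Hypothetical helper: create COUNT sparse overlay images of SIZE_MIB MiB
# each in directory DIR, named overlay-1.ext4, overlay-2.ext4, ...
create_overlays() {
    count="$1"
    size_mib="$2"
    dir="$3"
    i=1
    while [ "$i" -le "$count" ]; do
        img="$dir/overlay-$i.ext4"
        # truncate extends the file without writing data, so it starts out sparse
        truncate -s "${size_mib}M" "$img" || return 1
        # Format the image if mkfs.ext4 is installed: -F forces creation on a
        # regular file without prompting, -q suppresses the progress output.
        if command -v mkfs.ext4 >/dev/null 2>&1; then
            mkfs.ext4 -q -F "$img" || return 1
        else
            echo "mkfs.ext4 not found, $img left unformatted" >&2
        fi
        i=$((i + 1))
    done
}

# Example: three 1 GiB overlays in the current directory
# create_overlays 3 1024 .
```

Because the images are sparse, creating many of them is cheap: disk space is
only consumed as the guests actually write data.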
## Configure the rootfs and Kernel Boot Parameters

To actually use the overlay filesystem correctly, you will need to adapt your Firecracker
configuration and boot parameters for your microVMs.

First, mount the new `squashfs` root filesystem as read-only. Note that this step
is optional but recommended. Simply set the `is_read_only` parameter in your Firecracker
disk parameters to `true` for the root device.

Second, set the `init` boot parameter to `/sbin/overlay-init` to execute the initialization
of our overlay filesystem before starting the rest of the microVM's init process.
If you set the `overlay_root` boot parameter to `ram` or leave it unset, a `tmpfs` will
be created and used as the write layer. Otherwise, add the `overlay.ext4` as a second drive
and set `overlay_root` to `vdb` (or attach it as a third drive and set `overlay_root`
to `vdc`, etc.).

```json
{
  "boot-source": {
    "kernel_image_path": "vmlinux",
    "boot_args": "console=ttyS0 reboot=k panic=1 pci=off overlay_root=vdb init=/sbin/overlay-init"
  },
  "drives": [
    {
      "drive_id": "rootfs",
      "path_on_host": "rootfs.img",
      "is_root_device": true,
      "is_read_only": true
    },
    {
      "drive_id": "overlayfs",
      "path_on_host": "overlay.ext4",
      "is_root_device": false
    }
  ],
  "machine-config": {
    "vcpu_count": 2,
    "mem_size_mib": 1024
  }
}
```
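
With one overlay image per microVM, each VM also needs its own configuration
file that differs only in the overlay path. The following is a minimal sketch of
generating such files from a shell function. The function name
`write_vm_config` and the file names in the usage example are assumptions made
for illustration; the JSON fields are the ones shown above.

```bash
# Hypothetical helper: write a Firecracker config that boots the shared
# squashfs rootfs read-only and attaches one VM-specific overlay image.
# The paths are inserted verbatim, so they must not contain characters
# that need escaping in JSON (such as double quotes or backslashes).
write_vm_config() {
    rootfs="$1"
    overlay="$2"
    out="$3"
    # Unquoted heredoc on purpose: $rootfs and $overlay are expanded here.
    cat > "$out" <<EOF
{
  "boot-source": {
    "kernel_image_path": "vmlinux",
    "boot_args": "console=ttyS0 reboot=k panic=1 pci=off overlay_root=vdb init=/sbin/overlay-init"
  },
  "drives": [
    {
      "drive_id": "rootfs",
      "path_on_host": "$rootfs",
      "is_root_device": true,
      "is_read_only": true
    },
    {
      "drive_id": "overlayfs",
      "path_on_host": "$overlay",
      "is_root_device": false
    }
  ],
  "machine-config": {
    "vcpu_count": 2,
    "mem_size_mib": 1024
  }
}
EOF
}

# Example: same rootfs, different overlays
# write_vm_config rootfs.img overlay-1.ext4 vm-1.json
# write_vm_config rootfs.img overlay-2.ext4 vm-2.json
```

Each generated file can then be passed to its own Firecracker process, for
example with the `--config-file` option.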
