Description
Describe the bug
It appears that a sync on an ext4 file in the guest does not result in the data on the host also being sync'd out to disk.
To Reproduce
Using the fio I/O tester with the following config file:
[global]
rw=write
ioengine=sync
bs=32k
direct=0
size=128m
numjobs=1
fsync_on_close=1
loops=10
When running this on the host to establish a baseline, with the Linux perf utility recording call graphs, one sees many calls to sync on an ancient (Android) 4.14 kernel:
5.50% 0.00% fio [kernel.kallsyms] [k] sys_fsync
|
---sys_fsync
do_fsync
vfs_fsync_range
ext4_sync_file
file_write_and_wait_range
perf is run with
perf record -a --call-graph dwarf,16384 -F 200 -o /tmp/perf.data /tmp/fio /tmp/fio.job
The results are then examined with perf report and perf script.
When running the exact same fio binary and config in the VM, the bandwidth reported by fio is 4x that of the host baseline, indicating that the data is never actually sync'd to the medium; the page cache on the host is acting as a huge disk cache. Running perf on the host during the execution of the fio test in the VM shows no calls to sync.
The test is run against a virtio disk (/dev/vdb) that is formatted as ext4 on the host, specified in the list of devices in the VM's config.json, and mounted in the guest.
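For reference, the disk is attached through a drive entry in the VM's config.json along these lines (a sketch only; the drive_id and path_on_host here are placeholders, not my exact values):
{
  "drives": [
    {
      "drive_id": "scratch",
      "path_on_host": "/data/ext4-test.img",
      "is_root_device": false,
      "is_read_only": false
    }
  ]
}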
If perf is run in the VM (with a 5.7 kernel), then perf shows calls both to sync and to the ext4 write path in the VM kernel:
20.56% 0.03% fio [kernel.kallsyms] [k] new_sync_write
|
--20.54%--new_sync_write
ext4_file_write_iter
so I do not believe it is a problem with fio.
It does not seem to matter what the VM config is. Firecracker is built from master as of 31 August 2020.
Expected behaviour
I would have expected synchronous IO in the VM to be worse than on the host, not 4x faster.
Environment
Firecracker v0.21.0-473-gf2f6d8f8
Host: Android 4.14
Guest: 5.7 kernel.org
rootfs: Alpine
Arch: arm64
perf version 4.14.133.g2e3484a
fio-3.16-32-g8c302
Running on an Android system with a 2-core A57 and a 14 GiB eMMC.
[ 0.490202] mmc0: new HS400 MMC card at address 0001
[ 0.491261] mmcblk0: mmc0:0001 S0J56X 14.8 GiB
Additional context
This is an embedded application, and because of power loss issues, data that is synchronously written in the guest needs to be resident on media.
I am collecting baseline performance numbers for both disk and network so that my masters can see how good Firecracker is as opposed to all the other VM alternatives. 👍
I am a C programmer and hardly know Rust. However, I wonder if there is an issue in src/devices/src/virtio/block/request.rs, because the write is performed against a memory object, yet the flush is performed against a disk object?
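To illustrate what I mean with plain std Rust (a minimal sketch, not Firecracker's actual code; /tmp/ext4-test.img is just a stand-in for the file backing /dev/vdb): calling flush() on a std::fs::File only drains userspace buffering and forces nothing to the medium, whereas sync_all() issues an fsync(2). If the device's flush path ends up doing the former instead of the latter, the guest's fsync never reaches the eMMC.
use std::fs::OpenOptions;
use std::io::Write;

fn main() -> std::io::Result<()> {
    // Stand-in for the host file backing /dev/vdb (hypothetical path).
    let mut backing = OpenOptions::new()
        .read(true)
        .write(true)
        .create(true)
        .open("/tmp/ext4-test.img")?;

    // Guest write: this only lands in the host page cache.
    backing.write_all(&[0u8; 32 * 1024])?;

    // flush() is a no-op for File -- nothing is forced out to the medium.
    backing.flush()?;

    // What a guest flush/fsync should translate to on the host: fsync(2).
    backing.sync_all()?;

    Ok(())
}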
In the guest:
cat /proc/partitions
major minor #blocks name
254 0 32768 vda
254 16 614400 vdb
254 32 9736 vdc
/ # mount /dev/vdb /bar
[ 32.101662] EXT4-fs (vdb): recovery complete
[ 32.104363] EXT4-fs (vdb): mounted filesystem with ordered data mode. Opts: (null)
/ # df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/root 27633 26978 0 100% /
devtmpfs 117700 0 117700 0% /dev
/dev/vdb 588352 177272 392648 31% /bar
Checks
- [x] Have you searched the Firecracker Issues database for similar problems?
- [x] Have you read the existing relevant Firecracker documentation?
- [x] Are you certain the bug being reported is a Firecracker issue?