
Bad latency reading large files from mounted volumes #6553

Closed
evandeaubl opened this issue Nov 26, 2022 · 4 comments · Fixed by #6575

@evandeaubl commented Nov 26, 2022

Bug Report

Description

When I attempt to read large files (multiple GB) from volumes mounted into pods on a Talos cluster, the time to read even one byte of the file is insanely long, and it seems to scale with the size of the file. I've watched transfer metrics, and it looks like the entire file is being read in at open time (!?!). None of the readahead settings I know of look out of whack (the read_ahead_kb setting for the device is 128, i.e. 128 KiB, and filesystem readahead looks okay, but those probably aren't relevant in light of the info in the next paragraph).
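For reference, the per-device readahead can be checked like this (a quick sketch, not taken from the issue; vdb is a placeholder for whichever device backs the volume):

cat /sys/block/vdb/queue/read_ahead_kb   # readahead in KiB, 128 by default
blockdev --getra /dev/vdb                # same value in 512-byte sectors (256 = 128 KiB)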

So far, it does not matter which filesystem is mounted or which CSI driver the mount uses; I have replicated it with RBD, CephFS, geesefs S3, OpenEBS lvm-localpv, and plain old built-in local volumes. hostPath mounts and local volume mounts on the system disk are the only mounts where I have not been able to replicate this behavior.

I have not been able to replicate this issue when I installed the same version of k3s or k8s via kubeadm in the same environment, which is why I'm thinking this is a Talos issue.

Logs

I haven't found anything in the logs that seems relevant, although when I attach strace to the second dd in the reproducer below, the long hang occurs during the openat() call that opens the file from the mounted volume. Let me know if there are any logs you would like me to provide, but hopefully the reproducer is simple enough that you can replicate it in your environment.
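For anyone who wants to repeat that trace, a rough sketch of the strace run inside the pod (the exact invocation isn't shown here, and the alpine image needs strace installed first, e.g. apk add strace):

strace -T -e trace=openat,read \
  dd if=/data/zerofile of=/dev/null bs=1K count=1
# -T prints the time spent in each syscall; the stall shows up on the
# openat() of /data/zerofile rather than on the subsequent read()s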

Environment

  • Talos version: 1.2.7
  • Kubernetes version: 1.25.4
  • Platform: metal

Reproducer

  1. Spin up a QEMU cluster using sudo --preserve-env=HOME talosctl cluster create --provisioner qemu --extra-disks 1 --extra-disks-size 20480.
  2. Deploy the PV/PVC/pod from the attached manifests (an illustrative sketch of such manifests follows the steps below).

pvc.yaml.txt
alpine.yaml.txt

  3. Run kubectl exec -it pod/alpine -- /bin/sh
  4. Run dd if=/dev/zero of=/data/zerofile bs=4M count=4096 to create a 16 GiB file.
  5. Run time dd if=/data/zerofile of=/dev/null bs=1K count=1 to read the first KB of that file.
  6. Observe that the time to run the second dd is very long.
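The attached manifests aren't inlined above, so here is an illustrative sketch of what such a PV/PVC/pod could look like (the names, the /dev/vdb device path, the node name, and the use of a local PersistentVolume are assumptions, not the contents of the attached files):

kubectl apply -f - <<'EOF'
apiVersion: v1
kind: PersistentVolume
metadata:
  name: test-pv
spec:
  capacity:
    storage: 20Gi
  accessModes: ["ReadWriteOnce"]
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /dev/vdb          # the extra disk created by talosctl cluster create (assumed)
    fsType: ext4            # the filesystem the issue reproduces with
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values: ["talos-default-worker-1"]   # placeholder node name
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-pvc
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: local-storage
  resources:
    requests:
      storage: 20Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: alpine
spec:
  containers:
    - name: alpine
      image: alpine:3.17
      command: ["sleep", "1000000"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: test-pvc
EOF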

Sample reproducer output (on my dev laptop with QEMU VMs running on NVMe):

[evan@nitrogen test]$ kubectl exec -it pod/alpine -- /bin/sh
/ # dd if=/dev/zero of=/data/zerofile bs=4M count=4096
4096+0 records in
4096+0 records out
/ # time dd if=/data/zerofile of=/dev/null bs=1K count=1
1+0 records in
1+0 records out
real	0m 46.57s
user	0m 0.00s
sys	0m 46.50s
/ # 
smira self-assigned this Nov 28, 2022
smira added this to the v1.3 milestone Nov 28, 2022
smira (Member) commented Nov 30, 2022

I can reproduce this issue, and it doesn't seem to be anything Talos-specific at the moment; the volume is just a regular mount on the host.

It's also interesting that it only happens the first time: if you run the command once again, it succeeds immediately. It might be something ext4-related. Needs more digging.
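A quick way to check how much of that "only the first read is slow" behaviour is just the page cache (a sketch only; this wasn't run in the thread and it needs a privileged context on the node) is to drop the caches and repeat the read:

sync
echo 3 > /proc/sys/vm/drop_caches   # drop page cache, dentries and inodes
time dd if=/data/zerofile of=/dev/null bs=1K count=1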

smira (Member) commented Nov 30, 2022

While it "hangs", the process is blocked in the Linux kernel:

[screenshot showing the process blocked in the kernel]
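For reference, a couple of ways to see where a task is blocked in the kernel (a sketch assuming a root shell on the node, e.g. via a privileged debug pod; the screenshot above may have been captured differently):

cat /proc/$(pidof dd)/stack    # kernel stack of the stuck dd process (assumes a single dd)
echo w > /proc/sysrq-trigger   # dump all blocked (D state) tasks to the kernel log
dmesg | tail -n 50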

smira (Member) commented Nov 30, 2022

Re-doing your reproducer with fsType: xfs in the PV "fixes" the problem:

# time dd if=/data/zerofile of=/dev/null bs=1K count=1
1+0 records in
1+0 records out
real	0m 0.00s
user	0m 0.00s
sys	0m 0.00s

So I think it's something ext4-specific coupled with the slow I/O performance of the QEMU volume (talosctl cluster create was never optimized for performance; it would probably be much better with .qcow volumes).
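To double-check which filesystem the volume actually mounted as after changing fsType (a sketch, not from the thread):

kubectl exec pod/alpine -- mount | grep /data   # shows the backing device and filesystem type
kubectl exec pod/alpine -- cat /proc/mounts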

smira (Member) commented Dec 1, 2022

Okay, I know what this is, and thanks for reporting this bug!

smira added a commit to smira/talos that referenced this issue Dec 1, 2022
Fixes siderolabs#6553

Talos itself defaults to XFS, so IMA measurements weren't done for Talos'
own filesystems. But many other solutions create ext4 filesystems by
default, or it might be something mounted by other means.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
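For anyone curious to see IMA at work on a node, the active policy and the list of measured files are exposed via securityfs (a sketch; it assumes a privileged context on the node, securityfs mounted at /sys/kernel/security, and a kernel built with CONFIG_IMA_READ_POLICY for reading the policy back):

cat /sys/kernel/security/ima/policy                        # active IMA policy rules
wc -l /sys/kernel/security/ima/ascii_runtime_measurements  # one line per file measured so far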
smira added further commits to smira/talos with the same message (including cherry-picks of commit d3cf061) that referenced this issue between Dec 1 and Dec 20, 2022.
DJAlPee pushed a commit to DJAlPee/talos that referenced this issue May 22, 2023.