sysbox-fs high CPU usage, infinite unmount calls on shared device #808

When I launch a KASM container under sysbox with a GPU shared via `--device=/dev/dri/renderD128`, the sysbox-fs logs fill up rapidly and sysbox-fs shows high CPU usage. I enabled debug logging and I see this:

```
time="2024-06-01 03:05:54" level=debug msg="Received umount syscall from pid 1098145"
time="2024-06-01 03:05:54" level=debug msg="target: /run/systemd/mount-rootfs/sys/devices/virtual, flags: 0x8, root: /, cwd: /"
time="2024-06-01 03:05:54" level=debug msg="Ignoring unmount of sysbox-fs managed submount at /run/systemd/mount-rootfs/sys/devices/virtual"
time="2024-06-01 03:05:54" level=debug msg="Received umount syscall from pid 1098145"
time="2024-06-01 03:05:54" level=debug msg="target: /run/systemd/mount-rootfs/sys, flags: 0x8, root: /, cwd: /"
time="2024-06-01 03:05:54" level=debug msg="Received umount syscall from pid 1098145"
time="2024-06-01 03:05:54" level=debug msg="target: /run/systemd/mount-rootfs/sys/devices/virtual, flags: 0x8, root: /, cwd: /"
time="2024-06-01 03:05:54" level=debug msg="Ignoring unmount of sysbox-fs managed submount at /run/systemd/mount-rootfs/sys/devices/virtual"
```

If I restart the sysbox-fs service, the issue goes away temporarily for the containers that are already deployed (although I can no longer `docker exec` into those running containers afterwards). But as soon as I deploy a new container, the issue starts again, either while sharing devices or somewhere else (?).

What could cause this infinite loop of unmount calls on `/run/systemd/mount-rootfs/sys/devices/virtual`, and why does it go away when sysbox-fs is restarted?

Log File: [sysbox-fs.log](https://github.com/user-attachments/files/15522138/sysbox-fs.log)

(After some researching...)

I can see many `"Received umount syscall from pid 1092497"` entries for different targets, and they seem to complete without problems. I searched for the first occurrence of `umount` in the log file and traced every umount call from there.

```
time="2024-06-01 02:57:40" level=debug msg="Received umount syscall from pid 1092497"
time="2024-06-01 02:57:40" level=debug msg="target: /sys/fs/cgroup/unified, flags: 0x8, root: /, cwd: /var/labsdata"
time="2024-06-01 02:57:40" level=debug msg="Received mount syscall from pid 1092497"
time="2024-06-01 02:57:40" level=debug msg="source: cgroup2, target: /sys/fs/cgroup/unified, fstype: cgroup2, flags: 0xe, data: , root: /, cwd: /var/labsdata"
time="2024-06-01 02:57:40" level=debug msg="Received umount syscall from pid 1092497"
time="2024-06-01 02:57:40" level=debug msg="target: /sys/fs/cgroup/unified, flags: 0x8, root: /, cwd: /var/labsdata"
```
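To make this tracing repeatable, here is a minimal Python sketch that reports the line number at which each umount target first appears. It assumes the debug log format shown in the excerpts here (a `target: ..., flags: ...` message following each `Received umount syscall` line); the function name `first_umount_lines` and the inline sample are only illustrative.

```python
import re

# Matches only umount detail lines; mount lines start with msg="source: ...".
UMOUNT_TARGET = re.compile(r'msg="target: (?P<target>[^,]+), flags: ')


def first_umount_lines(lines):
    """Map each umount target to the 1-based line number where it first appears."""
    first = {}
    for lineno, line in enumerate(lines, start=1):
        match = UMOUNT_TARGET.search(line)
        if match:
            # setdefault keeps the earliest line number for each target.
            first.setdefault(match.group("target"), lineno)
    return first


# Sample lines copied from the excerpt above.
sample = [
    'time="2024-06-01 02:57:40" level=debug msg="Received umount syscall from pid 1092497"',
    'time="2024-06-01 02:57:40" level=debug msg="target: /sys/fs/cgroup/unified, flags: 0x8, root: /, cwd: /var/labsdata"',
    'time="2024-06-01 02:57:40" level=debug msg="Received mount syscall from pid 1092497"',
    'time="2024-06-01 02:57:40" level=debug msg="source: cgroup2, target: /sys/fs/cgroup/unified, fstype: cgroup2, flags: 0xe, data: , root: /, cwd: /var/labsdata"',
    'time="2024-06-01 02:57:40" level=debug msg="Received umount syscall from pid 1092497"',
    'time="2024-06-01 02:57:40" level=debug msg="target: /sys/fs/cgroup/unified, flags: 0x8, root: /, cwd: /var/labsdata"',
]

for target, lineno in first_umount_lines(sample).items():
    print(f"{lineno}: {target}")
```

To run it on the real log, pass an open file object for `sysbox-fs.log` instead of `sample`.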

Line 7379 of the log file shows the first occurrence of a umount call on `/run/systemd/mount-rootfs/sys/devices/virtual` that gets ignored, and from then on it is just an infinite loop. This adds up for every container I deploy with a device, and the log file fills with these messages; I have to turn off debug logging, otherwise it consumes a lot of storage. The calls never stop, and they only happen when I pass `--device=/dev/dri/renderD128`. With the little knowledge I have, my understanding is that these infinite umount calls are related to the device I passed, which somehow causes the loop.

```
time="2024-06-01 02:58:30" level=debug msg="Received umount syscall from pid 1098145"
time="2024-06-01 02:58:30" level=debug msg="target: /run/systemd/mount-rootfs/sys/devices/virtual, flags: 0x8, root: /, cwd: /"
time="2024-06-01 02:58:30" level=debug msg="Requested ReadDirAll() on directory /sys/kernel/mm/hugepages (req ID=0x1454)"
time="2024-06-01 02:58:30" level=debug msg="Executing ReadDirAll() for req-id: 0x1454, handler: SysKernel, resource: hugepages"
time="2024-06-01 02:58:30" level=debug msg="Ignoring unmount of sysbox-fs managed submount at /run/systemd/mount-rootfs/sys/devices/virtual"
```
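The size of the loop can be quantified with a small Python sketch that counts, per target, how many umount requests were received and how many were ignored. It assumes the two message formats visible in the excerpts here; `summarize_umounts` and the inline sample are illustrative names, not part of sysbox-fs.

```python
import re
from collections import Counter

# Detail line that follows "Received umount syscall" (mount lines start with "source:").
REQUESTED = re.compile(r'msg="target: (?P<target>[^,]+), flags: ')
# Line logged when sysbox-fs ignores the unmount of one of its managed submounts.
IGNORED = re.compile(
    r'msg="Ignoring unmount of sysbox-fs managed submount at (?P<target>[^"]+)"'
)


def summarize_umounts(lines):
    """Return (requested, ignored) Counters keyed by umount target."""
    requested, ignored = Counter(), Counter()
    for line in lines:
        match = REQUESTED.search(line)
        if match:
            requested[match.group("target")] += 1
            continue
        match = IGNORED.search(line)
        if match:
            ignored[match.group("target")] += 1
    return requested, ignored


# Sample lines copied from the first excerpt in this report.
sample = [
    'time="2024-06-01 03:05:54" level=debug msg="Received umount syscall from pid 1098145"',
    'time="2024-06-01 03:05:54" level=debug msg="target: /run/systemd/mount-rootfs/sys/devices/virtual, flags: 0x8, root: /, cwd: /"',
    'time="2024-06-01 03:05:54" level=debug msg="Ignoring unmount of sysbox-fs managed submount at /run/systemd/mount-rootfs/sys/devices/virtual"',
    'time="2024-06-01 03:05:54" level=debug msg="Received umount syscall from pid 1098145"',
    'time="2024-06-01 03:05:54" level=debug msg="target: /run/systemd/mount-rootfs/sys, flags: 0x8, root: /, cwd: /"',
    'time="2024-06-01 03:05:54" level=debug msg="Received umount syscall from pid 1098145"',
    'time="2024-06-01 03:05:54" level=debug msg="target: /run/systemd/mount-rootfs/sys/devices/virtual, flags: 0x8, root: /, cwd: /"',
    'time="2024-06-01 03:05:54" level=debug msg="Ignoring unmount of sysbox-fs managed submount at /run/systemd/mount-rootfs/sys/devices/virtual"',
]

requested, ignored = summarize_umounts(sample)
for target, count in requested.most_common():
    print(f"{target}: requested={count} ignored={ignored[target]}")
```

A target whose requested and ignored counts keep growing together over time is the one stuck in the loop.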

I went through the code at https://github.com/nestybox/sysbox-fs/blob/master/nsenter/utils.go. This file could potentially enter a cleanup loop that repeatedly sends unmount calls, which are then ignored by the seccomp handler, as shown in the log. The ignore happens here: https://github.com/nestybox/sysbox-fs/blob/4c2bc153f33af1bd30a227a14ecfc8174ff280d5/seccomp/umount.go#L128


Can we skip unmounting for these devices, since the calls are sure to be ignored by seccomp, and thereby save a lot of CPU? Is my understanding of what's going on correct? If so, how can this issue be solved?
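To illustrate the question, here is a toy Python model of the hypothesis. It is not sysbox-fs or systemd code, and every name in it (`unmount_all`, `managed`, `skip_known_ignored`) is hypothetical. It assumes a caller that keeps retrying until nothing is left mounted, and an unmount of a managed submount that is ignored, so the mount never goes away.

```python
def unmount_all(mounts, managed, skip_known_ignored, max_attempts=1000):
    """Simulate a cleanup loop and return how many unmount attempts it made.

    mounts: mount points the caller wants gone.
    managed: mount points whose unmount is ignored (the mount stays in place).
    skip_known_ignored: if True, never retry a target that was ignored once.
    max_attempts: safety cap so the simulation itself always terminates.
    """
    remaining = set(mounts)
    known_ignored = set()
    attempts = 0
    while True:
        # Deepest paths first, as a recursive unmount would do.
        pending = sorted(remaining - known_ignored, key=len, reverse=True)
        if not pending:
            return attempts
        for target in pending:
            if attempts >= max_attempts:
                return attempts
            attempts += 1
            if target in managed:
                # The unmount is ignored: the mount is still there afterwards.
                if skip_known_ignored:
                    known_ignored.add(target)
            else:
                remaining.discard(target)


mounts = {"/sys", "/sys/devices/virtual"}
managed = {"/sys/devices/virtual"}

print(unmount_all(mounts, managed, skip_known_ignored=False))  # hits the cap
print(unmount_all(mounts, managed, skip_known_ignored=True))   # stops early
```

In this model, without the skip the loop only stops at the safety cap, while remembering the ignored target ends it after one pass. Whether the real caller behaves like this is exactly what I am asking.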
