Description
System information
| Type | Version/Name |
| --- | --- |
| Distribution Name | Debian |
| Distribution Version | Bookworm (12) |
| Kernel Version | 6.1.0-13-amd64 |
| Architecture | x86_64 |
| OpenZFS Version | zfs-2.1.14-1~bpo12+1 |
Describe the problem you're observing
Since upgrading to Debian 12, a small system I run as an rsync target and rsnapshot host went from a flat ~50% RAM usage (out of 2 GB total) to constantly OOM'ing, even after an upgrade to 8 GB. I initially raised this as a question in #14986, but that setup had too many variables to precisely pin down the source of the issue (various external processes writing to it, some old zpools that used to have dedup enabled, a ton of snapshots, etc.).
I've now managed to reproduce what I think is some sort of memory leak on a simple, vanilla setup, described below, using only a single data source and rsnapshot. At this point, though, I'm at a loss and can't dig any deeper - any help would be much appreciated.
Given the (relative) simplicity of the setup I reproduced this on, I'm genuinely surprised I seem to be the only one reporting it.
Describe how to reproduce the problem
1 - The source is a sample dataset on a remote host: ~100 GB, ~100k directories, ~250k files.
2 - The zpool (hosted on a 4 vCPU, 8 GB RAM VM) is relatively simple: `zpool create -m /mnt/tank tank /dev/sda`. The filesystem on top of it is created with `zfs create -o encryption=on -o keylocation=prompt -o keyformat=passphrase -o compression=on tank/secure` (note that both compression and encryption are enabled).
3 - rsnapshot is configured to back up the remote source (see step 1) to /mnt/tank/secure.
4 - The source never changes between iterations, so the rsync step of rsnapshot doesn't actually transfer anything.
5 - I run 10 iterations back to back and see the ARC fill up to its 3-4 GB target size, as one would expect. However, every iteration also leaves roughly 600 MB of "abandoned" used RAM behind (a sketch of the full sequence is shown after this list).
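For clarity, this is roughly the sequence I run. The pool/dataset commands are the ones from step 2; the rsnapshot.conf snippet and the loop are a simplified sketch, and the remote host name, backup path and retention level in it are placeholders rather than my real configuration:

```sh
# Pool and dataset exactly as in step 2 (encryption and compression on)
zpool create -m /mnt/tank tank /dev/sda
zfs create -o encryption=on -o keylocation=prompt -o keyformat=passphrase \
    -o compression=on tank/secure

# Minimal rsnapshot.conf sketch -- fields must be TAB-separated; host,
# source path and retention level are placeholders, not my real config:
#   snapshot_root   /mnt/tank/secure/
#   retain          daily   10
#   backup          backupuser@remotehost:/data/    remote/

# Ten back-to-back iterations; the source never changes, so rsync transfers nothing
for i in $(seq 1 10); do
    rsnapshot daily
    free -m        # "used" grows by roughly 600 MB per pass
done
```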
Here is what I'm left with after the tenth iteration (memory usage in MB, followed by the largest slab caches):
```
               total        used        free      shared  buff/cache   available
Mem:            7951        7140         462           0         625         810
Swap:              0           0           0
```
```
 Active / Total Objects (% used)    : 13869528 / 15195981 (91.3%)
 Active / Total Slabs (% used)      : 629764 / 629764 (100.0%)
 Active / Total Caches (% used)     : 144 / 207 (69.6%)
 Active / Total Size (% used)       : 5961935.63K / 6291859.64K (94.8%)
 Minimum / Average / Maximum Object : 0.01K / 0.41K / 16.00K

   OBJS   ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
1259468  1259468 100%    1.12K  44981       28   1439392K zfs_znode_cache
1262688  1262688 100%    0.97K  78918       16   1262688K dnode_t
1614144  1358737  84%    0.50K 100884       16    807072K kmalloc-512
  43460    43460 100%   16.00K  21730        2    695360K zio_buf_comb_16384
1387428  1351895  97%    0.38K  66068       21    528544K dmu_buf_impl_t
2478084  2024503  81%    0.19K 118004       21    472016K dentry
  39580    39580 100%    8.00K   9895        4    316640K kmalloc-8k
1259472  1259472 100%    0.24K  78717       16    314868K sa_cache
```
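For reference, the figures above are collected with stock tools right after the last iteration; the exact slabtop flags below are an assumption on my part (one-shot output, sorted by cache size), and the arcstats fields are only there to show that the ARC itself stays at its target:

```sh
# Overall memory picture, values in MB (matches the free output above)
free -m

# Largest kernel slab caches, one-shot, sorted by cache size
slabtop -o -s c | head -n 20

# Current ARC size vs. target, straight from the kstats
awk '$1 == "size" || $1 == "c" || $1 == "c_max" { print $1, $3 }' /proc/spl/kstat/zfs/arcstats
```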
The same test on the same system, with ext4 over LUKS as the destination, works as expected and results in only ~600 MB of RAM used in total.
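For completeness, the ext4-over-LUKS control target was set up roughly like this (the device and mapper names are placeholders; the rsnapshot configuration and the 10-iteration loop are otherwise identical):

```sh
# Control setup: same VM, same source, ext4 over LUKS instead of ZFS
cryptsetup luksFormat /dev/sdb
cryptsetup open /dev/sdb crypt_backup
mkfs.ext4 /dev/mapper/crypt_backup
mount /dev/mapper/crypt_backup /mnt/tank   # same mount point, so rsnapshot.conf is unchanged
```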
Include any warning/errors/backtraces from the system logs
Nothing relevant I can spot.