### Description
I have found similar issues describing unbounded growth of the ARC in ZFS 2.3.x, but none specific to reflink copies, and my knowledge of ZFS internals is essentially nil, so I could not judge whether they are caused by the same underlying issue.

Transparency note: This issue was composed interactively with Gemini 2.5 Pro. It guided me through the preliminary debugging and the collection of the information below, and compiled it into this summary.
### System information
| Type | Version/Name |
| --- | --- |
| Distribution Name | Debian |
| Distribution Version | 13 |
| Kernel Version | 6.1.0-31-amd64 |
| Architecture | amd64 |
| OpenZFS Version | zfs-2.3.2-1 (kmod: zfs-kmod-2.3.2-1) |
### Describe the problem you're observing
The ZFS ARC grows significantly beyond the configured `zfs_arc_max` limit (32 GiB) when performing a `cp -r --reflink=auto` operation. The issue has been observed specifically when copying from a directory within a ZFS snapshot (`.zfs/snapshot/photos-pre-move/Photos/2019/*` of dataset `dpool/home/schuwi`) to a particular target dataset (`dpool/home/schuwi/Photos/2019`). Both the source snapshot content and the target dataset reside on the same ZFS pool (`dpool`), which has the `block_cloning` feature enabled.

During the problematic reflink copy, the ARC size (reported by `/proc/spl/kstat/zfs/arcstats`) rapidly grows to over 50 GiB (out of 64 GiB total system RAM), far exceeding the 32 GiB `zfs_arc_max` limit. This leads to excessive memory consumption, severe system slowdown, and an imminent risk of OOM killer intervention. If the operation is not terminated manually, the OOM killer kills most running applications until it reaches the terminal running the `cp` command, which ends the copy and stops memory usage from growing further.
The `data_size` component of the ARC alone is observed to exceed `zfs_arc_max`. High values for `evict_skip` and `evict_not_enough` are also present in `arcstats` during these operations, indicating that the ARC is struggling to evict buffers.
Crucially, this issue exhibits specificity:

- The issue does not occur when performing the same copy operation with `cp -r --reflink=never`. In this case, the ARC size remains stable around the configured `zfs_arc_max`.
- The issue does not occur when using `cp -r --reflink=auto` to copy from the same snapshot source structure (e.g., `.zfs/snapshot/photos-pre-move/Photos/2018/*`) to a different target dataset (e.g., `dpool/home/schuwi/Photos/2018`). In this scenario, the ARC also respects the `zfs_arc_max` limit, though the transfer speed is very slow, indicating that an ordinary (non-cloning) copy is happening without taking advantage of the `block_cloning` feature.
This suggests an interaction specific to the source data from the `photos-pre-move/Photos/2019/` directory within the snapshot and/or the state of the `dpool/home/schuwi/Photos/2019` target dataset.
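One way to confirm whether a given copy actually goes through the block-cloning path is to compare the pool's clone accounting before and after the operation. A minimal sketch, assuming the `bclone*` pool properties that ship alongside the `block_cloning` feature are available in this release:

```bash
# Clone accounting should grow after a reflink copy that actually cloned blocks,
# and stay flat for a copy that fell back to ordinary reads and writes.
zpool get -H -o property,value bcloneused,bclonesaved,bcloneratio dpool
# ... run the cp --reflink command under test ...
zpool get -H -o property,value bcloneused,bclonesaved,bcloneratio dpool
```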
ZFS DKMS modules (`zfs-2.3.2-1`, `zfs-kmod-2.3.2-1`) are successfully built and installed for the current kernel.
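For completeness, the module build and the version actually loaded can be double-checked with standard tooling (generic commands, not something from the original debugging session):

```bash
# DKMS build state of the zfs module and the version loaded into the running kernel.
dkms status zfs
modinfo zfs | awk '/^version:/ {print $2}'
```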
### Describe how to reproduce the problem
- Configure `zfs_arc_max` to 32 GiB (`34359738368` bytes) in `/etc/modprobe.d/zfs.conf`. Reboot or ensure the setting is active.

  ```
  options zfs zfs_arc_max=34359738368
  ```

- Use a ZFS pool (e.g., `dpool`) with the `block_cloning` feature enabled.

  ```bash
  zpool get feature@block_cloning dpool # Expected: dpool feature@block_cloning active local
  ```

- Have a source dataset and take a snapshot (e.g., `dpool/home/schuwi@photos-pre-move`). The source data for the copy will be a directory within this snapshot, e.g., `/path/to/mountpoint/.zfs/snapshot/photos-pre-move/Photos/2019/*`.

- Have a target dataset where the issue is observed, e.g., `dpool/home/schuwi/Photos/2019`.

- Attempt to copy the data from the specific snapshot directory to the specific target dataset using `cp` with reflink enabled. Assuming the command is run from `/home/schuwi/` (which is on `dpool/home/schuwi`):

  ```bash
  cp -r --reflink=auto .zfs/snapshot/photos-pre-move/Photos/2019/* Photos/2019/
  ```

- Monitor ARC usage via `/proc/spl/kstat/zfs/arcstats` and system memory (`free -m`, `vmstat`); a minimal monitoring loop is sketched after this list.

- Observe the ARC `size` and `data_size` rapidly grow beyond `c_max`.
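As referenced in the monitoring step above, this is a minimal sketch of the loop I would use to watch the ARC counters alongside free memory (the one-second interval and the choice of counters are arbitrary):

```bash
# Print the ARC counters of interest plus free memory once per second.
while sleep 1; do
    awk '$1 ~ /^(size|data_size|c_max|evict_skip|evict_not_enough)$/ {printf "%s=%s ", $1, $3}' \
        /proc/spl/kstat/zfs/arcstats
    free -m | awk '/^Mem:/ {print "free_MiB=" $4}'
done
```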
Contrastive Test (does NOT show the issue):

- Copying from `.zfs/snapshot/photos-pre-move/Photos/2018/*` to a different target dataset, e.g., `Photos/2018` (on `dpool/home/schuwi/Photos/2018`), using `cp -r --reflink=auto` does NOT exhibit this ARC overgrowth.
### Include any warning/errors/backtraces from the system logs
No specific ZFS errors or backtraces appear in `dmesg` or `journalctl` during the ARC overgrowth; the only observable symptom is the general system slowdown caused by memory exhaustion.
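For anyone re-checking this, a log sweep along these lines should surface kernel-side ZFS messages and OOM kills from the current boot (the filter terms are just a suggestion):

```bash
# Kernel messages from the current boot that mention ZFS/SPL or the OOM killer.
journalctl -k -b | grep -iE 'zfs|spl|oom|out of memory'
```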
System and ZFS Information:
```
$ lsb_release -is
Debian
$ lsb_release -rs
13
$ zfs version
zfs-2.3.2-1
zfs-kmod-2.3.2-1
$ uname -r
6.1.0-31-amd64
```
Pool and Dataset Information (Before problematic copy):
```
$ zpool status dpool
  pool: dpool
 state: ONLINE
  scan: scrub repaired 0B in 05:11:37 with 0 errors on Sun Jun 1 05:11:38 2025
config:

        NAME                                                  STATE     READ WRITE CKSUM
        dpool                                                 ONLINE       0     0     0
          ata-WDC_WD20EZRZ-22Z5HB0_WD-WCC4M0USZJEK            ONLINE       0     0     0
          ata-WDC_WD20EZRZ-22Z5HB0_WD-WCC4M0YVV5JK            ONLINE       0     0     0
        special
          nvme-Samsung_SSD_980_PRO_1TB_S5GXNF0NC28024N-part5  ONLINE       0     0     0

errors: No known data errors

$ zpool get all dpool | grep block_cloning
dpool  feature@block_cloning  active  local

$ zfs list -o name,used,avail,refer,mountpoint,primarycache,secondarycache dpool/home/schuwi@photos-pre-move dpool/home/schuwi/Photos/2019
NAME                                USED  AVAIL  REFER  MOUNTPOINT                PRIMARYCACHE  SECONDARYCACHE
dpool/home/schuwi@photos-pre-move  29.1M      -  2.63T  -                         all           all
dpool/home/schuwi/Photos/2019       377G   287G   108G  /home/schuwi/Photos/2019  all           all
```
Dataset Information (After problematic copy was stopped manually to avoid process termination):
```
$ zfs list -o name,used,avail,refer,mountpoint,primarycache,secondarycache dpool/home/schuwi@photos-pre-move dpool/home/schuwi/Photos/2019
NAME                                USED  AVAIL  REFER  MOUNTPOINT                PRIMARYCACHE  SECONDARYCACHE
dpool/home/schuwi@photos-pre-move  29.1M      -  2.63T  -                         all           all
dpool/home/schuwi/Photos/2019       432G   287G   162G  /home/schuwi/Photos/2019  all           all
```
`/etc/modprobe.d/zfs.conf`:

```
options zfs zfs_arc_max=34359738368
```
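To confirm the limit is actually in effect after a reboot, the live module parameter can be read back and compared against `c_max` (standard sysfs path for ZFS module parameters; shown here as a sanity check, not as part of the original report):

```bash
# Runtime value of zfs_arc_max (0 would mean the built-in default is in use).
cat /sys/module/zfs/parameters/zfs_arc_max
# c_max as the ARC itself reports it.
awk '$1 == "c_max" {print $1, $3}' /proc/spl/kstat/zfs/arcstats
```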
Memory Usage During Issue:
```
$ free -m && vmstat 1 5
               total        used        free      shared  buff/cache   available
Mem:           64200       57016        7265          16         648        7183
Swap:              0           0           0
procs -----------memory---------- ---swap-- -----io---- -system-- -------cpu-------
 r  b   swpd    free   buff   cache   si   so     bi     bo    in     cs us sy id wa st gu
 1  0      0 7425448      0 664016    0    0  31916  14915  2098      2  0  1 97  2  0  0
 1  0      0 4673012      0 672072    0    0 263368      0 11052 116137  0  7 93  0  0  0
 1  0      0 4276772      0 674000    0    0 237644  14284  8435  34033  0  8 91  1  0  0
 1  0      0 4549092      0 673696    0    0 239680 470628 40485 194893  0 16 83  1  0  0
 0  0      0 4826904      0 673696    0    0 266776      0  3424   5038  0  1 99  0  0  0
```
`arcstats` Data:

- arcstats_before.txt: Content of `/proc/spl/kstat/zfs/arcstats` before the copy
- arcstats_during_reflink_issue.txt: Content of `/proc/spl/kstat/zfs/arcstats` when ARC `size` > `c_max`
- arcstats_after.txt: Content of `/proc/spl/kstat/zfs/arcstats` after stopping the copy and waiting 1-2 minutes
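If it helps triage, the fields most relevant to this report can be pulled from those three captures with something like the following (file names as listed above):

```bash
# Compare the key ARC counters across the three captured arcstats snapshots.
for f in arcstats_before.txt arcstats_during_reflink_issue.txt arcstats_after.txt; do
    echo "== $f =="
    awk '$1 ~ /^(size|data_size|c_max|evict_skip|evict_not_enough)$/ {print $1, $3}' "$f"
done
```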