
ZFS ARC significantly exceeds zfs_arc_max during cp --reflink from snapshot to specific dataset in ZFS 2.3.2-1 #17424


Description


I have found similar issues describing unbounded growth of the ARC in ZFS 2.3.x, but none specific to reflink copies. My knowledge of the ZFS implementation is limited, so I cannot judge whether they are caused by the same underlying issue.

Transparency note: This issue was composed interactively with Gemini 2.5 Pro. It guided the preliminary debugging and the collection of the information below, and compiled it into this summary.

System information

Type                  Version/Name
Distribution Name     Debian
Distribution Version  13
Kernel Version        6.1.0-31-amd64
Architecture          amd64
OpenZFS Version       zfs-2.3.2-1 (kmod: zfs-kmod-2.3.2-1)

Describe the problem you're observing

The ZFS ARC grows significantly beyond the configured zfs_arc_max limit (32 GiB) when performing a cp -r --reflink=auto operation. This issue has been specifically observed when copying from a directory within a ZFS snapshot (.zfs/snapshot/photos-pre-move/Photos/2019/* of dataset dpool/home/schuwi) to a particular target dataset (dpool/home/schuwi/Photos/2019). Both the source snapshot content and the target dataset reside on the same ZFS pool (dpool) which has the block_cloning feature enabled.
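
For context, cp --reflink=auto on ZFS goes through the FICLONE/FICLONERANGE ioctls, which OpenZFS services via block cloning when source and destination are on the same pool. A minimal sketch for confirming that cp actually takes the clone path (the file paths here are placeholders, not the paths from this report):

Bash

    # Trace the ioctls cp issues; a FICLONE line means the clone path was taken,
    # while no match means cp silently fell back to an ordinary copy.
    strace -f -e trace=ioctl cp --reflink=auto /dpool/src/file /dpool/dst/file 2>&1 | grep -i clone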

During the problematic reflink copy, the ARC size (reported by /proc/spl/kstat/zfs/arcstats) rapidly grows to over 50 GiB of the 64 GiB total system RAM, far exceeding the 32 GiB zfs_arc_max limit. This leads to excessive memory consumption, severe system slowdown, and imminent risk of OOM-killer intervention. If the operation is not terminated manually, the OOM killer kills most running applications until it eventually reaches the terminal running cp, which ends the copy and stops the memory growth.
The data_size component of the ARC alone exceeds zfs_arc_max. evict_skip and evict_not_enough are also high in arcstats during these operations, indicating the ARC is struggling to evict buffers.
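
A minimal loop for watching the relevant counters while reproducing (the field names below are as they appear in /proc/spl/kstat/zfs/arcstats):

Bash

    # Print the ARC size, its configured ceiling, and the eviction-pressure
    # counters once per second.
    watch -n1 "grep -E '^(size|c_max|data_size|evict_skip|evict_not_enough) ' /proc/spl/kstat/zfs/arcstats"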

Crucially, the issue is specific to this source/target combination:

  • The issue does not occur when performing the same copy with cp -r --reflink=never; in that case the ARC size remains stable around the configured zfs_arc_max.
  • The issue does not occur when using cp -r --reflink=auto to copy from the same snapshot source structure (e.g., .zfs/snapshot/photos-pre-move/Photos/2018/*) to a different target dataset (e.g., dpool/home/schuwi/Photos/2018). In this scenario the ARC also respects the zfs_arc_max limit, but the transfer is very slow, indicating an ordinary copy rather than one using the block_cloning feature.

This suggests an interaction specific to the source data from the photos-pre-move/Photos/2019/ directory within the snapshot and/or the state of the dpool/home/schuwi/Photos/2019 target dataset.
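
One way to test that hypothesis is to compare the pool-wide block-cloning counters before and after each copy; a sketch, assuming the bcloneused/bclonesaved/bcloneratio pool properties available since OpenZFS 2.2:

Bash

    # Record the cloning counters, run the cp --reflink=auto test, then compare.
    zpool get -H -o property,value bcloneused,bclonesaved,bcloneratio dpool
    # ... run the copy ...
    zpool get -H -o property,value bcloneused,bclonesaved,bcloneratio dpool
    # If bcloneused does not grow, the copy did not use block cloning.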

ZFS DKMS modules (zfs-2.3.2-1, zfs-kmod-2.3.2-1) are successfully built and installed for the current kernel.

Describe how to reproduce the problem

  1. Configure zfs_arc_max to 32 GiB (34359738368 bytes) in /etc/modprobe.d/zfs.conf. Reboot or ensure the setting is active (a verification sketch follows this list).

    options zfs zfs_arc_max=34359738368
    
    
  2. Use a ZFS pool (e.g., dpool) with the block_cloning feature enabled.

    Bash

    zpool get feature@block_cloning dpool 
    # Expected: dpool  feature@block_cloning   active   local
    
    
  3. Have a source dataset and take a snapshot of it (e.g., dpool/home/schuwi@photos-pre-move). The source data for the copy is a directory within this snapshot, e.g., /path/to/mountpoint/.zfs/snapshot/photos-pre-move/Photos/2019/*.

  4. Have a target dataset where the issue is observed, e.g., dpool/home/schuwi/Photos/2019.

  5. Attempt to copy the data from the specific snapshot directory to the specific target dataset using cp with reflink enabled. Assuming the command is run from /home/schuwi/ (which is on dpool/home/schuwi):

    Bash

    cp -r --reflink=auto .zfs/snapshot/photos-pre-move/Photos/2019/* Photos/2019/
    
    
  6. Monitor ARC usage via /proc/spl/kstat/zfs/arcstats and system memory (free -m, vmstat).

  7. Observe the ARC size and data_size rapidly grow beyond c_max.
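
Verification sketch for step 1 (referenced above; these are the standard Linux module-parameter and kstat locations):

Bash

    # The live module parameter should match the value from zfs.conf ...
    cat /sys/module/zfs/parameters/zfs_arc_max
    # ... and the ARC should report the same ceiling as c_max.
    awk '$1 == "c_max" {print $3}' /proc/spl/kstat/zfs/arcstats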

Contrastive Test (Does NOT show the issue):

  • Copying from .zfs/snapshot/photos-pre-move/Photos/2018/* to a different target dataset, e.g., Photos/2018 (on dpool/home/schuwi/Photos/2018), using cp -r --reflink=auto does NOT exhibit this ARC overgrowth.

Include any warning/errors/backtraces from the system logs

No specific ZFS errors or backtraces appear in dmesg or journalctl during the ARC overgrowth, other than general system slowdown due to memory exhaustion.
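
For completeness, one way to watch for such messages live during the copy (nothing ZFS-specific appeared in this case, only the expected OOM-killer output once memory was exhausted):

Bash

    # Follow kernel messages while reproducing.
    journalctl -k -f | grep -iE 'zfs|oom|out of memory'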

System and ZFS Information:

$ lsb_release -is
Debian
$ lsb_release -rs
13
$ zfs version
zfs-2.3.2-1
zfs-kmod-2.3.2-1
$ uname -r
6.1.0-31-amd64

Pool and Dataset Information (Before problematic copy):

$ zpool status dpool
  pool: dpool
 state: ONLINE
  scan: scrub repaired 0B in 05:11:37 with 0 errors on Sun Jun  1 05:11:38 2025
config:
    NAME                                                      STATE     READ WRITE CKSUM
    dpool                                                     ONLINE       0     0     0
      ata-WDC_WD20EZRZ-22Z5HB0_WD-WCC4M0USZJEK                ONLINE       0     0     0
      ata-WDC_WD20EZRZ-22Z5HB0_WD-WCC4M0YVV5JK                ONLINE       0     0     0
    special   
      nvme-Samsung_SSD_980_PRO_1TB_S5GXNF0NC28024N-part5       ONLINE       0     0     0
errors: No known data errors

$ zpool get all dpool | grep block_cloning
dpool  feature@block_cloning         active                          local

$ zfs list -o name,used,avail,refer,mountpoint,primarycache,secondarycache dpool/home/schuwi@photos-pre-move dpool/home/schuwi/Photos/2019
NAME                               USED  AVAIL  REFER  MOUNTPOINT                 PRIMARYCACHE  SECONDARYCACHE
dpool/home/schuwi@photos-pre-move  29.1M      -  2.63T  -                          all           all
dpool/home/schuwi/Photos/2019       377G   287G   108G  /home/schuwi/Photos/2019  all           all

Dataset Information (After problematic copy was stopped manually to avoid process termination):

$ zfs list -o name,used,avail,refer,mountpoint,primarycache,secondarycache dpool/home/schuwi@photos-pre-move dpool/home/schuwi/Photos/2019
NAME                               USED  AVAIL  REFER  MOUNTPOINT                 PRIMARYCACHE  SECONDARYCACHE
dpool/home/schuwi@photos-pre-move  29.1M      -  2.63T  -                          all           all
dpool/home/schuwi/Photos/2019       432G   287G   162G  /home/schuwi/Photos/2019  all           all 

/etc/modprobe.d/zfs.conf:

options zfs zfs_arc_max=34359738368

Memory Usage During Issue:

$ free -m && vmstat 1 5
              total        used        free      shared  buff/cache   available
Mem:          64200       57016        7265          16         648        7183
Swap:             0           0           0
procs -----------memory---------- ---swap-- -----io---- -system-- -------cpu-------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st gu
 1  0      0 7425448      0 664016    0    0 31916 14915 2098    2  0  1 97  2  0  0
 1  0      0 4673012      0 672072    0    0 263368     0 11052 116137  0  7 93  0  0  0
 1  0      0 4276772      0 674000    0    0 237644 14284 8435 34033  0  8 91  1  0  0
 1  0      0 4549092      0 673696    0    0 239680 470628 40485 194893  0 16 83  1  0  0
 0  0      0 4826904      0 673696    0    0 266776     0 3424 5038  0  1 99  0  0  0

arcstats Data:
