FUSE: O(1) file copy via copy_file_range on /mfs (and /ipns) #11315

Description

@lidel

Problem

Today, cp /mfs/big-file /mfs/copy-of-big takes the fallback path: every block is read out of FUSE and then written back into FUSE. For a 4 GiB file this takes minutes, even though kubo already has every block in the blockstore.

The kernel offers a server-side fast path via the copy_file_range(2) syscall (the syscall exists since Linux 4.5; FUSE support arrived in Linux 4.20 with FUSE protocol v7.28). When the FUSE server implements it, the kernel asks the server to perform the copy directly, avoiding the per-block read and write round-trips. Tools that use copy_file_range today: cp (coreutils 9+), rsync --inplace, parts of git, several editor "atomic save" implementations.

go-fuse exposes the hook via fs.NodeCopyFileRanger (added long ago; COPY_FILE_RANGE64 opcode landed in v2.10.1). kubo currently implements no NodeCopyFileRanger on any of its FUSE node types, so the bridge returns ENOSYS and the kernel falls back.

Proposal

Implement NodeCopyFileRanger on *writable.FileInode (covers /mfs and /ipns writable mounts).

For full-file copy (offIn==0 && offOut==0 && len >= sourceSize), short-circuit:

  1. Look up the source MFS node, get its current root CID.
  2. Create or replace the destination MFS entry pointing to the same CID.
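  3. Return sourceSize as bytes copied (len may exceed the source size, and the return value must be the number of bytes actually copied).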

This is O(1) regardless of file size: no block re-fetching, no chunker, no unixfs marshalling. Both files share the same DAG, which is already the natural state in IPFS: content-addressed deduplication comes for free.
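The short-circuit can be pictured with a toy model. The sketch below is not kubo's MFS API: toyMFS, linkCopy and the placeholder CID string are all hypothetical, and a plain map stands in for an MFS directory. It only shows why the cost is independent of file size: the copy is one lookup and one entry write.

```go
package main

import (
	"errors"
	"fmt"
)

// toyMFS is a hypothetical stand-in for an MFS directory: it maps entry
// names to root CIDs (kept as plain strings here). Not kubo's real API.
type toyMFS struct {
	entries map[string]string
}

var errNotFound = errors.New("source entry not found")

// linkCopy makes dst point at the same root CID as src.
// It does one lookup and one map write, whatever the file size.
func (m *toyMFS) linkCopy(src, dst string) (string, error) {
	cid, ok := m.entries[src]
	if !ok {
		return "", errNotFound
	}
	m.entries[dst] = cid // create or replace the destination entry
	return cid, nil
}

func main() {
	// "bafy-example-root" is a placeholder, not a valid CID.
	m := &toyMFS{entries: map[string]string{"big-file": "bafy-example-root"}}

	cid, err := m.linkCopy("big-file", "copy-of-big")
	if err != nil {
		panic(err)
	}
	fmt.Println("shared root:", cid)
	fmt.Println("same CID:", m.entries["big-file"] == m.entries["copy-of-big"])
}
```

In the real implementation the lookup and the entry write would go through MFS, which also has to handle flushing and open file handles; none of that is modelled here.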

For partial copies (offIn != 0 || offOut != 0 || len < sourceSize), return ENOSYS and let the kernel fall back to userspace read+write.

For cross-mount copies (source and destination on different FUSE roots), the kernel already routes the copy as separate read+write calls, so the handler should not see them. As a defensive measure, still check that source and destination share an MFS root and return ENOSYS otherwise.

Why now

  • The v0.41 FUSE rewrite onto hanwen/go-fuse v2 makes the implementation straightforward; the v1 bazil.org/fuse path didn't have the right hook.
  • go-fuse v2.10.1 (the version kubo will ship next) added the COPY_FILE_RANGE64 opcode wiring, surfacing it cleanly.

Out of scope

  • /ipfs (read-only): reading from it as a copy_file_range source keeps working as it does today, and the kernel never asks a read-only mount to be the write side, so no change is needed.
  • Reflink semantics on non-MFS filesystems (we're not implementing ioctl(FICLONE) on the underlying disk).
  • File deduplication beyond what content addressing already gives us.

Acceptance

  • time cp /mfs/4G-iso /mfs/4G-copy returns in well under a second (today: minutes).
  • The destination's CID matches the source's CID after copy.
  • cp from /mfs to a non-FUSE filesystem (e.g. /tmp) still works (kernel fallback path).
  • Sharness or test/cli coverage exercising the fast path.

    Labels

    P2 (Medium: Good to have, but can wait until someone steps up), effort/days (Estimated to take multiple days, but less than a week), exp/expert (Having worked on the specific codebase is important), kind/enhancement (A net-new feature or improvement to an existing feature), topic/fuse (Topic fuse)
