Symlinks are discarded when creating CPIO archives #143

@paulnoalhyt

Description

When creating CPIO archives, symlinks are discarded.

How to reproduce?

I created this simple test:

import libarchive
from ctypes import c_void_p

def print_archive_info(test_name, archive_bytes):
    print(test_name)
    with libarchive.memory_reader(archive_bytes) as archive:
        for entry in archive:
            print(f"    entry.path: {entry.path}")
            print(f"    entry.size: {entry.size}")
            print(f"    entry.filetype: 0o{entry.filetype:o}")
            print(f"    entry.linkpath: {entry.linkpath}")

packed_chunks = []

def write_func(data):
    packed_chunks.append(bytes(data))
    return len(data)

path = "sbin"
linkpath = "usr/sbin"
entry_size = len(linkpath)

with libarchive.custom_writer(write_func, "cpio_newc") as archive:
    archive.add_file_from_memory(
        entry_path=path,
        entry_size=len(linkpath),
        entry_data=b"",
        filetype=libarchive.entry.FileType.SYMBOLINK_LINK,
        linkpath=linkpath.encode("utf-8"),
    )

packed_bytes = b"".join(packed_chunks)
print_archive_info("libarchive standard method:", packed_bytes)

# fix that works:
packed_chunks = []

with libarchive.custom_writer(write_func, "cpio_newc") as archive:
    new_entry = libarchive.entry.ArchiveEntry(
        pathname=path,
        size=len(linkpath),
        filetype=libarchive.entry.FileType.SYMBOLINK_LINK,
        linkpath=linkpath.encode("utf-8"),
    )
    libarchive.ffi.ffi('entry_set_symlink', [c_void_p], None)
    libarchive.ffi.entry_set_symlink(new_entry._entry_p, linkpath.encode("utf-8"))
    libarchive.ffi.write_header(archive._pointer, new_entry._entry_p)
    libarchive.ffi.write_finish_entry(archive._pointer)

packed_bytes = b"".join(packed_chunks)
print_archive_info("libarchive fix method:", packed_bytes)

Which outputs the following:

$ python3 libarchive_symlink.py
libarchive standard method:
    entry.path: sbin
    entry.size: 0
    entry.filetype: 0o120000
    entry.linkpath:
libarchive fix method:
    entry.path: sbin
    entry.size: 8
    entry.filetype: 0o120000
    entry.linkpath: usr/sbin

With add_file_from_memory and the linkpath argument, the resulting symlink has size 0 and an empty link target, so it doesn't actually point to anything.
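To check the raw bytes independently of libarchive's reader, the archive can be decoded by hand. In the cpio newc format the header is 110 ASCII bytes (a 6-byte magic followed by 13 hexadecimal fields of 8 digits each), and a symlink's target is stored as the entry body, so a filesize of 0 means no target was written. The following is a minimal sketch under those assumptions; the helper names (read_first_newc_entry, symlink_target) are hypothetical and not part of libarchive-c, and only the first entry is read.

```python
import stat

NEWC_MAGIC = b"070701"
NEWC_FIELDS = (
    "ino", "mode", "uid", "gid", "nlink", "mtime", "filesize",
    "devmajor", "devminor", "rdevmajor", "rdevminor", "namesize", "check",
)
NEWC_HEADER_LEN = 6 + 8 * len(NEWC_FIELDS)  # 110 bytes


def _pad4(offset):
    """Round offset up to the next multiple of 4 (newc alignment)."""
    return (offset + 3) & ~3


def read_first_newc_entry(archive_bytes):
    """Return (name, mode, body) for the first entry of a newc archive."""
    if archive_bytes[:6] != NEWC_MAGIC:
        raise ValueError("not a cpio newc archive")
    # Each field is 8 ASCII hex digits, laid out right after the magic.
    fields = {
        name: int(archive_bytes[6 + 8 * i:14 + 8 * i], 16)
        for i, name in enumerate(NEWC_FIELDS)
    }
    name_end = NEWC_HEADER_LEN + fields["namesize"]
    # namesize counts the trailing NUL, which is dropped here.
    name = archive_bytes[NEWC_HEADER_LEN:name_end - 1].decode("utf-8")
    body_start = _pad4(name_end)
    body = archive_bytes[body_start:body_start + fields["filesize"]]
    return name, fields["mode"], body


def symlink_target(archive_bytes):
    """Return the stored link target, or None if the entry is not a symlink."""
    _, mode, body = read_first_newc_entry(archive_bytes)
    return body.decode("utf-8") if stat.S_ISLNK(mode) else None
```

Calling symlink_target(packed_bytes) on each of the two archives built by the test script should mirror the entry.linkpath values shown above: an empty string for the standard method and the real target for the fix method.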

Investigations

I spent some time investigating this. Here is what happens when using add_file_from_memory:

  • in the @linkpath.setter, ffi.entry_update_link_utf8(self._entry_p, value) is called
    • in libarchive, this corresponds to archive_entry_update_link_utf8. As some comments hint: "Set symlink if symlink is already set, else set hardlink". No symlink target has been set yet at this point, so the call stores the link path as a hardlink.
  • when the entry header is written, the CPIO writer runs into this in write_header:
	/* Non-regular files don't store bodies. */
	if (archive_entry_filetype(entry) != AE_IFREG)
		archive_entry_set_size(entry, 0);
  • then, only if the entry is a symlink, the writer takes the length of the link target as the size and writes the linkpath
  • as mentioned above, our entry is a hardlink at this point, so its size is set to 0 and no linkpath is written
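The sequence above can be condensed into a small pure-Python model. This is a simplified sketch of the behaviour described in this issue, not libarchive's actual code: ToyEntry, update_link and cpio_write_header are hypothetical names, and "is a symlink" is modelled as "has a symlink target set".

```python
from dataclasses import dataclass
from typing import Optional

AE_IFREG = 0o100000
AE_IFLNK = 0o120000


@dataclass
class ToyEntry:
    """Stand-in for an archive entry with separate link target slots."""
    filetype: int
    size: int = 0
    symlink: Optional[str] = None
    hardlink: Optional[str] = None


def update_link(entry, target):
    """Model of the quoted rule: update the symlink only if one is set."""
    if entry.symlink is not None:
        entry.symlink = target
    else:
        entry.hardlink = target


def cpio_write_header(entry):
    """Model of the writer steps; returns (size, link target) as written."""
    # Non-regular files don't store bodies.
    if entry.filetype != AE_IFREG:
        entry.size = 0
    # Only a symlink target contributes a size and a stored linkpath.
    if entry.filetype == AE_IFLNK and entry.symlink is not None:
        entry.size = len(entry.symlink)
        return entry.size, entry.symlink
    return entry.size, None
```

In this model, an entry created with filetype AE_IFLNK and then passed through update_link ends up with only its hardlink slot filled, so cpio_write_header reports size 0 and no target. Setting the symlink slot directly before writing the header yields size 8 and the target "usr/sbin".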

Fix

As you can see in the test script above, I found a fix for this issue: if we call ffi.entry_set_symlink before writing the header, the entry is converted from a hardlink to a symlink, and the CPIO writer writes the expected size and linkpath.

entry_set_symlink has to be imported from the C libarchive library. entry_set_link_to_symlink is also available in the latest libarchive and would avoid having to copy the linkpath again, but older libarchive binaries don't have this function (I was testing on macOS 15.7.1 and the system libarchive didn't have it), which is why I used entry_set_symlink.
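One way to handle both old and new libarchive binaries would be to probe for the newer symbol at load time and fall back otherwise. A minimal sketch, assuming the C symbols carry the archive_ prefix (archive_entry_set_link_to_symlink and archive_entry_set_symlink); pick_symlink_setter is a hypothetical helper, not something libarchive-c provides. A library handle loaded with ctypes.CDLL raises AttributeError for missing symbols, so hasattr works as the probe.

```python
from types import SimpleNamespace

PREFERRED = "archive_entry_set_link_to_symlink"  # newer libarchive only
FALLBACK = "archive_entry_set_symlink"           # needs the target passed again


def pick_symlink_setter(lib):
    """Return the name of the best symlink setter that `lib` exports."""
    for name in (PREFERRED, FALLBACK):
        if hasattr(lib, name):
            return name
    raise AttributeError("no symlink setter found in this libarchive build")


if __name__ == "__main__":
    # Stand-in for an older library that lacks the newer symbol.
    old_lib = SimpleNamespace(archive_entry_set_symlink=lambda *args: None)
    print(pick_symlink_setter(old_lib))
```

With a real library, lib would be the handle returned by ctypes.CDLL for the libarchive shared object. The caller still has to pass the link target again when only the fallback is available.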

I am not sure whether this issue also affects format writers other than CPIO. If it does, I think the call to entry_set_symlink could happen at the end of @linkpath.setter. If the issue is specific to the CPIO writers, then a more targeted fix should be developed.
