Description
When creating CPIO archives, symlinks are discarded.
How to reproduce?
I created this simple test:
import libarchive
from ctypes import c_char_p, c_void_p


def print_archive_info(test_name, archive_bytes):
    print(test_name)
    with libarchive.memory_reader(archive_bytes) as archive:
        for entry in archive:
            print(f" entry.path: {entry.path}")
            print(f" entry.size: {entry.size}")
            print(f" entry.filetype: 0o{entry.filetype:o}")
            print(f" entry.linkpath: {entry.linkpath}")


packed_chunks = []


def write_func(data):
    # Collect the bytes libarchive emits so the archive stays in memory.
    packed_chunks.append(bytes(data))
    return len(data)


path = "sbin"
linkpath = "usr/sbin"
entry_size = len(linkpath)

# standard method: pass linkpath to add_file_from_memory
with libarchive.custom_writer(write_func, "cpio_newc") as archive:
    archive.add_file_from_memory(
        entry_path=path,
        entry_size=len(linkpath),
        entry_data=b"",
        filetype=libarchive.entry.FileType.SYMBOLINK_LINK,
        linkpath=linkpath.encode("utf-8"),
    )
packed_bytes = b"".join(packed_chunks)
print_archive_info("libarchive standard method:", packed_bytes)

# fix that works:
packed_chunks = []
with libarchive.custom_writer(write_func, "cpio_newc") as archive:
    new_entry = libarchive.entry.ArchiveEntry(
        pathname=path,
        size=len(linkpath),
        filetype=libarchive.entry.FileType.SYMBOLINK_LINK,
        linkpath=linkpath.encode("utf-8"),
    )
    # Bind archive_entry_set_symlink(entry, linkname) and call it explicitly.
    libarchive.ffi.ffi('entry_set_symlink', [c_void_p, c_char_p], None)
    libarchive.ffi.entry_set_symlink(new_entry._entry_p, linkpath.encode("utf-8"))
    libarchive.ffi.write_header(archive._pointer, new_entry._entry_p)
    libarchive.ffi.write_finish_entry(archive._pointer)
packed_bytes = b"".join(packed_chunks)
print_archive_info("libarchive fix method:", packed_bytes)
Which outputs the following:
$ python3 libarchive_symlink.py
libarchive standard method:
entry.path: sbin
entry.size: 0
entry.filetype: 0o120000
entry.linkpath:
libarchive fix method:
entry.path: sbin
entry.size: 8
entry.filetype: 0o120000
entry.linkpath: usr/sbin
You can see that when add_file_from_memory is used with the linkpath argument, the resulting symlink has a size of 0 and doesn't actually point to anything.
Investigations
I spent some time investigating this. Here is what happens when using add_file_from_memory:
- In the @linkpath.setter, ffi.entry_update_link_utf8(self._entry_p, value) is called.
  - In libarchive, this corresponds to archive_entry_update_link_utf8. As some other comments hint: "Set symlink if symlink is already set, else set hardlink". So at this point it sets the file to a hardlink.
- When the entry header is created, the CPIO writer runs into this in write_header:
      /* Non-regular files don't store bodies. */
      if (archive_entry_filetype(entry) != AE_IFREG)
          archive_entry_set_size(entry, 0);
- Then, only if the file is a symlink, it will get its size and write the linkpath, etc.
- As we mentioned, our file is a hardlink at this point, so its size gets set to 0, and no linkpath is inserted.
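The steps above can be sketched as a small pure-Python model. It is only an illustration of the behaviour described here, not libarchive's actual code: ModelEntry, update_link, set_symlink and cpio_header_fields are hypothetical names, and the assumption that setting a symlink clears the hardlink target is a simplification.

```python
from dataclasses import dataclass
from typing import Optional

AE_IFREG = 0o100000  # regular file
AE_IFLNK = 0o120000  # symbolic link


@dataclass
class ModelEntry:
    """Hypothetical stand-in for an archive entry (not a libarchive type)."""
    filetype: int
    size: int = 0
    symlink: Optional[str] = None
    hardlink: Optional[str] = None


def update_link(entry, target):
    """Model of the quoted rule: set symlink if already set, else set hardlink."""
    if entry.symlink is not None:
        entry.symlink = target
    else:
        entry.hardlink = target


def set_symlink(entry, target):
    """Model of an explicit symlink setter (assumed to replace any hardlink)."""
    entry.symlink = target
    entry.hardlink = None


def cpio_header_fields(entry):
    """Model of the writer's choice of (size, linkpath) for the header."""
    size = entry.size
    if entry.filetype != AE_IFREG:
        size = 0  # non-regular files don't store bodies
    linkpath = ""
    if entry.symlink:
        # only a symlink target restores the size and gets written out
        size = len(entry.symlink.encode("utf-8"))
        linkpath = entry.symlink
    return size, linkpath


broken = ModelEntry(filetype=AE_IFLNK, size=8)
update_link(broken, "usr/sbin")  # ends up stored as a hardlink target

fixed = ModelEntry(filetype=AE_IFLNK, size=8)
update_link(fixed, "usr/sbin")
set_symlink(fixed, "usr/sbin")  # the workaround from the test script

print(cpio_header_fields(broken))
print(cpio_header_fields(fixed))
```

In this model the first entry yields a size of 0 with an empty linkpath and the second yields 8 with usr/sbin, which mirrors the two outputs of the test script.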
Fix
As you can see in the test script above, I found a fix for this issue: if we call ffi.entry_set_symlink before writing the entry header, the entry is converted from a hardlink to a symlink, and the CPIO writer writes the right size and linkpath as you would expect.
entry_set_symlink needs to be imported from the C libarchive library. entry_set_link_to_symlink is also available in the latest libarchive and would avoid having to copy the linkpath again, but older libarchive binaries won't have this function (I was testing on macOS 15.7.1, and the system libarchive didn't have it), which is why I used entry_set_symlink.
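One way to cope with both old and new libarchive binaries would be to probe for the newer function and fall back to the older one. The sketch below is only an assumption about how that could look: first_available is a hypothetical helper, the C symbol names are the two functions above with the archive_ prefix, and a SimpleNamespace stands in for a loaded library so the example runs without libarchive installed.

```python
from types import SimpleNamespace


def first_available(lib, names):
    """Return (name, function) for the first symbol that `lib` exports.

    ctypes library objects raise AttributeError for missing symbols, so
    getattr() with a default is enough to probe for them.
    """
    for name in names:
        func = getattr(lib, name, None)
        if func is not None:
            return name, func
    raise AttributeError(f"none of {names} are exported by the library")


# Stand-in for an older libarchive that lacks the newer function. With a
# real library this would be something like
# ctypes.CDLL(ctypes.util.find_library("archive")).
old_lib = SimpleNamespace(archive_entry_set_symlink=lambda entry, target: None)

name, func = first_available(
    old_lib,
    ["archive_entry_set_link_to_symlink", "archive_entry_set_symlink"],
)
print(name)  # prints "archive_entry_set_symlink"
```

The two functions take different arguments (only entry_set_symlink needs the linkpath passed again), so the caller would still have to branch on the returned name.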
I am not sure whether this issue also affects format writers other than CPIO. If so, I think the call to entry_set_symlink could happen at the end of the @linkpath.setter. If this issue is specific to the CPIO writers, then a more targeted fix should be developed.
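To find out which writers are affected, the test above could be repeated per format. The following is a rough, untested sketch: symlink_roundtrip reuses the same calls as the test script, while survey and the idea of passing other format names (for example "ustar" or "pax") are assumptions, not something I have verified.

```python
def symlink_roundtrip(format_name, path="sbin", target="usr/sbin"):
    """Write one symlink entry in `format_name` and return the linkpath read back."""
    import libarchive  # imported lazily so survey() can be tried without it

    chunks = []

    def write_func(data):
        chunks.append(bytes(data))
        return len(data)

    with libarchive.custom_writer(write_func, format_name) as archive:
        archive.add_file_from_memory(
            entry_path=path,
            entry_size=len(target),
            entry_data=b"",
            filetype=libarchive.entry.FileType.SYMBOLINK_LINK,
            linkpath=target.encode("utf-8"),
        )
    with libarchive.memory_reader(b"".join(chunks)) as archive:
        for entry in archive:
            return entry.linkpath
    return None


def survey(formats, roundtrip=symlink_roundtrip, target="usr/sbin"):
    """Map each format name to True if the symlink target survived the round trip."""
    results = {}
    for name in formats:
        try:
            results[name] = roundtrip(name) == target
        except Exception as exc:  # a writer may reject the entry outright
            results[name] = f"error: {exc}"
    return results
```

Calling survey(["cpio_newc", "ustar", "pax"]) would then show, per format, whether the linkpath is preserved. Based on the output above, cpio_newc should come back as False.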