Broken Hash Calculation for Embedded Container Images #260

@streaky

Description

When embedding container images, the process fails with errors during hash generation for our custom container images. The error output includes messages like:

sha256sum: images/hook-embedded/images/overlay2/.../diff/usr/local/lib/python3.10/site-packages/setuptools/command/launcher: No such file or directory
sha256sum: manifest.xml: No such file or directory
sha256sum: images/hook-embedded/images/overlay2/.../diff/usr/local/lib/python3.10/site-packages/setuptools/script: No such file or directory
sha256sum: '(dev).tmpl': No such file or directory

The script computes a hash for the container image by listing all files, hashing each one, and then hashing the resulting list of checksums. The current implementation uses this pipeline:

find "${container_base_dir}/${container_dir}" -type f -print | LC_ALL=C sort | xargs sha256sum | sha256sum | cut -d' ' -f1

While this approach works fine for many standard images, it fails with all of our custom container images. The errors show that valid file paths are being split into pieces before they reach sha256sum.

Explanation:

find -print emits one path per line, but xargs by default splits its input on blanks as well as newlines (and also treats quotes and backslashes specially). A path containing a space is therefore split into several tokens, so sha256sum is called with file names that do not exist, and the build fails.

For example, the file name images/hook-embedded/images/overlay2/.../diff/usr/local/lib/python3.10/site-packages/setuptools/command/launcher manifest.xml is incorrectly passed to sha256sum as two separate files: images/hook-embedded/images/overlay2/.../diff/usr/local/lib/python3.10/site-packages/setuptools/command/launcher and manifest.xml.
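
The splitting can be reproduced outside the build with a small sketch. It assumes a POSIX shell with find, sort, and xargs versions that support -print0, -z, and -0 (GNU findutils and coreutils do). The scratch directory and the demo_dir, split_count, and intact_count names are hypothetical; only the two file names are taken from the error output above.

```shell
# Hypothetical scratch directory; the file names mirror those in the error output.
demo_dir="$(mktemp -d)"
touch "${demo_dir}/launcher manifest.xml" "${demo_dir}/script (dev).tmpl"

# Default xargs: each path is split at the space, so printf receives
# more arguments than there are files.
split_count="$(find "${demo_dir}" -type f -print | LC_ALL=C sort |
  xargs printf '%s\n' | wc -l | tr -d '[:space:]')"

# NUL-delimited: each path reaches printf as exactly one argument.
intact_count="$(find "${demo_dir}" -type f -print0 | LC_ALL=C sort -z |
  xargs -0 printf '%s\n' | wc -l | tr -d '[:space:]')"

echo "default xargs: ${split_count} arguments for 2 files"
echo "xargs -0:      ${intact_count} arguments for 2 files"

rm -rf "${demo_dir}"
```

With these two file names, the default pipeline should report four arguments and the NUL-delimited one two, provided the temporary directory path itself contains no whitespace.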

Proposed Fix:

To resolve the issue, the file enumeration should use a null-terminated approach that safely preserves file names regardless of any whitespace or special characters. This can be achieved by modifying the pipeline as follows:

container_files_hash="$(
  find "${container_base_dir}/${container_dir}" -type f -print0 | \
  LC_ALL=C sort -z | \
  xargs -0 sha256sum | \
  sha256sum | \
  cut -d' ' -f1
)"

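-print0 and -0 options: find terminates each path with a NUL byte and xargs splits only on NUL, so every file name is passed to sha256sum as a single, complete argument.
LC_ALL=C sort -z: This sorts the NUL-terminated records in bytewise order, which keeps the ordering consistent across locales while preserving the null termination.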
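
A quick way to check the proposed pipeline is to run it against a directory that contains the problematic names. This is only a sketch, assuming GNU coreutils (sha256sum) and findutils are available; hash_tree, demo_dir, and the hash variables are hypothetical names, not part of the build script.

```shell
# hash_tree is a hypothetical wrapper around the proposed pipeline.
hash_tree() {
  find "$1" -type f -print0 |
    LC_ALL=C sort -z |
    xargs -0 sha256sum |
    sha256sum |
    cut -d' ' -f1
}

# Hypothetical scratch directory with file names that contain spaces.
demo_dir="$(mktemp -d)"
printf 'a\n' > "${demo_dir}/launcher manifest.xml"
printf 'b\n' > "${demo_dir}/script (dev).tmpl"

# Two runs over unchanged content should produce the same digest.
first_hash="$(hash_tree "${demo_dir}")"
second_hash="$(hash_tree "${demo_dir}")"

# Changing a file's content should change the digest.
printf 'c\n' > "${demo_dir}/script (dev).tmpl"
changed_hash="$(hash_tree "${demo_dir}")"

echo "stable:  ${first_hash}"
echo "changed: ${changed_hash}"

rm -rf "${demo_dir}"
```

Because sha256sum prints each path next to its checksum, the final digest depends on the paths as passed to find as well as on the file contents, which is also true of the original pipeline.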
