Skip to content

image/docker/internal/tarfile: Fix size reporting for symlink layers#605

Open
1seal wants to merge 1 commit intocontainers:mainfrom
1seal:fix-tarfile-symlink-layer-size
Open

image/docker/internal/tarfile: Fix size reporting for symlink layers#605
1seal wants to merge 1 commit intocontainers:mainfrom
1seal:fix-tarfile-symlink-layer-size

Conversation

@1seal
Copy link

@1seal 1seal commented Jan 26, 2026

This fixes a mismatch between the size returned by docker/internal/tarfile.Source.GetBlob and the bytes returned when reading docker-archive layers that are stored as symlink entries
(e.g. produced by docker save for duplicate layers).

Previously, prepareLayerData recorded the symlink header size (often 0), while GetBlob followed the symlink and returned the target file bytes.

Changes:

  • Resolve a single symlink when determining a layer's size (matching the Reader behavior used by GetBlob).
  • Add a regression test ensuring "reported size == bytes returned" for a symlink-referenced layer.

Tests:

  • go test ./...
  • go vet ./...

@packit-as-a-service
Copy link

Packit jobs failed. @containers/packit-build please check.

2 similar comments
@packit-as-a-service
Copy link

Packit jobs failed. @containers/packit-build please check.

@packit-as-a-service
Copy link

Packit jobs failed. @containers/packit-build please check.

Copy link
Contributor

@mtrmac mtrmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Note 785583c#sign-your-prs:

Use a real name (sorry, no anonymous contributions).

there’s nuance to that, please read on there.

layerPath := path.Clean(h.Name)
// FIXME: Cache this data across images in Reader.
if li, ok := unknownLayerSizes[layerPath]; ok {
underlyingReader := io.Reader(t)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think moving the contents of this if into a separate function would be easier.

Notably we can get rid of underlyingCloser and just use defer, much safer against the risk of forgetting to do that — and we would not have to stop using defer uncompressedStream.Close.

Even underlyingSize could be eliminated, in favor of h = resolvedHeader.

// It is safe to call this method from multiple goroutines simultaneously.
// The caller should call .Close() on the returned stream.
func (r *Reader) openTarComponent(componentPath string) (io.ReadCloser, error) {
func (r *Reader) openTarComponentWithHeader(componentPath string) (io.ReadCloser, *tar.Header, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

openTarComponent has 2 callers; I think just having one function, adding a return value, and ignoring it in the 2 callers would be fine enough. It’s not worth having 2 version of openTarComponentWithHeader (and documenting the differences).


var tarfileBuffer bytes.Buffer
tw := tar.NewWriter(&tarfileBuffer)
require.NoError(t, tw.WriteHeader(&tar.Header{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please keep the code that actually does something as separate statements from require, throughout.

Comment on lines 89 to 111
require.NoError(t, tw.WriteHeader(&tar.Header{
Name: "manifest.json",
Mode: 0o644,
Size: int64(len(manifestBytes)),
}))
_, err = tw.Write(manifestBytes)
require.NoError(t, err)

require.NoError(t, tw.WriteHeader(&tar.Header{
Name: "config.json",
Mode: 0o644,
Size: int64(len(configBytes)),
}))
_, err = tw.Write(configBytes)
require.NoError(t, err)

require.NoError(t, tw.WriteHeader(&tar.Header{
Name: "layer.tar",
Mode: 0o644,
Size: int64(len(layerBytes)),
}))
_, err = tw.Write(layerBytes)
require.NoError(t, err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These can be a loop over an array instead of repetitive code.

reader, err := NewReaderFromStream(nil, &tarfileBuffer)
require.NoError(t, err)
src := NewSource(reader, true, "transport name", nil, -1)
t.Cleanup(func() { _ = src.Close() })
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not defer?

When a layer path in a docker-archive points at a symlink header, prepareLayerData recorded the symlink header size (often 0) while GetBlob followed the symlink and returned the target file bytes.

Resolve a single symlink when determining the layer size so the reported size matches the bytes returned by GetBlob, and add a regression test.

Signed-off-by: Oleh Konko <security@1seal.org>
@1seal 1seal force-pushed the fix-tarfile-symlink-layer-size branch from 785583c to 048739e Compare January 26, 2026 19:13
@1seal
Copy link
Author

1seal commented Jan 26, 2026

thanks for the detailed review

  • dco / real name: fixed. i respun the commit so the author and Signed-off-by use my real name: Oleh Konko security@1seal.org

  • image/docker/internal/tarfile/src.go: refactored the size-determination logic into a helper (uncompressedLayerSize(...)). this keeps per-layer defer safe (no accumulation across the scan loop), allows using defer uncompressedStream.Close() again, and avoids manual close bookkeeping. in the symlink case the helper resolves the target and effectively uses the resolved header (so the non-compressed case uses the correct .Size).

  • image/docker/internal/tarfile/reader.go: simplified per your suggestion - there is now a single openTarComponent that also returns the (resolved) header, and callers ignore it when not needed.

  • image/docker/internal/tarfile/src_test.go: cleaned up style:

  • separated side-effecting statements from require.NoError

  • reduced repetition via a small loop over entries

  • switched to defer src.Close().

tests: go test ./... ; go vet ./...

@TomSweeneyRedHat
Copy link
Member

LGTM
@nalind and/or @mtrmac PTAL

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

image Related to "image" package

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants