Crash resiliency: deleting incomplete layers doesn’t reliably happen #1136

@mtrmac

Consider this sequence of events, with the overlay graph driver.

  • The user initiates pull of an image which contains 2 layers, parentLayer and childLayer
  • While creating parentLayer, the WIP layer object is recorded in layers.json with incompleteFlag.
  • Afterwards, during ApplyDiff, the pull process is forcibly killed (so that it can’t do its own cleanup).
  • Result: layers.json contains a record of the layer, with incompleteFlag; the overlay graph driver contains an incomplete/inconsistent layer, but a $parentLayer/link file and an l/$link symbolic link exist. This is all as expected.

  • The user initiates a pull of the same image again.
  • Just like the first time, the pull first checks for pre-existing layers in storage, via Store.Layer(parentLayer). This locks the layerStore read-only first. Thus, the first layerStore.ReloadIfChanged does trigger a layerStore.Load(), but under a read-only lock that does not clean up incomplete layers. However, layerStore.lockFile.lw is updated to match the lock file contents.
  • Consequently, the record of the incomplete layer continues to exist, and Store.Layer reports that parentLayer exists.
  • Pull proceeds, assuming that parentLayer exists, and starts creating childLayer.
  • While creating childLayer, the layerStore is locked read-write. But nothing has changed on disk and layerStore.lockFile.lw matches (within the same process), so layerStore.ReloadIfChanged does nothing: it does not enter layerStore.Load(), and the “delete incomplete layers” code is not reached. Consequently, parentLayer continues to exist in an incomplete state.
  • This allows creation of childLayer to succeed. $childLayer/lower is created, and includes the short link from $parentLayer/link.
  • Result: The whole pull is reported as successful. The image, though, contains an incomplete layer, with incomplete/inconsistent contents.
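The reload bookkeeping in the steps above can be sketched as a toy model. This is not the containers/storage code: toyLayerStore, diskWrite, lastWrite and simulateSecondPull are hypothetical names, and the only behavior modeled is the one described in this report (a load happens only when the last-writer token differs, and incomplete layers are deleted only when the load runs under a write lock).

```go
package main

import "fmt"

// layerRecord is a toy stand-in for an entry in layers.json.
type layerRecord struct {
	ID         string
	Incomplete bool // models incompleteFlag
}

// toyLayerStore models only the reload bookkeeping described in the issue.
// All names here are hypothetical; this is not the real layerStore.
type toyLayerStore struct {
	onDisk    map[string]layerRecord // what layers.json contains
	diskWrite int                    // models the last-writer token in the lock file
	lastWrite int                    // models layerStore.lockFile.lw in this process
	inMemory  map[string]layerRecord // what this process believes exists
}

// load re-reads the records. Incomplete layers are removed only when the
// caller holds a write lock; under a read lock they are kept as-is.
func (s *toyLayerStore) load(lockedForWriting bool) {
	s.inMemory = map[string]layerRecord{}
	for id, l := range s.onDisk {
		if l.Incomplete && lockedForWriting {
			delete(s.onDisk, id)
			continue
		}
		s.inMemory[id] = l
	}
}

// reloadIfChanged calls load only if the on-disk token differs from the one
// this process last saw, and reports whether a load happened.
func (s *toyLayerStore) reloadIfChanged(lockedForWriting bool) bool {
	if s.lastWrite == s.diskWrite {
		return false
	}
	s.lastWrite = s.diskWrite
	s.load(lockedForWriting)
	return true
}

// exists reports whether this process currently sees the layer.
func (s *toyLayerStore) exists(id string) bool {
	_, ok := s.inMemory[id]
	return ok
}

// simulateSecondPull replays the second pull: a read-locked lookup followed
// by a write-locked layer creation in the same process.
func simulateSecondPull() (reloadedOnRead, reloadedOnWrite, parentVisible bool) {
	s := &toyLayerStore{
		onDisk:    map[string]layerRecord{"parentLayer": {ID: "parentLayer", Incomplete: true}},
		diskWrite: 1, // the killed pull wrote layers.json
		lastWrite: 0, // a fresh process has not seen that write yet
	}
	reloadedOnRead = s.reloadIfChanged(false) // Store.Layer(parentLayer), read lock
	reloadedOnWrite = s.reloadIfChanged(true) // creating childLayer, write lock
	return reloadedOnRead, reloadedOnWrite, s.exists("parentLayer")
}

func main() {
	r, w, visible := simulateSecondPull()
	fmt.Println(r, w, visible) // prints "true false true"
}
```

In this model the read-locked call consumes the "changed" signal without cleaning up, so the write-locked call never gets a chance to.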

  • Next, the user does something that doesn’t start with a read-only lock of layerStore. That finally triggers layerStore.Load to delete incomplete layers, and now parentLayer is deleted, resulting in a broken parent link from childLayer to parentLayer.
  • For example, podman run theSameImage works for this purpose. That deletes the layer and fails with Error: layer not known (with a currently unclear call stack).

  • One more podman run theSameImage causes the missing layer to be noticed, with
ERRO[0000] Image theSameImage exists in local storage but may be corrupted: layer not known 
  • … and that triggers a re-pull.
  • This re-pull correctly detects that parentLayer is missing, and creates it afresh, with a new $parentLayer/link value.
  • But childLayer is not missing, so the previous one is just reused. $childLayer/lower continues to contain the old $parentLayer/link value.
  • Finally, when trying to actually use childLayer, this manifests as
WARN[0093] Can't read link "/var/lib/containers/storage/overlay/l/UDGNJ5CR2MQ2QQDGYYK2W4WCBR" because it does not exist. A storage corruption might have occurred, attempting to recreate the missing symlinks. It might be best wipe the storage to avoid further errors due to storage corruption. 
Error: readlink /var/lib/containers/storage/overlay/l/UDGNJ5CR2MQ2QQDGYYK2W4WCBR: no such file or directory
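The stale reference behind this final error can be sketched as follows. The model assumes that $layer/lower holds colon-separated l/<link> entries and that l/ maps short link names to layers; toyOverlay, missingLowerLinks, replay and the OLDLINK/NEWLINK names are hypothetical and only illustrate the sequence described in this report.

```go
package main

import (
	"fmt"
	"strings"
)

// toyOverlay models only the two pieces of overlay state discussed above.
type toyOverlay struct {
	links map[string]string // short link name -> layer ID (models l/$link)
	lower map[string]string // layer ID -> "l/<link>:l/<link>" (models $layer/lower)
}

// missingLowerLinks returns the link names listed in a layer's lower entry
// that no longer exist in l/.
func (o *toyOverlay) missingLowerLinks(layerID string) []string {
	var missing []string
	for _, entry := range strings.Split(o.lower[layerID], ":") {
		name := strings.TrimPrefix(entry, "l/")
		if name == "" {
			continue // layer has no lower entries
		}
		if _, ok := o.links[name]; !ok {
			missing = append(missing, name)
		}
	}
	return missing
}

// replay walks through the sequence from the issue and returns the dangling
// links of childLayer at the end.
func replay() []string {
	o := &toyOverlay{
		links: map[string]string{"OLDLINK": "parentLayer"}, // left behind by the killed pull
		lower: map[string]string{},
	}
	// Second pull: childLayer is created on top of the incomplete parentLayer.
	o.lower["childLayer"] = "l/OLDLINK"
	// Cleanup of incomplete layers: parentLayer and its short link go away.
	delete(o.links, "OLDLINK")
	// Re-pull: parentLayer is recreated with a new link; childLayer is reused as-is.
	o.links["NEWLINK"] = "parentLayer"
	return o.missingLowerLinks("childLayer")
}

func main() {
	fmt.Println(replay()) // prints "[OLDLINK]"
}
```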
