Add optional server-side read-time checksum verification on Disk GET (detect silent bit-rot) #240

Description

@SymoHTL

Summary

The Disk provider validates checksums on write and persists them, but never re-verifies them on read. Silent on-disk corruption (bit-rot, a torn sector) is therefore served to clients undetected, and async replica-repair faithfully copies the (possibly corrupt) source bytes.

This gap surfaced during a comparison with SeaweedFS, which re-verifies a per-needle CRC32C on every GET, so bit-rot is caught at serve time.

Evidence (current code)

  • Write-time: src/IntegratedS3/IntegratedS3.Provider.Disk/DiskStorageService.cs:6377-6420 (ValidateRequestedChecksums, InvalidChecksum on mismatch); :2635-2640 compute + validate + persist.
  • Read-time: GetObjectCoreAsync opens the content FileStream (TryOpenObjectReadStream) and streams raw bytes to the response — no recomputation/compare against the stored checksum. (grep for read-time verify in IntegratedS3.Core/Disk provider returns nothing.)

Reference: SeaweedFS mechanism

readNeedleTail recomputes CRC32C (Castagnoli) over the needle data and compares it to the 4-byte on-disk trailer on every read, returning ErrorCorrupted on mismatch (weed/storage/needle/needle_read_tail.go:11-27); same path used by scrub, EC scrub and load-time integrity check.

Suggested fix

Add an optional server-side read-time verification mode to the Disk backend: when object metadata carries a whole-object CRC32C/SHA-256, wrap the read stream in a hashing stream and compare at end-of-stream, failing the response (or flagging + logging) on mismatch. Gate behind an option (e.g. DiskStorageOptions.VerifyChecksumOnRead) since it adds a full read-time hash pass. Also verify the source checksum before an async replica-repair re-PUT so repair cannot propagate corruption.

Impact / payoff

  • Detects silent disk bit-rot at serve time instead of relying solely on client-side SDK validation.
  • Prevents replica repair from faithfully replicating corrupt bytes.
  • Note: in the S3 model a compliant SDK validates the returned checksum end-to-end, so this is defense-in-depth for server-owned durability — hence opt-in.

Labels: enhancement
