Add CRC32 validation when reading zip archive entries#124766
Add CRC32 validation when reading zip archive entries#124766alinpahontu2912 wants to merge 1 commit intodotnet:mainfrom
Conversation
Add CrcValidatingReadStream to ZipCustomStreams that wraps decompressed entry streams and validates CRC32 checksums as data is read. The stream computes a running CRC32 over all bytes read and compares it against the expected CRC from the zip entry header when the expected number of bytes has been read. Key design decisions: - Seeking is delegated to the base stream but abandons CRC tracking, since CRC validation requires sequential reading from the start. - Flush throws NotSupportedException consistent with read-only streams. - Corrupted entries (tampered uncompressed size) now correctly throw InvalidDataException when CRC mismatch is detected. Update existing corrupted stream tests to expect InvalidDataException from CRC validation instead of silently accepting tampered data. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
Tagging subscribers to this area: @karelz, @dotnet/area-system-io-compression |
There was a problem hiding this comment.
Pull request overview
This PR adds CRC32 validation when reading ZIP archive entries to detect data corruption. A new CrcValidatingReadStream wrapper class is introduced that computes a running CRC32 checksum as data is read and validates it against the expected CRC from the ZIP entry header once all expected bytes have been read.
Changes:
- Added
CrcValidatingReadStreamclass that wraps decompressed entry streams and validates CRC32 checksums - Updated
ZipArchiveEntryto wrap entry streams withCrcValidatingReadStreamin read mode - Updated corrupted stream tests to expect
InvalidDataExceptionfrom CRC validation instead of silently accepting tampered data - Added
CrcMismatchresource string for the validation error message
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
src/libraries/System.IO.Compression/src/System/IO/Compression/ZipCustomStreams.cs |
Added new CrcValidatingReadStream class that validates CRC32 checksums as data is read from ZIP entry streams |
src/libraries/System.IO.Compression/src/System/IO/Compression/ZipArchiveEntry.cs |
Updated OpenInReadMode and OpenInReadModeGetDataCompressor to return CrcValidatingReadStream and wrap decompressed streams with CRC validation |
src/libraries/System.IO.Compression/src/Resources/Strings.resx |
Added CrcMismatch error message for CRC validation failures |
src/libraries/System.IO.Compression/tests/ZipArchive/zip_InvalidParametersAndStrangeFiles.cs |
Updated corrupted stream tests to expect InvalidDataException when reading tampered entries, simplified tests to focus on CRC validation |
| public CrcValidatingReadStream(Stream baseStream, uint expectedCrc, long expectedLength) | ||
| { | ||
| _baseStream = baseStream; | ||
| _expectedCrc = expectedCrc; | ||
| _expectedLength = expectedLength; | ||
| } |
There was a problem hiding this comment.
Consider validating that expectedLength is non-negative in the constructor. Negative values don't make sense for a length and could cause unexpected behavior in the comparison at line 808 and 822.
| public CrcValidatingReadStream(Stream baseStream, uint expectedCrc, long expectedLength) | ||
| { | ||
| _baseStream = baseStream; | ||
| _expectedCrc = expectedCrc; | ||
| _expectedLength = expectedLength; | ||
| } |
There was a problem hiding this comment.
The constructor should validate that baseStream is not null. Without this check, a null baseStream would result in a NullReferenceException when any stream operation is attempted, rather than an ArgumentNullException at construction time.
|
|
||
| public CrcValidatingReadStream(Stream baseStream, uint expectedCrc, long expectedLength) | ||
| { | ||
| _baseStream = baseStream; |
There was a problem hiding this comment.
The _runningCrc field should be explicitly initialized to 0 in the constructor for clarity, even though the default value for uint is 0. This makes the intent clear that CRC32 calculation starts with an initial value of 0, consistent with the pattern used in CheckSumAndSizeWriteStream (line 503).
| _baseStream = baseStream; | |
| _baseStream = baseStream; | |
| _runningCrc = 0; |
| if (_totalBytesRead >= _expectedLength) | ||
| { | ||
| ValidateCrc(); | ||
| } |
There was a problem hiding this comment.
The condition should check for equality (_totalBytesRead == _expectedLength) instead of greater-than-or-equal. The current implementation allows ValidateCrc() to be called on every read after _expectedLength is reached, even though it will only validate once when exactly equal. Using equality would ensure validation happens exactly once and makes the logic clearer. Additionally, if _totalBytesRead somehow exceeds _expectedLength (which shouldn't happen given the decompressor limits), it would be better to detect and handle this case explicitly.
| if (_totalBytesRead >= _expectedLength) | |
| { | |
| ValidateCrc(); | |
| } | |
| if (_totalBytesRead == _expectedLength) | |
| { | |
| ValidateCrc(); | |
| } | |
| else if (_totalBytesRead > _expectedLength) | |
| { | |
| throw new InvalidDataException(SR.CrcMismatch); | |
| } |
Add CrcValidatingReadStream to ZipCustomStreams that wraps decompressed entry streams and validates CRC32 checksums as data is read. The stream computes a running CRC32 over all bytes read and compares it against the expected CRC from the zip entry header when the expected number of bytes has been read.
Update existing corrupted stream tests to expect InvalidDataException from CRC validation instead of silently accepting tampered data.