Skip to content

Conversation

@iremyux
Copy link
Contributor

@iremyux iremyux commented Nov 28, 2025

This PR adds a new overload to ZipArchiveEntry.Open() that accepts a FileAccess parameter, allowing users to specify the desired access mode when opening an entry stream.

When a ZipArchive is opened in ZipArchiveMode.Update, calling ZipArchiveEntry.Open() always returns a read-write stream by invoking OpenInUpdateMode(). This causes the entire entry to be decompressed into memory, even when the caller only intends to read the entry's contents.

Update Mode Details:

  • FileAccess.Read: Opens a read-only stream over the entry's compressed data without loading it into memory. Useful for streaming reads.

  • FileAccess.Write: Opens an empty writable stream, discarding any existing entry data. Semantically equivalent to "replace the entry content entirely" (like FileMode.Create. The stream is backed by a MemoryStream and content is written to the archive when disposed.

  • FileAccess.ReadWrite: Same as parameterless Open() - loads existing data into memory and returns a read/write/seekable stream.

Fixes #101243

@dotnet-policy-service
Copy link
Contributor

Tagging subscribers to this area: @dotnet/area-system-io-compression
See info in area-owners.md if you want to be subscribed.

@iremyux iremyux marked this pull request as ready for review November 28, 2025 11:26
Copilot AI review requested due to automatic review settings November 28, 2025 11:26
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds a new Open(FileAccess access) overload to ZipArchiveEntry that allows callers to explicitly specify the desired file access mode when opening an entry. This is particularly valuable in ZipArchiveMode.Update, where the existing parameterless Open() always returns a read-write stream by decompressing the entire entry into memory. The new overload allows specifying FileAccess.Read to avoid this memory overhead when only reading is needed.

Key Changes:

  • New Open(FileAccess access) method validates that the requested access is compatible with the archive's mode and calls the appropriate internal open method (OpenInReadMode, OpenInWriteMode, or OpenInUpdateMode)
  • Three new error message resources for incompatible access modes
  • Comprehensive test coverage for all combinations of archive modes and access types

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

File Description
src/libraries/System.IO.Compression/src/System/IO/Compression/ZipArchiveEntry.cs Implements the new Open(FileAccess access) overload with validation logic for different archive modes
src/libraries/System.IO.Compression/ref/System.IO.Compression.cs Adds the new public API surface to the reference assembly
src/libraries/System.IO.Compression/src/Resources/Strings.resx Adds three new error message resources for invalid access scenarios
src/libraries/System.IO.Compression/tests/ZipArchive/zip_InvalidParametersAndStrangeFiles.cs Adds 11 new test methods covering all valid and invalid access combinations for each archive mode

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (1)

src/libraries/System.IO.Compression/tests/ZipArchive/zip_InvalidParametersAndStrangeFiles.cs:2256

  • The comment states that ReadWrite "opens in write mode" but based on the implementation (line 392 in ZipArchiveEntry.cs), it actually calls OpenInWriteMode() which may return a stream with different capabilities than implied. The comment should clarify what capabilities the returned stream actually has. Consider: "// ReadWrite is allowed in Create mode and calls OpenInWriteMode()"
            // ReadWrite should be allowed in Create mode (it opens in write mode)

@iremyux iremyux requested review from a team and removed request for a team December 1, 2025 12:42
@iremyux iremyux marked this pull request as draft December 1, 2025 14:46
@iremyux iremyux marked this pull request as ready for review December 10, 2025 22:35
@iremyux iremyux requested a review from Copilot December 10, 2025 22:35
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

Comment on lines +381 to +383
/// <summary>
/// Opens the entry with the specified access mode. This allows for more granular control over the returned stream's capabilities.
/// </summary>
Copy link

Copilot AI Dec 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The XML documentation summary doesn't adequately describe what each FileAccess mode does in different ZipArchiveModes, particularly the important behavioral differences in Update mode. Consider adding a remarks section that explains:

  • In Read mode: only FileAccess.Read is allowed
  • In Create mode: FileAccess.Write and FileAccess.ReadWrite are allowed (both behave as write-only)
  • In Update mode:
    • FileAccess.Read opens a read-only stream without loading into memory
    • FileAccess.Write provides an empty writable stream, discarding existing data
    • FileAccess.ReadWrite loads existing data into memory for modification (same as parameterless Open())

This is especially important because the behavior in Update mode with FileAccess.Write (discarding existing data) is semantically different from intuitive expectations.

Copilot uses AI. Check for mistakes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, we should include the explanation in the remarks section.

Comment on lines +417 to +418
FileAccess.ReadWrite => OpenInUpdateMode(),
_ => throw new UnreachableException()
Copy link

Copilot AI Dec 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The UnreachableException in the switch expression default case is unnecessary. Since the access parameter is validated at lines 394-395 to be one of Read, Write, or ReadWrite, and all three cases are explicitly handled in the switch expression (lines 415-417), the default case at line 418 can never be reached. This makes the code slightly harder to reason about. Consider removing the default case entirely, as the switch expression will be exhaustive without it.

Suggested change
FileAccess.ReadWrite => OpenInUpdateMode(),
_ => throw new UnreachableException()
FileAccess.ReadWrite => OpenInUpdateMode()

Copilot uses AI. Check for mistakes.
Comment on lines +130 to 131
public System.IO.Stream Open(FileAccess access) { throw null; }
public System.Threading.Tasks.Task<System.IO.Stream> OpenAsync(System.Threading.CancellationToken cancellationToken = default(System.Threading.CancellationToken)) { throw null; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we have overlooked something again during API review, we should also have an Async version of the new overload.

I think letting API review board over an email should be enough and we don't need to wait for the response.

Comment on lines +381 to +383
/// <summary>
/// Opens the entry with the specified access mode. This allows for more granular control over the returned stream's capabilities.
/// </summary>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, we should include the explanation in the remarks section.

_currentlyOpenForWrite = true;

// Create a fresh empty MemoryStream, discarding any previously loaded data
_storedUncompressedData = new MemoryStream();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if this is overwriting an existing stream here, we want to dispose it first.

return OpenInWriteModeCore();
}

private WrappedStream OpenInWriteModeForUpdate()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks similar to OpenInUpdateMode, Would it be still readable if we added a parameter there that would control whether existing contents should be returned or not?


[Theory]
[MemberData(nameof(Get_Booleans_Data))]
public static async Task OpenWithFileAccess_CreateMode_ReadWriteAccess_Succeeds(bool async)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks the same as the test above (only difference is the FileAccess), maybe take the FileAccess as another parameter?


[Theory]
[MemberData(nameof(Get_Booleans_Data))]
public static async Task OpenWithFileAccess_UpdateMode_InvalidAccess_Throws(bool async)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This and the two above are almost the same. I think having only one of them is enough as the argument check happens before the logic branches on the ZipArchiveMode, but if you want to keep tests for all combinations, make it into a single parameterized test.

ThrowIfInvalidArchive();

if (access is not FileAccess.Read and not FileAccess.Write and not FileAccess.ReadWrite)
throw new ArgumentException(SR.InvalidFileAccess, nameof(access));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think ArgumentOutOfRangeException is better fit here

Comment on lines +401 to +408
if (access != FileAccess.Read)
throw new ArgumentException(SR.CannotBeWrittenInReadMode, nameof(access));
return OpenInReadMode(checkOpenable: true);

case ZipArchiveMode.Create:
if (access == FileAccess.Read)
throw new ArgumentException(SR.CannotBeReadInCreateMode, nameof(access));
return OpenInWriteMode();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of ArgumentException, I think InvalidOperationException is more appropriate. The argument itself is in valid range, but the operation it tries to perform is not valid in the current state.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[API Proposal]: Add ZipArchiveEntry.Open overload

2 participants