-
Notifications
You must be signed in to change notification settings - Fork 986
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(zip): ZipOutputStream async support #574
Conversation
This is a very rough first pass at adding async support to ZipOutputStream. All method doing synchronous stream writes have been duplicated with an asynchronous counterpart, starting from `ZipOutputStream.PutNextEntryAsync()`. Unfortunately, this makes a lot of duplicated code but is unavoidable to have both sync + async code paths. .NET Standard 2.1 has been added to the target frameworks. It's required to get full async support with `Stream.DisposeAsync()`. New unit tests must be written to cover the new async code paths. Fixes #223
Hi, just a question: do you think async support will be merged in in the near future? It seems a bit stopped. Could I help to get it in? |
@cloghead It's not the easiest thing to pick up for a new project, but some code review could be nice. This will likely be the focus for the next minor release. |
fwiw, I had a go at running a benchmark: With the current master branch as of now:
with this branch:
which is mixed, but no massive differences (though I think this branch is missing the Array.Empty optimization that might have a small negative effect, and I have no idea why the async memory usage for .net core 2.1 is so much lower than all the others) |
@Numpsy That's pretty much what I expected (not the odd memory usage ofc). The overhead that got removed was just instantiating |
Codecov Report
@@ Coverage Diff @@
## master #574 +/- ##
==========================================
+ Coverage 73.31% 73.82% +0.50%
==========================================
Files 68 69 +1
Lines 8305 8291 -14
==========================================
+ Hits 6089 6121 +32
+ Misses 2216 2170 -46
Continue to review full report at Codecov.
|
Ran the sync benchmark against the branch after merge and against master (no async since that isn't checked in?) Common output for both tests:
Current master:
async-ZipOutputStream:
|
I added an extra async benchmark locally, I can send a PR to add it if it's useful |
# Conflicts: # src/ICSharpCode.SharpZipLib/Zip/Compression/Streams/DeflaterOutputStream.cs # src/ICSharpCode.SharpZipLib/Zip/ZipOutputStream.cs
Hello guys, is this code going to be merged in the near future? I'm facing this issue where I need to create a zip out of a asynconlystream |
I haven't had a chance to look at anything for some time and have lost track of what state the async work was in, need to remind myself what was going on :-( |
@guipcarvalho That is the plan, but I have been busy with other projects. This should be ready to merge, I'll try to find some time to get it done this weekend. |
…c-ZipOutputStream
Thank you @piksel! |
I think I might have queried before, but don't recall the answer, so just in case - is there a need to check the behavior of GzipOutputStream with regards to the new functions on DeflaterOutputStream? (e.g. is it possible to call FinishAsync or DisposeAsync through a GzipOutputStream, and if so does doing so cause any problems?) |
@Numpsy I made a test that was simply an Async version of using (var gzos = new GZipOutputStream(msGzip))
{
gzos.IsStreamOwner = false;
await gzos.WriteAsync(inputBuffer, 0, inputBuffer.Length);
await gzos.FlushAsync();
await gzos.FinishAsync(CancellationToken.None);
} |
I tried it with var content = "FileContents";
using var ms = new MemoryStream();
await using (var outStream = new GZipOutputStream(ms) { IsStreamOwner = false })
{
outStream.FileName = "/path/to/file.ext";
var writeBuffer = Encoding.ASCII.GetBytes(content);
outStream.Write(writeBuffer, 0, writeBuffer.Length);
outStream.Flush();
await outStream.FinishAsync(default);
}
ms.Seek(0, SeekOrigin.Begin);
using (var inStream = new GZipInputStream(ms))
{
using (var sr = new StreamReader(inStream, Encoding.UTF8))
{
var readContent = sr.ReadToEnd();
Assert.AreEqual(content, readContent);
Assert.AreEqual("file.ext", inStream.GetFilename());
}
} and got this error: The test seems to pass if the read side is just using (var inStream = new GZipInputStream(ms))
{
var readBuffer = new byte[content.Length];
inStream.Read(readBuffer, 0, readBuffer.Length);
Assert.AreEqual(content, Encoding.ASCII.GetString(readBuffer));
Assert.AreEqual("file.ext", inStream.GetFilename());
} because I don't think it tries to read the gzip footer in order to read a subset of the stream data? |
Yeah, it's a bit weird, hence #672. |
Based on #547, but rather than duplicate the already massive zip format writing methods, it combines all the "logic" (which isn't really async anyway) and only does the actual writing using a-/sync methods.
This also moves the byte order swapping from the odd
ZipHelperStream
into a much more lightweight (and probably faster)ZipFormat
helper class. It should also help DRYing up the shared code inZipFile
andZip*Stream
.No actual benchmarks have been run yet, but the tests should pass.
I certify that I own, and have sufficient rights to contribute, all source code and related material intended to be compiled or integrated with the source code for the SharpZipLib open source product (the "Contribution"). My Contribution is licensed under the MIT License.