[release/8.0-staging] Set version in ZIP local header to ZIP64 when file offset is >4GB #103006
Backport of #102053 to release/8.0-staging.
Fixes #94899
Customer Impact
This affects two partner areas separately.
dotnet/Open-XML-SDK
Initially reported here: dotnet/wpf#2676
An external customer that develops PowerPoint add-ins uses DocumentFormat.OpenXML (a package owned by dotnet/Open-XML-SDK) to add custom XML parts to a PowerPoint presentation.
When they attempt to reopen the modified PowerPoint file, either with the Open XML SDK 2.5 Productivity Tool for Microsoft Office (an asset also owned and released by dotnet/Open-XML-SDK) or again with the DocumentFormat.OpenXML APIs, it fails with "The file contains corrupted data". Yet they have no problem opening the same file with PowerPoint, or with other zip tools like WinRAR and 7-Zip.
microsoft/DacFx
This is also affecting our partners at microsoft/DacFx: microsoft/DacFx#363
Azure DevOps uses SqlPackage to export databases into bacpac files and then import them into Azure SQL Server instances.
When they import the bacpac using either a local SQL Server or Azure SQL Server, there are no problems. But when they import it using the Import Database UI or `az sql db import`, the import fails stating that the file contains corrupted data. All four cases use System.IO.Packaging to generate the bacpac file, but only the two failing cases use System.IO.Packaging again to import the file.
Root cause
Both issues have a common root cause: #94899
ZipArchive behaves differently between .NET and .NET Framework when creating or reading files larger than 4GB, the point at which the algorithm switches to ZIP64 (described in this comment). The spec is not clear on this point, as described in this comment, but other tools do the right thing and handle such files without issues, so we should match that behavior and eliminate the discrepancy between .NET and .NET Framework.
The fix makes sure that both the central directory header and the local file header store the size offset, sets the ZIP64 flag on both when the file is larger than 4GB, and makes .NET validate this just as .NET Framework used to do.
Testing
New unit tests were added.
Risk
Low, as the .NET behavior was already wrong compared to .NET Framework and other ZIP tools.
Thanks to @karakasa for contributing the original fix.
cc @pwoosam @SeenaAugusty @llali