-
Notifications
You must be signed in to change notification settings - Fork 5.2k
Tar: fix handing of null terminated fields. #117410
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The current implementation is trimming spaces and nulls from the end. Instead we need to find the terminating null from the start, and we shouldn't trim spaces at the end.
Tagging subscribers to this area: @dotnet/area-system-formats-tar |
I have not included a test because the PR is removing some special handling and not introducing any. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR updates how null-terminated fields are handled by stopping at the first null byte rather than trimming trailing nulls and spaces, and it adjusts all consumers accordingly.
- Introduced
TrimNullTerminated
to handle null-terminated spans - Updated
ParseOctal
to use the new trimming logic and only ignore spaces - Replaced all
GetTrimmed*
calls inTarHeader.Read.cs
withParseUtf8String
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
File | Description |
---|---|
TarHelpers.cs | Added TrimNullTerminated , updated ParseOctal and ParseUtf8String , and removed old helpers |
TarHeader.Read.cs | Replaced GetTrimmedUtf8String calls with ParseUtf8String |
Comments suppressed due to low confidence (3)
src/libraries/System.Formats.Tar/src/System/Formats/Tar/TarHelpers.cs:274
- Add an XML doc comment for
ParseUtf8String
explaining that it returns the UTF-8 string up to the first null byte without trimming spaces.
internal static string ParseUtf8String(ReadOnlySpan<byte> buffer)
src/libraries/System.Formats.Tar/src/System/Formats/Tar/TarHelpers.cs:280
- Add an XML doc comment for
TrimNullTerminated
to clarify that it slices the span at the first null byte and leaves trailing spaces intact.
internal static ReadOnlySpan<byte> TrimNullTerminated(ReadOnlySpan<byte> buffer)
src/libraries/System.Formats.Tar/src/System/Formats/Tar/TarHelpers.cs:243
- Add unit tests for
ParseOctal
andParseUtf8String
covering scenarios with fields that include trailing spaces after the null terminator and fields with no null byte.
buffer = TrimNullTerminated(buffer);
We should have a test that validates that we trim correctly since we did that wrong in an observable way before. So a tar with an embedded null should truncate the name at the null. You can add such a file to https://github.com/dotnet/runtime-assets/tree/main/src/System.Formats.Tar.TestData or you could produce a tar and then modify the bytes. |
I'll add a test case. Also, CI found some failing tests which I'll need to look into. |
…aOffset_RegularFile_SecondEntry test to fail.
src/libraries/System.Formats.Tar/src/System/Formats/Tar/TarReader.cs
Outdated
Show resolved
Hide resolved
@ericstj I've created an archive that looks like:
Note that the filename is The test will extract this entry and validate the name is Is this sufficient? To use this archive in a test, I need to add it to the https://github.com/dotnet/runtime-assets repo? |
You can either
We have cases of all of these. My preference would be 2 since it's most readable, but not sure if that's possible. Since it's so small I think 3 is acceptable and avoids the need to modify runtime-assets. |
I think this can work. I'll try tomorrow. |
@ericstj I've added a test. Can you take another look? |
The current implementation is trimming spaces and nulls from the end.
Instead we need to find the terminating null from the start, and we shouldn't trim spaces at the end.
Fixes #117331.
@ericstj @ViktorHofer ptal.
cc @bruce965