Skip to content

Ensure that JSON properties can be skipped without buffering their entire contents. #96856

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jan 12, 2024

Conversation

eiriktsarpalis
Copy link
Member

@eiriktsarpalis eiriktsarpalis commented Jan 11, 2024

Makes the following changes:

  • Adds a Utf8JsonReader.TrySkipPartial() method that enables resumable skip operations in streaming deserialization.
  • Removes the need to read ahead the next JSON value whenever streaming deserialization is resumed.
  • Adds an outerloop unit tests validating that 5 GB JSON documents with ignored properties can be deserialized.

Fix #96559.

@eiriktsarpalis eiriktsarpalis added this to the 9.0.0 milestone Jan 11, 2024
@ghost ghost assigned eiriktsarpalis Jan 11, 2024
@ghost
Copy link

ghost commented Jan 11, 2024

Tagging subscribers to this area: @dotnet/area-system-text-json, @gregsdennis
See info in area-owners.md if you want to be subscribed.

Issue Details

Makes the following changes:

  • Adds a Utf8JsonReader.TrySkipPartial() method that enables resumable skip operations in streaming deserialization.
  • Removes the need to read ahead the next JSON value whenever streaming deserialization is resumed.

Fix #96559.

Author: eiriktsarpalis
Assignees: -
Labels:

area-System.Text.Json

Milestone: -

@eiriktsarpalis eiriktsarpalis added the tenet-performance Performance related issue label Jan 11, 2024
@@ -21,6 +21,59 @@ public static ReadOnlySpan<byte> GetSpan(this scoped ref Utf8JsonReader reader)
return reader.HasValueSequence ? reader.ValueSequence.ToArray() : reader.ValueSpan;
}

/// <summary>
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logic moved essentially unchanged from JsonConverter.ReadAhead.cs

/// </summary>
/// <param name="targetDepth">The target depth we want to eventually skip to.</param>
/// <returns>True if the entire JSON value has been skipped.</returns>
internal bool TrySkipPartial(int targetDepth)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The existing TrySkipHelper has been retrofitted to the resumable TrySkipPartial method, minus the checkpointing semantics.

{
// For a continuation, read ahead here to avoid having to build and then tear
// down the call stack if there is more than one buffer fetch necessary.
if (!SingleValueReadWithReadAhead(requiresReadAhead: true, ref reader, ref state))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's the crux of the second change made by this PR: re-entrant serialization operations no longer perform read-ahead of the next value (the converter doesn't require it in this case). We let the JsonSerializerOptions.DefaultBufferSize be the only guide of how much data we add to the buffer before re-entering deserialization.

if (!isContinuation)
{
#if DEBUG
// For performance, only perform token type validation of converters on debug builds.
Debug.Assert(state.Current.OriginalTokenType == JsonTokenType.None);
state.Current.OriginalTokenType = reader.TokenType;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We now need to store this value in all builds to track resumable skip operations.

[Fact]
[SkipOnCoreClr("https://github.com/dotnet/runtime/issues/45464", ~RuntimeConfiguration.Release)]
public async Task ReadSimpleObjectAsync()
[Theory]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Converted this test suite to use theories because it was causing me grief while debugging.

// Read the key name.
if (!reader.Read())
{
value = default;
return false;
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changing the order of state update operations since we no longer rely on the root method to advance the reader for us on resumed operations.

Copy link
Member

@tarekgh tarekgh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. thanks for the comments you added in the PR, it helped me understand the changes.

@eiriktsarpalis eiriktsarpalis merged commit 59fb3be into dotnet:main Jan 12, 2024
tmds pushed a commit to tmds/runtime that referenced this pull request Jan 23, 2024
@github-actions github-actions bot locked and limited conversation to collaborators Feb 12, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Streaming JsonSerializer buffers all content of skipped JSON properties.
2 participants