-
Notifications
You must be signed in to change notification settings - Fork 5.2k
JsonSerializer.DeserializeAsyncEnumerable should not wait until buffer is filled. #70430
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JsonSerializer.DeserializeAsyncEnumerable should not wait until buffer is filled. #70430
Conversation
* Change DeserializeAsyncEnumerable so that reading from the underlying stream does not wait until the underlying buffer is full. * Encapsulate the buffer management logic behind ReadBufferState. * Avoid allocating a new JsonTypeInfo instance on every IAsyncEnumerator initialization.
|
Tagging subscribers to this area: @dotnet/area-system-text-json, @gregsdennis Issue DetailsFrom #70107 (comment). The current implementation of the internal This PR changes the behavior of stream reading specifically in
Fix #70107.
|
src/libraries/System.Text.Json/src/System/Text/Json/Serialization/JsonSerializer.Read.Stream.cs
Show resolved
Hide resolved
src/libraries/System.Text.Json/src/System/Text/Json/Serialization/ReadBufferState.cs
Show resolved
Hide resolved
src/libraries/System.Text.Json/src/System/Text/Json/Serialization/ReadBufferState.cs
Show resolved
Hide resolved
| new Span<byte>(_buffer, 0, _maxCount).Clear(); | ||
|
|
||
| byte[] toReturn = Buffer; | ||
| Buffer = null!; | ||
| byte[] toReturn = _buffer; | ||
| _buffer = null!; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
considering you return the copy of this buffer state in some of the methods is it safe to do it like this? Should you add some guarding against double free?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm seeing you always assign back the copy of the struct, please make sure to at least add comment on the method returning copy and the class itself. Alternative perhaps some #if DEBUG with extra code protecting us against this pattern (i.e. wrap buffer in ref type, set it to null on Dispose and check with Debug.Assert)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
considering you return the copy of this buffer state in some of the methods is it safe to do it like this? Should you add some guarding against double free?
Yes, assuming you somehow end up disposing two copies of the same struct then that is a possibility, but we're not doing this in the handful of methods that creates the struct.
perhaps some #if DEBUG extra code protecting us against this pattern (i.e. wrap buffer in ref type, set it to null on Dispose and check with Debug.Assert)
Can you give an example? Not sure I understand. FWIW this is not new code, I'm just moving existing logic to the struct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ReadBufferState a = ...;
ReadBufferState b = a;
a.Dispose();
b.Dispose(); // double free or usage after dispose if we don't dispose and continue usingThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand the condition, my question is what assert could guard against such a potential scenario.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in the struct you could potentially add nested class like this (under #if DEBUG):
private class BufferRef
{
public byte[]? Buffer;
}and then add field in the struct:
BufferRef _myBuffer = new BufferRef(); // assign _buffer to _myBuffer.Buffer;then if struct gets copied then ref gets copied as well but underlying field will be shared. Then in the Dispose you could clear that field and check if it's not null.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as per discussion offline, better to do this for all similar structs in separate PR in order to potentially find potential memory corruption bug
From #70107 (comment). The current implementation of the internal
ReadFromStreamAsyncmethod will keep reading from the stream until either the internal buffer is filled or the stream EOF's, which is intended as a performance optimization to avoid redundant deserialization calls. While this makes sense in the case ofDeserializeAsync, it can impact liveness when deserializing IAsyncEnumerable sequences in examples such as the one seen in #70107.This PR changes the behavior of stream reading specifically in
DeserializeAsyncEnumerablesuch that the deserializer is invoked as soon as the firstStream.ReadAsyncoperation returns. Moreover it makes a few more changes:ReadBufferStatestruct.Fix #70107.