-
Notifications
You must be signed in to change notification settings - Fork 5.1k
Consolidate _handshakeBuffer and _internalBuffer in SslStream #64747
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Tagging subscribers to this area: @dotnet/ncl, @vcsjones Issue DetailsFixes #52037. I the two buffers have been merged together since they are never used in parallel (i.e. receciving appdata is not allowed during renegotiation etc.). Will need performance testing before merging.
|
We intentionally return the buffer at certain times in order to reduce memory usage. In particular, when the user does a zero-byte read. @wfurt did a bunch of work to enable this not too long ago (I think it was in 6.0?). We should not regress this. Edit: Actually, the work from @wfurt was to read from multiple TLS frames if we can. I think we already supported the zero-byte read optimization before that. Regardless, here's the relevant code for zero byte read handling: https://github.com/dotnet/runtime/blob/main/src/libraries/System.Net.Security/src/System/Net/Security/SslStream.Implementation.cs#L913 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it generally looks good to me. I left few comments.
private int _decryptedLength; | ||
|
||
// padding between decrypted part of the active memory and following undecrypted TLS frame. | ||
private int _decryptedPadding; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
padding seems little bit strange to me. I would probably just track where the encrypted chunk starts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was a small optimization, by keeping the padding length we have one less field to update when we discard data read by the user code.
@@ -360,8 +345,7 @@ private async Task ForceAuthenticationAsync<TIOAdapter>(TIOAdapter adapter, bool | |||
|
|||
if (!handshakeCompleted) | |||
{ | |||
// get ready to receive first frame | |||
_handshakeBuffer = new ArrayBuffer(InitialHandshakeBufferSize); | |||
_buffer = new SslBuffer(ReadBufferSize); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if that matters but we used separate size for handshake to limit memory used. While during transfer it is very likely you have nore data and you would be receiving full TLS frames, handshake messages are typically pretty small.
Using ReadBufferSize makes them come from same pool e.g. it may increase efficiency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since SslBuffer
is struct, I'm wondering if we can just initialized automatically in constructor and here simply have function to allocate/resize/validate the actual buffer. -> EnsureAvailableSpace()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -286,7 +273,8 @@ private async Task RenegotiateAsync<TIOAdapter>(TIOAdapter adapter) | |||
throw SslStreamPal.GetException(status); | |||
} | |||
|
|||
_handshakeBuffer = new ArrayBuffer(InitialHandshakeBufferSize); | |||
_buffer = new SslBuffer(ReadBufferSize); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering if we could just use EnsureAvailableSpace here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is expected that if no data is buffered, the _buffer
is disposed. We would still have to new the inner ArrayBuffer internally in that case.
ReadOnlySpan<byte> availableData = _buffer.EncryptedReadOnlySpan; | ||
// DiscardEncrypted() does not touch data, it just increases start index so next | ||
// EncryptedSpan will exclude the "discarded" data. | ||
_buffer.DiscardEncrypted(frameSize); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know this is exciting code. But I'm wondering now if there is possibility this could confuse some of the cleanup logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think so, since the data are read immediately afterwards and there are those locks that prevent multiple concurrent reads or other conflicting operations.
_decryptedBytesOffset = 0; | ||
ArrayPool<byte>.Shared.Return(internalBuffer); | ||
} | ||
else if (_decryptedBytesCount == 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if this matters but the else would put next decrypted data to the beginning of the buffer.
The encrypted should be always same or bigger so it may not matter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it does not matter, since ResetReadBuffer
is called before receiving and that function moves the data if there is any.
@@ -1030,17 +983,17 @@ private async ValueTask<int> ReadAsyncInternal<TIOAdapter>(TIOAdapter adapter, M | |||
private ValueTask FillHandshakeBufferAsync<TIOAdapter>(TIOAdapter adapter, int minSize) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should not need this, right? Since there is only one buffer now EnsureFullTlsFrameAsync
should do it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed
// To maximize the buffer space available for the next read, | ||
// copy the existing data down to the beginning of the buffer. | ||
Buffer.BlockCopy(_internalBuffer, _internalOffset, _internalBuffer, 0, _internalBufferCount); | ||
_internalOffset = 0; | ||
_buffer.EnsureAvailableSpace(_buffer.Capacity - _buffer.ActiveLength); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think is same. Even if there is space in the buffer, whatever was left may be at the end ... limiting size of next IO. It would be interesting so how this impacts perf but if the chunk is small and copy fast the previous code would allow to fill the buffer completely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
EnsureAvailableSpace
moves the data to the beginning of the buffer if enough available space can be gained that way (see ArrayBuffer.EnsureAvailableSpace
)
I haven't seen any significant change in perf benchmarks (changes were up to 2% in either direction)
6c9a488
to
b5608bb
Compare
I have tried to copy the non-async-machine path from WIP version private ValueTask<int> EnsureFullTlsFrameAsync<TIOAdapter>(TIOAdapter adapter)
where TIOAdapter : IReadWriteAdapter
{
int frameSize;
if (HaveFullTlsFrame(out frameSize))
{
return ValueTask.FromResult(frameSize);
}
// We may have enough space to complete frame, but we may still do extra IO if the frame is small.
// So we will attempt larger read - that is trade of with extra copy.
// This may be updated at some point based on size of existing chunk, rented buffer and size of 'buffer'.
_buffer.EnsureAvailableSpace(Math.Min(frameSize, InitialBufferSize) - _buffer.ActiveLength);
while (_buffer.EncryptedLength < frameSize)
{
// there should be space left to read into
Debug.Assert(_buffer.AvailableLength > 0, "_buffer.AvailableBytes > 0");
// We either don't have full frame or we don't have enough data to even determine the size.
ValueTask<int> t = adapter.ReadAsync(_buffer.AvailableMemory);
if (!t.IsCompletedSuccessfully)
{
return InternalEnsureFullTlsFrameAsync(adapter, t, frameSize);
}
int bytesRead = t.Result;
if (bytesRead == 0)
{
if (_buffer.EncryptedLength != 0)
{
// we got EOF in middle of TLS frame. Treat that as error.
throw new IOException(SR.net_io_eof);
}
return ValueTask.FromResult(0);
}
_buffer.Commit(bytesRead);
if (frameSize == int.MaxValue && _buffer.EncryptedLength > TlsFrameHelper.HeaderSize)
{
// recalculate frame size if needed e.g. we could not get it before.
frameSize = GetFrameSize(_buffer.EncryptedReadOnlySpan);
_buffer.EnsureAvailableSpace(frameSize - _buffer.ActiveLength);
}
}
return ValueTask.FromResult(frameSize);
async ValueTask<int> InternalEnsureFullTlsFrameAsync(TIOAdapter adapter, ValueTask<int> task, int frameSize)
{
while (true)
{
int bytesRead = await task.ConfigureAwait(false);
if (bytesRead == 0)
{
if (_buffer.EncryptedLength != 0)
{
// we got EOF in middle of TLS frame. Treat that as error.
throw new IOException(SR.net_io_eof);
}
return 0;
}
_buffer.Commit(bytesRead);
if (_buffer.EncryptedLength >= frameSize)
{
break;
}
task = adapter.ReadAsync(_buffer.AvailableMemory);
if (frameSize == int.MaxValue && _buffer.EncryptedLength > TlsFrameHelper.HeaderSize)
{
// recalculate frame size if needed e.g. we could not get it before.
frameSize = GetFrameSize(_buffer.EncryptedReadOnlySpan);
_buffer.EnsureAvailableSpace(frameSize - _buffer.ActiveLength);
}
}
return frameSize;
}
} |
I don't think we need it. We abandoned it |
7a2aecb
to
4fe9db5
Compare
/azp run runtime-libraries-coreclr outerloop |
Azure Pipelines successfully started running 1 pipeline(s). |
_internalBuffer = null; | ||
_internalBufferCount = 0; | ||
_internalOffset = 0; | ||
//_internalBuffer = null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should remove it if not used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
I left one comment to fix.
There is one possibly related test failure in outerloop tests:
|
/azp run runtime-libraries-coreclr outerloop |
Azure Pipelines successfully started running 1 pipeline(s). |
CI failures are unrelated:
|
Fixes #52037.
I the two buffers have been merged together since they are never used in parallel (i.e. receciving appdata is not allowed during renegotiation etc.).
One thing that is kinda open still is when to rent/return the actual buffer, currently, the code rents the buffer during construction and keeps it until the
SslStream
is disposed. Maybe we can do better.Should probably undergo performance testing before merging.