-
Notifications
You must be signed in to change notification settings - Fork 762
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added Span support to Webencoders #334
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Notes for review
@@ -22,6 +29,8 @@ namespace Microsoft.Extensions.Internal | |||
#endif | |||
static class WebEncoders | |||
{ | |||
private const int MaxStackallocBytes = 256; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Below this limit local buffers will be stack allocated. Is this limit OK?
using System.Buffers.Text; | ||
|
||
#if !NET461 | ||
using System.Numerics; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Vector<T>
isn't part of net461, so I've #if
ed it out. Is there a way to bring Vector<T>
to net461 in this project (linked file) or shouldn't we don't care about this 😉
} | ||
|
||
Span<char> base64 = stackalloc char[base64Len]; | ||
Convert.TryToBase64Chars(data, base64, out int written); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Convert.TryToBase64Chars
is noticeable faster than Base64.EncodeToUtf8
. Maybe because base64-encoding and byte -> char is done in one step.
Interestingly on the decoding-methods the Base64
-methods are faster.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/cc @ahsonkhan
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Last I checked (back in October), the span APIs were on par or faster: dotnet/corefx#24888
@atsushikan, recently did a port of the optimizations to the Convert methods: dotnet/coreclr#17033
There shouldn't be much difference on either now (in the general, non-allocating cases). Can you share details on what type of input you are measuring performance on and maybe a stripped down repro? Otherwise, I will take a deeper look in a couple of days. Thanks for bringing it up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I had the wrong suspects. Sorry. When running a stripped down sequential repo perf is on par, as noticed by @ahsonkhan
In the vectorized version the difference shows up.
if (Vector.IsHardwareAccelerated && (int*)n >= (int*)Vector<T>.Count)
will make the difference.
For Base64.EncodeToUtf8
T
will be byte
, so on AVX2 Vector<T>.Count
is 32.
For Convert.TryToBase64Chars
T
will be char
(or to be precise ushort
), hence Vector<T>.Count
is 16.
For a Guid n
is 24. So the Convert-version will urlencode vectorized, whereas the Base64-version will run sequential only.
} | ||
|
||
private static int GetNumBase64PaddingCharsToAddForDecode(int inputLength) | ||
// TODO: replace IntPtr and (int*) with nuint once available |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nuint
isn't avaialbe, so it is a bit messy.
The other workaround we be defining an using alias
, depending on an #if
for the bitness. On this linked file I didn't know how to do this, so I used the IntPtr
-approach.
The TODO comment is to remind me for the change once available.
[InlineData("a-b_c", "a+b/c==")] | ||
[InlineData("a-b_c-d", "a+b/c+d=")] | ||
[InlineData("abcd", "abcd")] | ||
public void UrlDecode_ReturnsValid_Base64String(string text, string expectedValue) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I needed a safety net for the private class UrlEncoder
, so these tests were added.
Is it OK to test implementation details or shall I remove these tests?
[InlineData("a+b/c== ", "a-b_c")] | ||
[InlineData("a+b/c ", "a-b_c ")] | ||
[InlineData("abcd", "abcd")] | ||
public void UrlEncode_Replaces_UrlEncodableCharacters(string base64EncodedValue, string expectedValue) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above.
/// <summary> | ||
/// Invalid input, that doesn't conform a base64 string. | ||
/// </summary> | ||
internal static readonly string WebEncoders_InvalidInput = "The input is not a valid Base-64 string as it contains a non-base 64 character, more than two padding characters, or an illegal character among the padding characters."; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This class has a nice TODO
// TODO using a resx file. project.json, unfortunately, fails to embed resx files when there are also compile items
// in the contentFiles section. Revisit once we convert repos to MSBuild
Shall I fix this -- or in a different PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do a different PR if it will be a noisy change.
For aspnet/SignalR#1606 (comment) I can prepare a PR (based on https://github.com/gfoidl/SignalR/commit/afb48976a468351b66f033306fef5654b3401252) |
? GetBufferSizeRequiredToUrlDecode(base64Url.Length, out int dataLength) | ||
: base64Url.Length; | ||
|
||
if (base64Len > MaxStackallocBytes / sizeof(byte)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't sizeof(byte) 1? Why do we need this division?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is redundant, and just to have it the same as on the char-part. The JIT will treat it as constant anyway.
|
||
if (buffer.Length - bufferOffset < arraySizeRequired) | ||
if ((uint)buffer.Length < (uint)(bufferOffset + base64Len)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any concerns here about integer overflow? bufferOffset + base64Len
could be > int.MaxValue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If so, the sum will be a negative value that is brought to very high values by the cast to uint
, so the exception will be thrown. For me it seems OK. Am I wrong?
Change is motivated by replacing the subtraction with the addition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some nits, other non-nit comments should be addressed.
[MethodImpl(MethodImplOptions.AggressiveInlining)] | ||
public static int GetArraySizeRequiredToEncode(int count) | ||
{ | ||
if (count < 0 || count > MaxEncodedLength) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(nit) This can be turned into a single check: if ((uint)count > (uint)MaxEncodedLength) { FAIL; }
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point 👍 thx!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just noticed that for this method no tests exists. Will add some.
|
||
#if NETCOREAPP2_1 | ||
private static int Base64UrlEncodeCore(ReadOnlySpan<byte> data, Span<char> base64Url, int base64Len) | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method assumes that base64Len
is exactly the number of characters required to represent the base64-encoded form of its input, not a single character more or less. It would be good to capture that invariant as a code comment and a Debug.Assert
to prevent maintainers inadvertently misusing this method in the future.
(This comment applies to other methods in the same family as this one.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will be updated.
} | ||
|
||
[MethodImpl(MethodImplOptions.AggressiveInlining)] | ||
private static int GetNumBase64PaddingCharsToAddForDecode(int urlEncodedLen) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(nit) Potentially faster implementation (though untested):
int result = (4 - urlEncodedLen) & 3;
if (result == 3) { FAIL; }
return result;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Amazing 😄
It's faster.
Method | Mean | Error | StdDev | Scaled | ScaledSD |
---|---|---|---|---|---|
A | 0.3861 ns | 0.0086 ns | 0.0076 ns | 1.00 | 0.00 |
B | 0.3436 ns | 0.0114 ns | 0.0107 ns | 0.89 | 0.03 |
|
||
private static int GetBufferSizeRequiredToBase64Encode(int dataLength) | ||
{ | ||
// overflow conditions are already eliminated, so 'checked' is not necessary |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This invariant isn't always true. See for instance the callers Base64UrlEncode(ReadOnlySpan<byte>, Span<char>)
, Base64UrlEncode(byte[], int, char[], int, int)
, and Base64UrlEncode(ReadOnlySpan<byte>)
.
Additionally, checked builds should validate the invariant via Debug.Assert(dataLength >= 0 && dataLength <= MaxEncodedLength);
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're correct. I'll route these callers through the public GetArraySizeRequiredToEncode
, so that the invariant will hold. Thx!
m = (IntPtr)((int)(int*)n & ~(Vector<T>.Count - 1)); | ||
for (; (int*)i < (int*)m; i += Vector<T>.Count) | ||
{ | ||
var vec = Unsafe.As<T, Vector<T>>(ref Unsafe.Add(ref urlEncoded, i)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unsafe.ReadUnaligned<Vector<T>>(...)
(This same comment applies to UrlEncode<T>
.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will update to Unsafe.ReadUnaligned<Vector<T>>(...)
, but can you please explain me why? I see the JIT produces exactely the same code. What am I missing here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It produces the same on x86 / x64 but may misbehave on other architectures with more strict alignment requirements.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, OK. Thanks for the info!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ReadUnaligned accepts void*
or ref byte
:
var vec = Unsafe.ReadUnaligned<Vector<T>>(ref Unsafe.As<T, byte>(ref Unsafe.Add(ref urlEncoded, i)));
So there is also an endianess-concern (#334 (comment)). Is there a better way to do this? (and I need a break 😉 )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the line Unsafe.ReadUnaligned<Vector<T>>(...)
as you just wrote is endianness-safe. The reason is that the final pointer cast (before dereference) is of the expected type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that makes sense. Essentially it is going from T -> Vector<T>
.
So, for the same reason
Unsafe.WriteUnaligned(ref Unsafe.As<T, byte>(ref Unsafe.Add(ref urlEncoded, i)), vec);
is also endianess-safe. It is Vector<T> -> T
.
Thank you for the infos!
where TIn : struct | ||
where TOut : struct | ||
{ | ||
var value = Unsafe.As<TIn, byte>(ref Unsafe.Add(ref urlEncoded, idx)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line assumes little-endian architecture. Better solution would be to get a local typed as TIn
, then switch on TIn
to forcibly convert it to byte
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will change.
This line assumes little-endian architecture.
And I assumed endianess will be handled by Unsafe.As
. Miss-assumption?!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Incorrect assumption. You're essentially performing reinterpret_cast<byte*>
from a char*
input, then dereferencing. The resulting value is endianness-dependent.
else if (typeof(TOut) == typeof(ushort)) | ||
{ | ||
Unsafe.Add(ref Unsafe.As<TOut, ushort>(ref base64), idx) = subst; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
else
{
throw new NotSupportedException(); // just in case new types are introduced in the future
}
(This comment applies everywhere we special-case a generic type as byte
or ushort
.)
subst = (byte)'/'; | ||
} | ||
|
||
// With 'Unsafe.Add(ref output, idx) = Unsafe.As<byte, TOut>(ref subst);' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(nit) We don't want the code suggested in this comment anyway, as you don't want to reinterpret the length of values on the stack. You could sweep up garbage data, and regardless it's not endianness-safe.
} | ||
#endif | ||
// n is always a multiple of 4 | ||
Debug.Assert((int)n == ((int)n & ~3)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(nit) Change this to Debug.Assert((int)n % 4 == 0)
- it's clearer.
} | ||
|
||
private static int GetNumBase64PaddingCharsToAddForDecode(int inputLength) | ||
// TODO: replace IntPtr and (int*) with nuint once available | ||
internal static unsafe class UrlEncoder |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rename this type UrlCharacterSubstitution
or something similar, and change the name of its methods. This type is not actually performing URL encoding, so the name is confusing.
Thanks for reviewing @GrabYourPitchforks |
public void UrlDecode_ReturnsValid_Base64String(string text, string expectedValue) | ||
{ | ||
// To test the alternate code-path | ||
#if NETCOREAPP2_1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this ifdefed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
netcoreapp2.1 runs this test via UrlEncoder.UrlDecode(byte, byte)
, the other two (netcoreapp2.0, net461) runs this test via UrlEncoder.UrlDecode(char, char)
.
So both entries to the UrlEncode-method get testes -- though depending on tfm.
Not good?
/// - NeedMoreData - only if isFinalBlock is false and the input is not a multiple of 4, otherwise the partial input would be considered as InvalidData | ||
/// - InvalidData - if the input contains bytes outside of the expected base 64 range, or if it contains invalid/more than two padding characters, | ||
/// or if the input is incomplete (i.e. not a multiple of 4) and isFinalBlock is true.</returns> | ||
public static OperationStatus Base64UrlDecode(ReadOnlySpan<byte> base64Url, Span<byte> data, out int bytesConsumed, out int bytesWritten, bool isFinalBlock = true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isFinalBlock? I haven't seen this is any APIs but the crypto ones, why is it here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Base64 has always needed it since you need to know whether to emit the trailing =
signs. Base64Url needs it because you need to know how to truncate the final 4-character block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... to enable buffer-chains -> https://github.com/aspnet/Home/issues/2966#issuecomment-373835888
using Microsoft.Extensions.WebEncoders.Sources; | ||
using System.Buffers; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: sort usings
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done -- will be updated with the other feedback changes.
BTW, I should point out that the amount of unsafe code being introduced here makes me a bit nervous. It would be great if there were evidence that real-world applications - not just microbenchmarks - would benefit from this change. |
We need to spanify all the things, what can we do to reduce the unsafe code. It's not just about throughput but allocations. |
So shall I go with (easy) sequential code? Allocations are still minimized, throughput will increase especially on longer inputs. It's quite a lot of "unsafeness" in there which I don't really like either. I wanted to squeeze everything out, because then going backward (i.e. removing some unsafe parts) seems easier to me than the other way round.
Thats nearly alway the point. I don't have any good context where this class is and will be used -- just assumed it as "low-level lib". |
Given the unsafe use and how this will accept untrusted input, this needs a lot of internal fuzz testing and other security reviews before merging. |
So I believe this one will take longer 😉 +public static string Base64UrlEncode(ReadOnlySpan<byte> data) Shall I create a very trimmed down PR that just handles this method (so it can go for 2.1)? |
Rebased due to conflict in |
* stack-allocs below a threshould * uses ArrayPool<T> above the threshould (netcoreapp2.1) -- new T[] otherwise
Only for netcoreapp2.1 -- rest will be done later.
Except * some TODOs * XML comments
* made to ref struct * reduced size by one int-field * Span as read-only property instead of AsSpan-method * fixed sized buffer is of type char instead of long
Benchmarks showed this variant is too slow, now it isn't needed anymore.
Rebased due to
|
What is this pending for, @gfoidl, @davidfowl? |
If we take #338, then we can close this one (would be my preference). |
OK lets close this one |
Description
WebEncoders
Due the latter point some
#if
s are needed, so the code doesn't look as nice as it could on some places.API
Benchmarks
Results from Benchmark-Project
Baseline
This PR
Notes
This benchmarks were run on ubuntu-x64. On win-x64 the results should be even better (see next section), but I don't have numbers, because it takes so long....
In
GetArraySizeRequiredToEncode
there is a little slow-down, but this shouldn't care.Individual benchmarks
Project for this benchmarks
Decode Guid
Decode Data
Encode Guid
Encode Data
Notes
Prospects
There could be a ~20% perf-win, when the base64-encoding / -decoding step would convert directly to / from url-encoded form by changing the base64-character map.
But by doing so we would "duplicate" the methods from
Base64
-class. This PR does rely on the provided methods -- maybe a follow-up PR will handle this.Private classes
In a normal project I would separate the private classes to their own files, but here the
WebEncoders.cs
is a linked file, so I stuffed everything in this file.