Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added Span support to Webencoders #334

Closed
wants to merge 24 commits into from
Closed

Added Span support to Webencoders #334

wants to merge 24 commits into from

Conversation

gfoidl
Copy link
Member

@gfoidl gfoidl commented Apr 4, 2018

Description

  • Fixes https://github.com/aspnet/Home/issues/2966
  • added benchmark-project for WebEncoders
  • added Span-support to the API (see section API) and used the Span-based methods internally
  • small buffers (<= 256 bytes length) are stack allocated, larger ones rent from ArrayPool (except net461, there the large ones are heap allocated)
  • vectorized the url-encoding / -decoding step
  • more or less a rewrite of the class (and so it got a bit longer)
  • tried to reach best perf and minimal heap allocations for netcoreapp2.1, netcoreapp2.0, and net461

Due the latter point some #ifs are needed, so the code doesn't look as nice as it could on some places.

API

public static byte[] Base64UrlDecode(string input)
 public static byte[] Base64UrlDecode(string input, int offset, int count)
+public static byte[] Base64UrlDecode(ReadOnlySpan<char> base64Url)
+public static int Base64UrlDecode(ReadOnlySpan<char> base64Url, Span<byte> data)
+public static OperationStatus Base64UrlDecode(ReadOnlySpan<byte> base64Url, Span<byte> data, out int bytesConsumed, out int bytesWritten, bool isFinalBlock = true)
 public static byte[] Base64UrlDecode(string input, int offset, char[] buffer, int bufferOffset, int count)
 public static string Base64UrlEncode(byte[] input)
 public static string Base64UrlEncode(byte[] input, int offset, int count)
+public static string Base64UrlEncode(ReadOnlySpan<byte> data)
+public static int Base64UrlEncode(ReadOnlySpan<byte> data, Span<char> base64Url)
+public static OperationStatus Base64UrlEncode(ReadOnlySpan<byte> data, Span<byte> base64Url, out int bytesConsumed, out int bytesWritten, bool isFinalBlock = true)
 public static int Base64UrlEncode(byte[] input, int offset, char[] output, int outputOffset, int count)
 public static int GetArraySizeRequiredToDecode(int count)
 public static int GetArraySizeRequiredToEncode(int count)

Benchmarks

Results from Benchmark-Project

Baseline

BenchmarkDotNet=v0.10.11, OS=ubuntu 16.04
Processor=Intel Xeon CPU 2.60GHz, ProcessorCount=2
.NET Core SDK=2.1.300-preview3-008416
  [Host]     : .NET Core 2.1.0-preview2-26314-02 (Framework 4.6.26310.01), 64bit RyuJIT
  Job-GJOWXM : .NET Core 2.1.0-preview2-26314-02 (Framework 4.6.26310.01), 64bit RyuJIT

Runtime=Core  Server=True  Toolchain=.NET Core 2.1  
RunStrategy=Throughput  
Method Mean Error StdDev Median Op/s Gen 0 Allocated
Base64UrlDecode_Data 5,168.7843 ns 114.5844 ns 336.0562 ns 5,181.8617 ns 193,469.1 0.0301 1888 B
Base64UrlDecode_DataWithOffset 5,290.3002 ns 178.3533 ns 525.8787 ns 5,302.4528 ns 189,025.2 0.0302 1888 B
Base64UrlDecode_Guid 237.9833 ns 6.1025 ns 17.5093 ns 234.4482 ns 4,201,976.3 0.0017 112 B
Base64UrlEncode_Data 3,124.1497 ns 111.2723 ns 109.2843 ns 3,079.0000 ns 320,087.1 0.0455 2720 B
Base64UrlEncode_DataWithOffset 3,269.8675 ns 86.4926 ns 252.3030 ns 3,197.4447 ns 305,822.8 0.0455 2720 B
Base64UrlEncode_Guid 172.6008 ns 5.4054 ns 15.8531 ns 168.7662 ns 5,793,717.2 0.0024 144 B
GetArraySizeRequiredToDecode 6.6558 ns 0.2717 ns 0.8012 ns 6.5851 ns 150,244,480.7 - 0 B
GetArraySizeRequiredToEncode 0.0377 ns 0.0312 ns 0.0554 ns 0.0004 ns 26,507,091,393.8 - 0 B

This PR

BenchmarkDotNet=v0.10.11, OS=ubuntu 16.04
Processor=Intel Xeon CPU 2.60GHz, ProcessorCount=2
.NET Core SDK=2.1.300-preview3-008416
  [Host]     : .NET Core 2.1.0-preview2-26314-02 (Framework 4.6.26310.01), 64bit RyuJIT
  Job-MVRCXO : .NET Core 2.1.0-preview2-26314-02 (Framework 4.6.26310.01), 64bit RyuJIT

Runtime=Core  Server=True  Toolchain=.NET Core 2.1  
RunStrategy=Throughput  
Method Mean Error StdDev Median Op/s Gen 0 Allocated
Base64UrlDecode_Data 1,293.3192 ns 57.9273 ns 168.0575 ns 1,259.1617 ns 773,204.3 0.0076 528 B
Base64UrlDecode_DataWithOffset 1,335.1744 ns 61.1991 ns 178.5208 ns 1,306.9594 ns 748,965.8 0.0076 528 B
Base64UrlDecode_Guid 155.3893 ns 5.1800 ns 15.2734 ns 153.4095 ns 6,435,450.5 0.0005 40 B
Base64UrlEncode_Data 1,584.6087 ns 13.6427 ns 10.6513 ns 1,580.5804 ns 631,070.6 0.0228 1360 B
Base64UrlEncode_DataWithOffset 1,621.0625 ns 29.7223 ns 23.2052 ns 1,615.7145 ns 616,879.3 0.0226 1360 B
Base64UrlEncode_Guid 160.5265 ns 5.6154 ns 16.5571 ns 160.0300 ns 6,229,501.0 0.0012 72 B
GetArraySizeRequiredToDecode 0.0530 ns 0.0317 ns 0.0784 ns 0.0001 ns 18,869,922,106.3 - 0 B
GetArraySizeRequiredToEncode 0.0764 ns 0.0409 ns 0.0853 ns 0.0531 ns 13,088,763,160.8 - 0 B

Notes

This benchmarks were run on ubuntu-x64. On win-x64 the results should be even better (see next section), but I don't have numbers, because it takes so long....

In GetArraySizeRequiredToEncode there is a little slow-down, but this shouldn't care.

Individual benchmarks

Project for this benchmarks

Decode Guid

BenchmarkDotNet=v0.10.13, OS=Windows 10 Redstone 3 [1709, Fall Creators Update] (10.0.16299.309)
Intel Core i7-7700HQ CPU 2.80GHz (Kaby Lake), 1 CPU, 8 logical cores and 4 physical cores
Frequency=2742189 Hz, Resolution=364.6722 ns, Timer=TSC
.NET Core SDK=2.1.300-preview3-008384
  [Host] : .NET Core 2.1.0-preview2-26313-01 (CoreCLR 4.6.26310.01, CoreFX 4.6.26313.01), 64bit RyuJIT
  Clr	 : .NET Framework 4.7.1 (CLR 4.0.30319.42000), 64bit RyuJIT-v4.7.2633.0
  Core	 : .NET Core 2.1.0-preview2-26313-01 (CoreCLR 4.6.26310.01, CoreFX 4.6.26313.01), 64bit RyuJIT

Method Job Runtime Mean Error StdDev Scaled ScaledSD Gen 0 Allocated
Default Clr Clr 159.8 ns 3.171 ns 3.525 ns 1.00 0.00 0.0355 112 B
New Clr Clr 126.4 ns 2.110 ns 1.973 ns 0.79 0.02 0.0126 40 B
Default Core Core 172.6 ns 3.249 ns 2.880 ns 1.00 0.00 0.0355 112 B
New Core Core 119.0 ns 2.292 ns 2.032 ns 0.69 0.02 0.0126 40 B

Decode Data

BenchmarkDotNet=v0.10.13, OS=Windows 10 Redstone 3 [1709, Fall Creators Update] (10.0.16299.309)
Intel Core i7-7700HQ CPU 2.80GHz (Kaby Lake), 1 CPU, 8 logical cores and 4 physical cores
Frequency=2742189 Hz, Resolution=364.6722 ns, Timer=TSC
.NET Core SDK=2.1.300-preview3-008384
  [Host] : .NET Core 2.1.0-preview2-26313-01 (CoreCLR 4.6.26310.01, CoreFX 4.6.26313.01), 64bit RyuJIT
  Clr	 : .NET Framework 4.7.1 (CLR 4.0.30319.42000), 64bit RyuJIT-v4.7.2633.0
  Core	 : .NET Core 2.1.0-preview2-26313-01 (CoreCLR 4.6.26310.01, CoreFX 4.6.26313.01), 64bit RyuJIT

Method Job Runtime DataSize Mean Error StdDev Median Scaled ScaledSD Gen 0 Allocated
Default Clr Clr 10 119.7 ns 2.4100 ns 2.6787 ns 119.4 ns 1.00 0.00 0.0303 96 B
New Clr Clr 10 124.7 ns 2.5604 ns 3.4181 ns 124.7 ns 1.04 0.04 0.0126 40 B
Default Core Core 10 131.0 ns 2.7016 ns 5.6392 ns 129.9 ns 1.00 0.00 0.0303 96 B
New Core Core 10 106.6 ns 3.3803 ns 9.6988 ns 103.2 ns 0.82 0.08 0.0126 40 B
Default Clr Clr 50 398.0 ns 7.8665 ns 9.3645 ns 398.5 ns 1.00 0.00 0.0758 240 B
New Clr Clr 50 203.1 ns 1.3501 ns 1.0541 ns 202.9 ns 0.51 0.01 0.0253 80 B
Default Core Core 50 430.2 ns 3.4316 ns 3.2099 ns 431.0 ns 1.00 0.00 0.0758 240 B
New Core Core 50 132.8 ns 0.7019 ns 0.6222 ns 133.0 ns 0.31 0.00 0.0253 80 B
Default Clr Clr 100 744.1 ns 10.4349 ns 8.7136 ns 742.6 ns 1.00 0.00 0.1345 424 B
New Clr Clr 100 380.2 ns 7.5157 ns 12.5570 ns 380.2 ns 0.51 0.02 0.0911 288 B
Default Core Core 100 866.1 ns 15.9919 ns 22.4184 ns 864.3 ns 1.00 0.00 0.1345 424 B
New Core Core 100 277.8 ns 5.5897 ns 9.9358 ns 278.1 ns 0.32 0.01 0.0405 128 B

Encode Guid

BenchmarkDotNet=v0.10.13, OS=Windows 10 Redstone 3 [1709, Fall Creators Update] (10.0.16299.309)
Intel Core i7-7700HQ CPU 2.80GHz (Kaby Lake), 1 CPU, 8 logical cores and 4 physical cores
Frequency=2742189 Hz, Resolution=364.6722 ns, Timer=TSC
.NET Core SDK=2.1.300-preview3-008384
  [Host] : .NET Core 2.1.0-preview2-26313-01 (CoreCLR 4.6.26310.01, CoreFX 4.6.26313.01), 64bit RyuJIT
  Clr	 : .NET Framework 4.7.1 (CLR 4.0.30319.42000), 64bit RyuJIT-v4.7.2633.0
  Core	 : .NET Core 2.1.0-preview2-26313-01 (CoreCLR 4.6.26310.01, CoreFX 4.6.26313.01), 64bit RyuJIT

Method Job Runtime Mean Error StdDev Scaled ScaledSD Gen 0 Allocated
Default Clr Clr 143.72 ns 2.896 ns 3.448 ns 1.00 0.00 0.0455 144 B
New Clr Clr 123.75 ns 2.683 ns 5.945 ns 0.86 0.05 0.0455 144 B
Default Core Core 107.76 ns 2.183 ns 2.144 ns 1.00 0.00 0.0457 144 B
New Core Core 82.49 ns 1.721 ns 2.048 ns 0.77 0.02 0.0228 72 B

Encode Data

BenchmarkDotNet=v0.10.13, OS=Windows 10 Redstone 3 [1709, Fall Creators Update] (10.0.16299.309)
Intel Core i7-7700HQ CPU 2.80GHz (Kaby Lake), 1 CPU, 8 logical cores and 4 physical cores
Frequency=2742189 Hz, Resolution=364.6722 ns, Timer=TSC
.NET Core SDK=2.1.300-preview3-008384
  [Host] : .NET Core 2.1.0-preview2-26313-01 (CoreCLR 4.6.26310.01, CoreFX 4.6.26313.01), 64bit RyuJIT
  Clr	 : .NET Framework 4.7.1 (CLR 4.0.30319.42000), 64bit RyuJIT-v4.7.2633.0
  Core	 : .NET Core 2.1.0-preview2-26313-01 (CoreCLR 4.6.26310.01, CoreFX 4.6.26313.01), 64bit RyuJIT

Method Job Runtime DataSize Mean Error StdDev Scaled ScaledSD Gen 0 Allocated
Default Clr Clr 10 99.66 ns 2.085 ns 4.259 ns 1.00 0.00 0.0355 112 B
New Clr Clr 10 96.82 ns 1.934 ns 1.986 ns 0.97 0.04 0.0355 112 B
Default Core Core 10 94.31 ns 1.966 ns 5.178 ns 1.00 0.00 0.0355 112 B
New Core Core 10 85.06 ns 1.768 ns 2.479 ns 0.90 0.05 0.0178 56 B
Default Clr Clr 50 292.39 ns 1.660 ns 1.472 ns 1.00 0.00 0.1016 320 B
New Clr Clr 50 242.92 ns 4.917 ns 10.897 ns 0.83 0.04 0.1016 320 B
Default Core Core 50 240.90 ns 4.883 ns 4.796 ns 1.00 0.00 0.1016 320 B
New Core Core 50 151.45 ns 3.082 ns 5.479 ns 0.63 0.03 0.0508 160 B
Default Clr Clr 100 557.73 ns 10.927 ns 13.008 ns 1.00 0.00 0.1879 592 B
New Clr Clr 100 431.71 ns 7.723 ns 6.846 ns 0.77 0.02 0.1879 592 B
Default Core Core 100 467.78 ns 9.396 ns 19.614 ns 1.00 0.00 0.1879 592 B
New Core Core 100 326.64 ns 6.474 ns 5.739 ns 0.70 0.03 0.0939 296 B

Notes

Prospects

There could be a ~20% perf-win, when the base64-encoding / -decoding step would convert directly to / from url-encoded form by changing the base64-character map.
But by doing so we would "duplicate" the methods from Base64-class. This PR does rely on the provided methods -- maybe a follow-up PR will handle this.

Private classes

In a normal project I would separate the private classes to their own files, but here the WebEncoders.cs is a linked file, so I stuffed everything in this file.

@gfoidl
Copy link
Member Author

gfoidl commented Apr 4, 2018

Copy link
Member Author

@gfoidl gfoidl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Notes for review

@@ -22,6 +29,8 @@ namespace Microsoft.Extensions.Internal
#endif
static class WebEncoders
{
private const int MaxStackallocBytes = 256;
Copy link
Member Author

@gfoidl gfoidl Apr 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Below this limit local buffers will be stack allocated. Is this limit OK?

using System.Buffers.Text;

#if !NET461
using System.Numerics;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Vector<T> isn't part of net461, so I've #ifed it out. Is there a way to bring Vector<T> to net461 in this project (linked file) or shouldn't we don't care about this 😉

}

Span<char> base64 = stackalloc char[base64Len];
Convert.TryToBase64Chars(data, base64, out int written);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Convert.TryToBase64Chars is noticeable faster than Base64.EncodeToUtf8. Maybe because base64-encoding and byte -> char is done in one step.

Interestingly on the decoding-methods the Base64-methods are faster.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Last I checked (back in October), the span APIs were on par or faster: dotnet/corefx#24888

@atsushikan, recently did a port of the optimizations to the Convert methods: dotnet/coreclr#17033

There shouldn't be much difference on either now (in the general, non-allocating cases). Can you share details on what type of input you are measuring performance on and maybe a stripped down repro? Otherwise, I will take a deeper look in a couple of days. Thanks for bringing it up.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I had the wrong suspects. Sorry. When running a stripped down sequential repo perf is on par, as noticed by @ahsonkhan

In the vectorized version the difference shows up.

if (Vector.IsHardwareAccelerated && (int*)n >= (int*)Vector<T>.Count)

will make the difference.

For Base64.EncodeToUtf8 T will be byte, so on AVX2 Vector<T>.Count is 32.
For Convert.TryToBase64Chars T will be char (or to be precise ushort), hence Vector<T>.Count is 16.

For a Guid n is 24. So the Convert-version will urlencode vectorized, whereas the Base64-version will run sequential only.

}

private static int GetNumBase64PaddingCharsToAddForDecode(int inputLength)
// TODO: replace IntPtr and (int*) with nuint once available
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nuint isn't avaialbe, so it is a bit messy.

The other workaround we be defining an using alias, depending on an #if for the bitness. On this linked file I didn't know how to do this, so I used the IntPtr-approach.

The TODO comment is to remind me for the change once available.

[InlineData("a-b_c", "a+b/c==")]
[InlineData("a-b_c-d", "a+b/c+d=")]
[InlineData("abcd", "abcd")]
public void UrlDecode_ReturnsValid_Base64String(string text, string expectedValue)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I needed a safety net for the private class UrlEncoder, so these tests were added.
Is it OK to test implementation details or shall I remove these tests?

[InlineData("a+b/c== ", "a-b_c")]
[InlineData("a+b/c ", "a-b_c ")]
[InlineData("abcd", "abcd")]
public void UrlEncode_Replaces_UrlEncodableCharacters(string base64EncodedValue, string expectedValue)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above.

/// <summary>
/// Invalid input, that doesn't conform a base64 string.
/// </summary>
internal static readonly string WebEncoders_InvalidInput = "The input is not a valid Base-64 string as it contains a non-base 64 character, more than two padding characters, or an illegal character among the padding characters.";
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This class has a nice TODO

// TODO using a resx file. project.json, unfortunately, fails to embed resx files when there are also compile items
// in the contentFiles section. Revisit once we convert repos to MSBuild

Shall I fix this -- or in a different PR?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do a different PR if it will be a noisy change.

@gfoidl
Copy link
Member Author

gfoidl commented Apr 4, 2018

For aspnet/SignalR#1606 (comment) I can prepare a PR (based on https://github.com/gfoidl/SignalR/commit/afb48976a468351b66f033306fef5654b3401252)

? GetBufferSizeRequiredToUrlDecode(base64Url.Length, out int dataLength)
: base64Url.Length;

if (base64Len > MaxStackallocBytes / sizeof(byte))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't sizeof(byte) 1? Why do we need this division?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is redundant, and just to have it the same as on the char-part. The JIT will treat it as constant anyway.


if (buffer.Length - bufferOffset < arraySizeRequired)
if ((uint)buffer.Length < (uint)(bufferOffset + base64Len))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any concerns here about integer overflow? bufferOffset + base64Len could be > int.MaxValue

Copy link
Member Author

@gfoidl gfoidl Apr 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If so, the sum will be a negative value that is brought to very high values by the cast to uint, so the exception will be thrown. For me it seems OK. Am I wrong?

Change is motivated by replacing the subtraction with the addition.

Copy link
Member

@GrabYourPitchforks GrabYourPitchforks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some nits, other non-nit comments should be addressed.

[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static int GetArraySizeRequiredToEncode(int count)
{
if (count < 0 || count > MaxEncodedLength)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(nit) This can be turned into a single check: if ((uint)count > (uint)MaxEncodedLength) { FAIL; }

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point 👍 thx!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just noticed that for this method no tests exists. Will add some.


#if NETCOREAPP2_1
private static int Base64UrlEncodeCore(ReadOnlySpan<byte> data, Span<char> base64Url, int base64Len)
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method assumes that base64Len is exactly the number of characters required to represent the base64-encoded form of its input, not a single character more or less. It would be good to capture that invariant as a code comment and a Debug.Assert to prevent maintainers inadvertently misusing this method in the future.

(This comment applies to other methods in the same family as this one.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will be updated.

}

[MethodImpl(MethodImplOptions.AggressiveInlining)]
private static int GetNumBase64PaddingCharsToAddForDecode(int urlEncodedLen)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(nit) Potentially faster implementation (though untested):

int result = (4 - urlEncodedLen) & 3;
if (result == 3) { FAIL; }
return result;

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazing 😄

It's faster.

Method Mean Error StdDev Scaled ScaledSD
A 0.3861 ns 0.0086 ns 0.0076 ns 1.00 0.00
B 0.3436 ns 0.0114 ns 0.0107 ns 0.89 0.03


private static int GetBufferSizeRequiredToBase64Encode(int dataLength)
{
// overflow conditions are already eliminated, so 'checked' is not necessary
Copy link
Member

@GrabYourPitchforks GrabYourPitchforks Apr 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This invariant isn't always true. See for instance the callers Base64UrlEncode(ReadOnlySpan<byte>, Span<char>), Base64UrlEncode(byte[], int, char[], int, int), and Base64UrlEncode(ReadOnlySpan<byte>).

Additionally, checked builds should validate the invariant via Debug.Assert(dataLength >= 0 && dataLength <= MaxEncodedLength);.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're correct. I'll route these callers through the public GetArraySizeRequiredToEncode, so that the invariant will hold. Thx!

m = (IntPtr)((int)(int*)n & ~(Vector<T>.Count - 1));
for (; (int*)i < (int*)m; i += Vector<T>.Count)
{
var vec = Unsafe.As<T, Vector<T>>(ref Unsafe.Add(ref urlEncoded, i));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unsafe.ReadUnaligned<Vector<T>>(...)

(This same comment applies to UrlEncode<T>.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will update to Unsafe.ReadUnaligned<Vector<T>>(...), but can you please explain me why? I see the JIT produces exactely the same code. What am I missing here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It produces the same on x86 / x64 but may misbehave on other architectures with more strict alignment requirements.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, OK. Thanks for the info!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ReadUnaligned accepts void* or ref byte:

var vec = Unsafe.ReadUnaligned<Vector<T>>(ref Unsafe.As<T, byte>(ref Unsafe.Add(ref urlEncoded, i)));

So there is also an endianess-concern (#334 (comment)). Is there a better way to do this? (and I need a break 😉 )

Copy link
Member

@GrabYourPitchforks GrabYourPitchforks Apr 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the line Unsafe.ReadUnaligned<Vector<T>>(...) as you just wrote is endianness-safe. The reason is that the final pointer cast (before dereference) is of the expected type.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that makes sense. Essentially it is going from T -> Vector<T>.

So, for the same reason

Unsafe.WriteUnaligned(ref Unsafe.As<T, byte>(ref Unsafe.Add(ref urlEncoded, i)), vec);

is also endianess-safe. It is Vector<T> -> T.

Thank you for the infos!

where TIn : struct
where TOut : struct
{
var value = Unsafe.As<TIn, byte>(ref Unsafe.Add(ref urlEncoded, idx));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line assumes little-endian architecture. Better solution would be to get a local typed as TIn, then switch on TIn to forcibly convert it to byte.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will change.

This line assumes little-endian architecture.

And I assumed endianess will be handled by Unsafe.As. Miss-assumption?!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incorrect assumption. You're essentially performing reinterpret_cast<byte*> from a char* input, then dereferencing. The resulting value is endianness-dependent.

else if (typeof(TOut) == typeof(ushort))
{
Unsafe.Add(ref Unsafe.As<TOut, ushort>(ref base64), idx) = subst;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

else
{
    throw new NotSupportedException(); // just in case new types are introduced in the future
}

(This comment applies everywhere we special-case a generic type as byte or ushort.)

subst = (byte)'/';
}

// With 'Unsafe.Add(ref output, idx) = Unsafe.As<byte, TOut>(ref subst);'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(nit) We don't want the code suggested in this comment anyway, as you don't want to reinterpret the length of values on the stack. You could sweep up garbage data, and regardless it's not endianness-safe.

}
#endif
// n is always a multiple of 4
Debug.Assert((int)n == ((int)n & ~3));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(nit) Change this to Debug.Assert((int)n % 4 == 0) - it's clearer.

}

private static int GetNumBase64PaddingCharsToAddForDecode(int inputLength)
// TODO: replace IntPtr and (int*) with nuint once available
internal static unsafe class UrlEncoder
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename this type UrlCharacterSubstitution or something similar, and change the name of its methods. This type is not actually performing URL encoding, so the name is confusing.

@davidfowl
Copy link
Member

Thanks for reviewing @GrabYourPitchforks

public void UrlDecode_ReturnsValid_Base64String(string text, string expectedValue)
{
// To test the alternate code-path
#if NETCOREAPP2_1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this ifdefed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

netcoreapp2.1 runs this test via UrlEncoder.UrlDecode(byte, byte), the other two (netcoreapp2.0, net461) runs this test via UrlEncoder.UrlDecode(char, char).
So both entries to the UrlEncode-method get testes -- though depending on tfm.

Not good?

/// - NeedMoreData - only if isFinalBlock is false and the input is not a multiple of 4, otherwise the partial input would be considered as InvalidData
/// - InvalidData - if the input contains bytes outside of the expected base 64 range, or if it contains invalid/more than two padding characters,
/// or if the input is incomplete (i.e. not a multiple of 4) and isFinalBlock is true.</returns>
public static OperationStatus Base64UrlDecode(ReadOnlySpan<byte> base64Url, Span<byte> data, out int bytesConsumed, out int bytesWritten, bool isFinalBlock = true)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isFinalBlock? I haven't seen this is any APIs but the crypto ones, why is it here?

Copy link
Member

@GrabYourPitchforks GrabYourPitchforks Apr 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Base64 has always needed it since you need to know whether to emit the trailing = signs. Base64Url needs it because you need to know how to truncate the final 4-character block.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

using Microsoft.Extensions.WebEncoders.Sources;
using System.Buffers;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: sort usings

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done -- will be updated with the other feedback changes.

@GrabYourPitchforks
Copy link
Member

BTW, I should point out that the amount of unsafe code being introduced here makes me a bit nervous. It would be great if there were evidence that real-world applications - not just microbenchmarks - would benefit from this change.

@davidfowl
Copy link
Member

BTW, I should point out that the amount of unsafe code being introduced here makes me a bit nervous. It would be great if there were evidence that real-world applications - not just microbenchmarks - would benefit from this change.

We need to spanify all the things, what can we do to reduce the unsafe code. It's not just about throughput but allocations.

@gfoidl
Copy link
Member Author

gfoidl commented Apr 5, 2018

It's not just about throughput but allocations.

So shall I go with (easy) sequential code? Allocations are still minimized, throughput will increase especially on longer inputs.

It's quite a lot of "unsafeness" in there which I don't really like either. I wanted to squeeze everything out, because then going backward (i.e. removing some unsafe parts) seems easier to me than the other way round.

evidence that real-world applications

Thats nearly alway the point. I don't have any good context where this class is and will be used -- just assumed it as "low-level lib".
Do we have any more real world test-case for this?
If the unsafeness isn't it worth, then we shouldn't have it in the code and prefer easy to read and maintain alternatives.

@blowdart blowdart self-requested a review April 5, 2018 15:19
@blowdart
Copy link

blowdart commented Apr 5, 2018

Given the unsafe use and how this will accept untrusted input, this needs a lot of internal fuzz testing and other security reviews before merging.

@gfoidl
Copy link
Member Author

gfoidl commented Apr 7, 2018

So I believe this one will take longer 😉
https://github.com/dotnet/corefx/issues/28114#issuecomment-379410199 gives a clue that 2.1 will be near, and for aspnet/SignalR#1606 (comment) it would be nice to have at least

+public static string Base64UrlEncode(ReadOnlySpan<byte> data)

Shall I create a very trimmed down PR that just handles this method (so it can go for 2.1)?
This PR can be continued later.

@gfoidl
Copy link
Member Author

gfoidl commented Apr 16, 2018

#338 is a direct way of encoding, and -- at least for me -- a better solution than this PR. So I would close this PR in favor of #338.

@gfoidl
Copy link
Member Author

gfoidl commented Apr 27, 2018

Rebased due to conflict in Common.sln (Microsoft.AspNetCore.Analyzer.Testing.csproj added to dev, this PR adds Microsoft.Extensions.Internal.Benchmarks.csproj).

gfoidl added 11 commits June 18, 2018 19:53
* stack-allocs below a threshould
* uses ArrayPool<T> above the threshould (netcoreapp2.1) -- new T[] otherwise
* not all implemented
Only for netcoreapp2.1 -- rest will be done later.
Except
* some TODOs
* XML comments
@gfoidl
Copy link
Member Author

gfoidl commented Jun 18, 2018

Rebased due to

Conflicting files
Common.sln

@natemcmaster natemcmaster changed the base branch from dev to master July 2, 2018 17:44
@mkArtakMSFT
Copy link
Member

What is this pending for, @gfoidl, @davidfowl?

@gfoidl
Copy link
Member Author

gfoidl commented Oct 2, 2018

If we take #338, then we can close this one (would be my preference).

@davidfowl
Copy link
Member

OK lets close this one

@davidfowl davidfowl closed this Oct 3, 2018
@gfoidl gfoidl deleted the webencoders branch October 3, 2018 07:53
@ghost ghost locked as resolved and limited conversation to collaborators May 30, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants