Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added Span support to Webencoders #334

Closed
wants to merge 24 commits into from
Closed
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
Final polishing
  • Loading branch information
gfoidl committed Jun 18, 2018
commit 6c2ea6ba350ad719ff4d6493093a155f74cae981
130 changes: 35 additions & 95 deletions shared/Microsoft.Extensions.WebEncoders.Sources/WebEncoders.cs
Original file line number Diff line number Diff line change
Expand Up @@ -76,7 +76,7 @@ public static byte[] Base64UrlDecode(string input, int offset, int count)
}

/// <summary>
/// Decodes a base64url-encoded <paramref name="base64Url"/>.
/// Decodes a base64url-encoded span of chars.
/// </summary>
/// <param name="base64Url">The base64url-encoded input to decode.</param>
/// <returns>The base64url-decoded form of the input.</returns>
Expand All @@ -101,10 +101,10 @@ public static byte[] Base64UrlDecode(ReadOnlySpan<char> base64Url)
}

/// <summary>
/// Decodes a base64url-encoded <paramref name="base64Url"/> into a <paramref name="data"/>.
/// Decodes a base64url-encoded span of chars into a span of bytes.
/// </summary>
/// <param name="base64Url">A span containing the base64url-encoded input to decode.</param>
/// <param name="data">The base64url-decoded form of the <paramref name="base64Url"/>.</param>
/// <param name="data">The base64url-decoded form of <paramref name="base64Url"/>.</param>
/// <returns>The number of the bytes written to <paramref name="data"/>.</returns>
/// <remarks>
/// The input must not contain any whitespace or padding characters.
Expand Down Expand Up @@ -132,14 +132,14 @@ public static int Base64UrlDecode(ReadOnlySpan<char> base64Url, Span<byte> data)
/// <param name="data">The output span which contains the result of the operation, i.e. the decoded binary data.</param>
/// <param name="bytesConsumed">The number of input bytes consumed during the operation. This can be used to slice the input for subsequent calls, if necessary.</param>
/// <param name="bytesWritten">The number of bytes written into the output span. This can be used to slice the output for subsequent calls, if necessary.</param>
/// <param name="isFinalBlock">True (default) when the input span contains the entire data to decode.
/// <param name="isFinalBlock">True (default) when the input span contains the entire data to decode.
/// Set to false only if it is known that the input span contains partial data with more data to follow.</param>
/// <returns>It returns the OperationStatus enum values:
/// - Done - on successful processing of the entire input span
/// - DestinationTooSmall - if there is not enough space in the output span to fit the decoded input
/// - NeedMoreData - only if isFinalBlock is false and the input is not a multiple of 4, otherwise the partial input would be considered as InvalidData
/// - InvalidData - if the input contains bytes outside of the expected base 64 range, or if it contains invalid/more than two padding characters,
/// or if the input is incomplete (i.e. not a multiple of 4) and isFinalBlock is true.</returns>
/// or if the input is incomplete (i.e. not a multiple of 4) and isFinalBlock is true.</returns>
public static OperationStatus Base64UrlDecode(ReadOnlySpan<byte> base64Url, Span<byte> data, out int bytesConsumed, out int bytesWritten, bool isFinalBlock = true)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isFinalBlock? I haven't seen this is any APIs but the crypto ones, why is it here?

Copy link
Member

@GrabYourPitchforks GrabYourPitchforks Apr 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Base64 has always needed it since you need to know whether to emit the trailing = signs. Base64Url needs it because you need to know how to truncate the final 4-character block.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

{
// Special-case empty input
Expand All @@ -150,7 +150,6 @@ public static OperationStatus Base64UrlDecode(ReadOnlySpan<byte> base64Url, Span
return OperationStatus.Done;
}

// Create array large enough for the Base64 characters, not just shorter Base64-URL-encoded form.
var base64Len = isFinalBlock
? GetBufferSizeRequiredToUrlDecode(base64Url.Length, out int dataLength)
: base64Url.Length;
Expand Down Expand Up @@ -200,8 +199,6 @@ public static byte[] Base64UrlDecode(string input, int offset, char[] buffer, in
return EmptyBytes;
}

// Assumption: input is base64url encoded without padding and contains no whitespace.

var base64Len = GetBufferSizeRequiredToUrlDecode(count, out int dataLength);

if ((uint)buffer.Length < (uint)(bufferOffset + base64Len))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any concerns here about integer overflow? bufferOffset + base64Len could be > int.MaxValue

Copy link
Member Author

@gfoidl gfoidl Apr 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If so, the sum will be a negative value that is brought to very high values by the cast to uint, so the exception will be thrown. For me it seems OK. Am I wrong?

Change is motivated by replacing the subtraction with the addition.

Expand All @@ -228,48 +225,14 @@ public static byte[] Base64UrlDecode(string input, int offset, char[] buffer, in
/// </summary>
/// <param name="input">The binary input to encode.</param>
/// <returns>The base64url-encoded form of <paramref name="input"/>.</returns>
public static unsafe string Base64UrlEncode(byte[] input)
public static string Base64UrlEncode(byte[] input)
{
if (input == null)
{
ThrowHelper.ThrowArgumentNullException(ExceptionArgument.input);
}

#if NETCOREAPP2_1
return Base64UrlEncode(input.AsSpan());
#else
var base64Len = GetBufferSizeRequiredToBase64Encode(input.Length, out int numPaddingChars);
#if !NET461
char[] arrayToReturnToPool = null;
try
{
#endif
var base64Chars =
#if NET461
new char[base64Len];
#else
// For small buffers allocate, otherwise rent from ArrayPool
base64Len <= 1_000 / sizeof(char)
? new char[base64Len]
: arrayToReturnToPool = ArrayPool<char>.Shared.Rent(base64Len);
#endif
var urlEncodedLen = Base64UrlEncodeCore(input, 0, input.Length, base64Chars, 0);

fixed (char* ptr = base64Chars)
{
return new string(ptr, 0, urlEncodedLen);
}
#if !NET461
}
finally
{
if (arrayToReturnToPool != null)
{
ArrayPool<char>.Shared.Return(arrayToReturnToPool);
}
}
#endif
#endif
}

/// <summary>
Expand All @@ -288,21 +251,7 @@ public static string Base64UrlEncode(byte[] input, int offset, int count)
ThrowInvalidArguments(input, offset, count);
}

#if NETCOREAPP2_1
return Base64UrlEncode(input.AsSpan(offset, count));
#else
// Special-case empty input
if (count == 0)
{
return string.Empty;
}

var arraySizeRequired = GetBufferSizeRequiredToBase64Encode(count);
var buffer = new char[arraySizeRequired];
var numBase64Chars = Base64UrlEncodeCore(input, offset, count, buffer, 0);

return new string(buffer, 0, numBase64Chars);
#endif
}

/// <summary>
Expand Down Expand Up @@ -335,15 +284,16 @@ public static unsafe string Base64UrlEncode(ReadOnlySpan<byte> data)
try
{
#endif
var bufferLen = base64Len - numPaddingChars;
var base64Url = bufferLen <= MaxStackallocBytes / sizeof(char)
? stackalloc char[bufferLen]
var base64UrlLen = base64Len - numPaddingChars;
var base64Url = base64UrlLen <= MaxStackallocBytes / sizeof(char)
? stackalloc char[base64UrlLen]
#if NET461
: new char[bufferLen];
: new char[base64UrlLen];
#else
: arrayToReturnToPool = ArrayPool<char>.Shared.Rent(bufferLen);
: arrayToReturnToPool = ArrayPool<char>.Shared.Rent(base64UrlLen);
#endif
var urlEncodedLen = Base64UrlEncodeCore(data, base64Url, base64Len);
Debug.Assert(base64UrlLen == urlEncodedLen);

fixed (char* ptr = &MemoryMarshal.GetReference(base64Url))
{
Expand Down Expand Up @@ -378,8 +328,8 @@ public static int Base64UrlEncode(ReadOnlySpan<byte> data, Span<char> base64Url)
return 0;
}

var bufferSizeRequired = GetBufferSizeRequiredToBase64Encode(data.Length);
return Base64UrlEncodeCore(data, base64Url, bufferSizeRequired);
var base64Len = GetBufferSizeRequiredToBase64Encode(data.Length);
return Base64UrlEncodeCore(data, base64Url, base64Len);
}

/// <summary>
Expand All @@ -392,7 +342,7 @@ public static int Base64UrlEncode(ReadOnlySpan<byte> data, Span<char> base64Url)
/// </param>
/// <param name="bytesConsumed">The number of input bytes consumed during the operation. This can be used to slice the input for subsequent calls, if necessary.</param>
/// <param name="bytesWritten">The number of bytes written into the output span. This can be used to slice the output for subsequent calls, if necessary.</param>
/// <param name="isFinalBlock">True (default) when the input span contains the entire data to decode.
/// <param name="isFinalBlock">True (default) when the input span contains the entire data to decode.
/// Set to false only if it is known that the input span contains partial data with more data to follow.</param>
/// <returns>It returns the OperationStatus enum values:
/// - Done - on successful processing of the entire input span
Expand All @@ -413,8 +363,7 @@ public static OperationStatus Base64UrlEncode(ReadOnlySpan<byte> data, Span<byte

if (status == OperationStatus.Done || status == OperationStatus.NeedMoreData)
{
var span = base64Url.Slice(0, bytesWritten);
bytesWritten = EncodingHelper.UrlEncode(span);
bytesWritten = EncodingHelper.UrlEncode(base64Url, bytesWritten);
}

return status;
Expand Down Expand Up @@ -465,13 +414,16 @@ public static int Base64UrlEncode(byte[] input, int offset, char[] output, int o
}

/// <summary>
/// Gets the minimum <c>char[]</c> size required for decoding of <paramref name="count"/> characters
/// with the <see cref="Base64UrlDecode(string, int, char[], int, int)"/> method.
/// Gets the minimum buffer size required for decoding of <paramref name="count"/> characters.
/// </summary>
/// <param name="count">The number of characters to decode.</param>
/// <returns>
/// The minimum <c>char[]</c> size required for decoding of <paramref name="count"/> characters.
/// The minimum buffer size required for decoding of <paramref name="count"/> characters.
/// </returns>
/// <remarks>
/// The returned buffer size is large enough to hold <paramref name="count"/> characters as well
/// as base64 padding characters.
/// </remarks>
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static int GetArraySizeRequiredToDecode(int count)
{
Expand All @@ -484,13 +436,16 @@ public static int GetArraySizeRequiredToDecode(int count)
}

/// <summary>
/// Get the minimum output <c>char[]</c> size required for encoding <paramref name="count"/>
/// <see cref="byte"/>s with the <see cref="Base64UrlEncode(byte[], int, char[], int, int)"/> method.
/// Gets the minimum output buffer size required for encoding <paramref name="count"/> bytes.
/// </summary>
/// <param name="count">The number of characters to encode.</param>
/// <returns>
/// The minimum output <c>char[]</c> size required for encoding <paramref name="count"/> <see cref="byte"/>s.
/// The minimum output buffer size required for encoding <paramref name="count"/> <see cref="byte"/>s.
/// </returns>
/// <remarks>
/// The returned buffer size is large enough to hold <paramref name="count"/> bytes as well
/// as base64 padding characters.
/// </remarks>
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static int GetArraySizeRequiredToEncode(int count)
{
Expand Down Expand Up @@ -590,18 +545,6 @@ private static OperationStatus Base64UrlDecodeCore(ReadOnlySpan<byte> base64Url,
return status;
}

#if !NETCOREAPP2_1
[MethodImpl(MethodImplOptions.AggressiveInlining)]
private static int Base64UrlEncodeCore(byte[] input, int offset, int count, char[] buffer, int bufferOffset)
{
// Start with default Base64 encoding.
var numBase64Chars = Convert.ToBase64CharArray(input, offset, count, buffer, bufferOffset);
var span = buffer.AsSpan(bufferOffset, numBase64Chars);
numBase64Chars = EncodingHelper.UrlEncode(span, span);
return numBase64Chars;
}
#endif

#if NETCOREAPP2_1
private static int Base64UrlEncodeCore(ReadOnlySpan<byte> data, Span<char> base64Url, int base64Len)
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method assumes that base64Len is exactly the number of characters required to represent the base64-encoded form of its input, not a single character more or less. It would be good to capture that invariant as a code comment and a Debug.Assert to prevent maintainers inadvertently misusing this method in the future.

(This comment applies to other methods in the same family as this one.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will be updated.

Expand Down Expand Up @@ -781,6 +724,7 @@ Exception GetInvalidArgumentsException()
}

// internal to make this testable
// TODO: replace IntPtr and (int*) with nuint once available
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nuint isn't avaialbe, so it is a bit messy.

The other workaround we be defining an using alias, depending on an #if for the bitness. On this linked file I didn't know how to do this, so I used the IntPtr-approach.

The TODO comment is to remind me for the change once available.

internal static unsafe class EncodingHelper
{
[MethodImpl(MethodImplOptions.AggressiveInlining)]
Expand All @@ -806,8 +750,7 @@ private static void UrlDecode<T>(ref T urlEncoded, ref T base64, int urlEncodedL
{
// Copy input into base64, fixing up '-' -> '+' and '_' -> '/' and add padding.

// TODO: replace with nuint once available
var i = (IntPtr)0;
var i = (IntPtr)0; // Use IntPtr for arithmetic to avoid unnecessary 64->32->64 truncations
var n = (IntPtr)urlEncodedLen;
var m = i;
#if !NET461
Expand Down Expand Up @@ -873,8 +816,7 @@ public static void UrlDecode(ReadOnlySpan<char> urlEncoded, Span<byte> base64)
ref var input = ref MemoryMarshal.GetReference(urlEncoded);
ref var output = ref MemoryMarshal.GetReference(base64);

// TODO: replace with nuint once available
var i = (IntPtr)0;
var i = (IntPtr)0; // Use IntPtr for arithmetic to avoid unnecessary 64->32->64 truncations
var n = (IntPtr)urlEncoded.Length;
var m = i;
#if !NET461
Expand Down Expand Up @@ -925,11 +867,11 @@ public static void UrlDecode(ReadOnlySpan<char> urlEncoded, Span<byte> base64)
}

[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static int UrlEncode(Span<byte> base64)
public static int UrlEncode(Span<byte> base64, int count)
{
ref var r = ref MemoryMarshal.GetReference(base64);

return UrlEncode(ref r, ref r, base64.Length);
return UrlEncode(ref r, ref r, count);
}

[MethodImpl(MethodImplOptions.AggressiveInlining)]
Expand All @@ -946,7 +888,6 @@ private static int UrlEncode<T>(ref T base64, ref T urlEncoded, int base64Length
{
// Use base64url encoding with no padding characters. See RFC 4648, Sec. 5.

// TODO: replace with nuint once available
var i = (IntPtr)0; // Use IntPtr for arithmetic to avoid unnecessary 64->32->64 truncations
var n = (IntPtr)base64Length;
var m = i;
Expand Down Expand Up @@ -1010,7 +951,6 @@ public static int UrlEncode(ReadOnlySpan<byte> base64, Span<char> urlEncoded)
#if !NET461
if (Vector.IsHardwareAccelerated && (int*)n >= (int*)Vector<byte>.Count)
{
// TODO: replace with nuint once available
m = (IntPtr)((int)(int*)n & ~(Vector<byte>.Count - 1));
for (; (int*)i < (int*)m; i += Vector<byte>.Count)
{
Expand Down Expand Up @@ -1074,7 +1014,7 @@ private static void UrlDecode<TIn, TOut>(ref TIn urlEncoded, ref TOut base64, In

// With 'Unsafe.Add(ref output, idx) = Unsafe.As<byte, TOut>(ref subst);'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(nit) We don't want the code suggested in this comment anyway, as you don't want to reinterpret the length of values on the stack. You could sweep up garbage data, and regardless it's not endianness-safe.

// the JIT will push subst to the stack, instead of using registers.
// Therefore this 'if' is used. The JIT will eliminate at compile-time
// Therefore this 'if' is used. The JIT will eliminate at compile-time
// the not taken branch.
if (typeof(TOut) == typeof(byte))
{
Expand Down Expand Up @@ -1122,7 +1062,7 @@ private static bool UrlEncode<TIn, TOut>(ref TIn base64, ref TOut urlEncoded, In

// With 'Unsafe.Add(ref output, idx) = Unsafe.As<byte, TOut>(ref subst);'
// the JIT will push subst to the stack, instead of using registers.
// Therefore this 'if' is used. The JIT will eliminate at compile-time
// Therefore this 'if' is used. The JIT will eliminate at compile-time
// the not taken branch.
if (typeof(TOut) == typeof(byte))
{
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -270,7 +270,7 @@ public void UrlEncode_Replaces_UrlEncodableCharacters(string base64EncodedValue,
Encoding.ASCII.GetBytes(expectedValue.AsSpan(), expected);

// Act
var result = WebEncoders.EncodingHelper.UrlEncode(bytes);
var result = WebEncoders.EncodingHelper.UrlEncode(bytes, bytes.Length);

// Assert
Assert.True(expected.SequenceEqual(bytes.Slice(0, result)));
Expand Down