Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.
/ corefx Public archive

Add Span<T> Base64 conversion APIs that support UTF-8 #24888

Merged
merged 7 commits into from
Oct 27, 2017

Conversation

ahsonkhan
Copy link
Member

@ahsonkhan ahsonkhan commented Oct 26, 2017

Fixes https://github.com/dotnet/corefx/issues/24568 (part of https://github.com/dotnet/corefx/issues/24174).

namespace System.Buffers.Text {
    public static class Base64 {
        public static OperationStatus DecodeFromUtf8(ReadOnlySpan<byte> utf8, Span<byte> bytes, out int consumed, out int written, bool isFinalBlock=true);
        public static OperationStatus DecodeFromUtf8InPlace(Span<byte> buffer, out int written);
        public static OperationStatus EncodeToUtf8(ReadOnlySpan<byte> bytes, Span<byte> utf8, out int consumed, out int written, bool isFinalBlock=true);
        public static OperationStatus EncodeToUtf8InPlace(Span<byte> buffer, int dataLength, out int written);
        public static int GetMaxDecodedFromUtf8Length(int length);
        public static int GetMaxEncodedToUtf8Length(int length);
    }
}

namespace System.Buffers {
    public enum OperationStatus {
        DestinationTooSmall = 1,
        Done = 0,
        InvalidData = 3,
        NeedMoreData = 2,
    }
}

Note:
OperationStatus (previously TransformationStatus) can be updated base on API review and doesn't need to block this PR (https://github.com/dotnet/corefx/issues/22412).

Code Coverage:
100% line and branch coverage
image

Performance Test Results (duration):

Test Name AVERAGE
Base64Decode(numberOfBytes: 10) 0.019
Base64Decode(numberOfBytes: 100) 0.110
Base64Decode(numberOfBytes: 1000) 0.950
Base64Decode(numberOfBytes: 1000000) 954.361
Base64DecodeBaseline(numberOfBytes: 10) 0.072
Base64DecodeBaseline(numberOfBytes: 100) 0.420
Base64DecodeBaseline(numberOfBytes: 1000) 3.819
Base64DecodeBaseline(numberOfBytes: 1000000) 8529.559
Base64DecodeDetinationTooSmall(numberOfBytes: 10) 0.018
Base64DecodeDetinationTooSmall(numberOfBytes: 100) 0.115
Base64DecodeDetinationTooSmall(numberOfBytes: 1000) 0.948
Base64DecodeDetinationTooSmall(numberOfBytes: 1000000) 954.096
Base64DecodeInPlace(numberOfBytes: 10) 0.020
Base64DecodeInPlace(numberOfBytes: 100) 0.113
Base64DecodeInPlace(numberOfBytes: 1000) 0.981
Base64DecodeInPlace(numberOfBytes: 1000000) 1023.155
Base64DecodeInPlaceOnce(numberOfBytes: 1000000) 0.950
Base64Encode(numberOfBytes: 10) 0.020
Base64Encode(numberOfBytes: 100) 0.129
Base64Encode(numberOfBytes: 1000) 1.158
Base64Encode(numberOfBytes: 1000000) 1170.335
Base64EncodeBaseline(numberOfBytes: 10) 0.029
Base64EncodeBaseline(numberOfBytes: 100) 0.145
Base64EncodeBaseline(numberOfBytes: 1000) 1.351
Base64EncodeBaseline(numberOfBytes: 1000000) 1303.068
Base64EncodeDestinationTooSmall(numberOfBytes: 10) 0.018
Base64EncodeDestinationTooSmall(numberOfBytes: 100) 0.128
Base64EncodeDestinationTooSmall(numberOfBytes: 1000) 1.162
Base64EncodeDestinationTooSmall(numberOfBytes: 1000000) 1163.685
Base64EncodeInPlace(numberOfBytes: 10) 0.023
Base64EncodeInPlace(numberOfBytes: 100) 0.133
Base64EncodeInPlace(numberOfBytes: 1000) 1.171
Base64EncodeInPlace(numberOfBytes: 1000000) 1247.834
Base64EncodeInPlaceOnce(numberOfBytes: 1000000) 1.176
Out-dated performance results 1
Test Name AVERAGE
Base64Decode(numberOfBytes: 10) 0.018
Base64Decode(numberOfBytes: 100) 0.113
Base64Decode(numberOfBytes: 1000) 1.005
Base64Decode(numberOfBytes: 1000000) 1011.903
Base64DecodeBaseline(numberOfBytes: 10) 0.083
Base64DecodeBaseline(numberOfBytes: 100) 0.452
Base64DecodeBaseline(numberOfBytes: 1000) 4.077
Base64DecodeBaseline(numberOfBytes: 1000000) 8977.842
Base64DecodeInPlace(numberOfBytes: 10) 0.022
Base64DecodeInPlace(numberOfBytes: 100) 0.114
Base64DecodeInPlace(numberOfBytes: 1000) 0.983
Base64DecodeInPlace(numberOfBytes: 1000000) 1055.159
Base64DecodeInPlaceOnce(numberOfBytes: 1000000) 0.985
Base64Encode(numberOfBytes: 10) 0.024
Base64Encode(numberOfBytes: 100) 0.171
Base64Encode(numberOfBytes: 1000) 1.631
Base64Encode(numberOfBytes: 1000000) 1659.814
Base64EncodeBaseline(numberOfBytes: 10) 0.034
Base64EncodeBaseline(numberOfBytes: 100) 0.149
Base64EncodeBaseline(numberOfBytes: 1000) 1.342
Base64EncodeBaseline(numberOfBytes: 1000000) 1340.315
Base64EncodeInPlace(numberOfBytes: 10) 0.068
Base64EncodeInPlace(numberOfBytes: 100) 0.406
Base64EncodeInPlace(numberOfBytes: 1000) 3.782
Base64EncodeInPlace(numberOfBytes: 1000000) 3983.673
Base64EncodeInPlaceOnce(numberOfBytes: 1000000) 3.854
Out-dated performance results 2
Test Name AVERAGE
Base64Decode(numberOfBytes: 10) 0.019
Base64Decode(numberOfBytes: 100) 0.118
Base64Decode(numberOfBytes: 1000) 1.001
Base64Decode(numberOfBytes: 1000000) 1000.071
Base64DecodeBaseline(numberOfBytes: 10) 0.075
Base64DecodeBaseline(numberOfBytes: 100) 0.428
Base64DecodeBaseline(numberOfBytes: 1000) 3.872
Base64DecodeBaseline(numberOfBytes: 1000000) 8725.017
Base64DecodeInPlace(numberOfBytes: 10) 0.022
Base64DecodeInPlace(numberOfBytes: 100) 0.117
Base64DecodeInPlace(numberOfBytes: 1000) 0.986
Base64DecodeInPlace(numberOfBytes: 1000000) 1068.851
Base64DecodeInPlaceOnce(numberOfBytes: 1000000) 0.993
Base64Encode(numberOfBytes: 10) 0.023
Base64Encode(numberOfBytes: 100) 0.159
Base64Encode(numberOfBytes: 1000) 1.437
Base64Encode(numberOfBytes: 1000000) 1473.720
Base64EncodeBaseline(numberOfBytes: 10) 0.036
Base64EncodeBaseline(numberOfBytes: 100) 0.149
Base64EncodeBaseline(numberOfBytes: 1000) 1.602
Base64EncodeBaseline(numberOfBytes: 1000000) 1357.203
Base64EncodeInPlace(numberOfBytes: 10) 0.027
Base64EncodeInPlace(numberOfBytes: 100) 0.157
Base64EncodeInPlace(numberOfBytes: 1000) 1.399
Base64EncodeInPlace(numberOfBytes: 1000000) 1466.058
Base64EncodeInPlaceOnce(numberOfBytes: 1000000) 1.402
Out-dated performance results 3
Test Name AVERAGE
Base64Decode(numberOfBytes: 10) 0.017
Base64Decode(numberOfBytes: 100) 0.110
Base64Decode(numberOfBytes: 1000) 0.984
Base64Decode(numberOfBytes: 1000000) 1007.293
Base64DecodeBaseline(numberOfBytes: 10) 0.069
Base64DecodeBaseline(numberOfBytes: 100) 0.410
Base64DecodeBaseline(numberOfBytes: 1000) 3.941
Base64DecodeBaseline(numberOfBytes: 1000000) 8593.280
Base64DecodeInPlace(numberOfBytes: 10) 0.021
Base64DecodeInPlace(numberOfBytes: 100) 0.116
Base64DecodeInPlace(numberOfBytes: 1000) 1.008
Base64DecodeInPlace(numberOfBytes: 1000000) 1045.012
Base64DecodeInPlaceOnce(numberOfBytes: 1000000) 0.955
Base64Encode(numberOfBytes: 10) 0.020
Base64Encode(numberOfBytes: 100) 0.132
Base64Encode(numberOfBytes: 1000) 1.585
Base64Encode(numberOfBytes: 1000000) 1169.956
Base64EncodeBaseline(numberOfBytes: 10) 0.029
Base64EncodeBaseline(numberOfBytes: 100) 0.153
Base64EncodeBaseline(numberOfBytes: 1000) 1.293
Base64EncodeBaseline(numberOfBytes: 1000000) 1316.709
Base64EncodeInPlace(numberOfBytes: 10) 0.023
Base64EncodeInPlace(numberOfBytes: 100) 0.158
Base64EncodeInPlace(numberOfBytes: 1000) 1.207
Base64EncodeInPlace(numberOfBytes: 1000000) 1252.556
Base64EncodeInPlaceOnce(numberOfBytes: 1000000) 1.166

Encode is about 10% faster 20% slower 10% slower (for input larger than 100 bytes). Decode is at least 4x faster (even when compared to the new non-allocating variant TryFromBase64Chars). This is probably because the Convert.FromBase64CharArray allocates (approximately n bytes, where n is input size + around 24 byte overhead).

Edit 0: Made encode a bit faster for the common case.
Edit 1: Compared against TryFromBase64Chars, there is still a large perf difference.
Edit 2: Added new performance numbers based on additional optimizations suggested by @jkotas.
Edit 3: Added performance tests for when destination buffer is too small.

cc @KrzysztofCwalina, @stephentoub, @GrabYourPitchforks, @jkotas, @dotnet/corefxlab-contrib

@jkotas
Copy link
Member

jkotas commented Oct 26, 2017

Convert.FromBase64CharArray allocates

It would be more fair to use the non-allocating variant.

@ahsonkhan
Copy link
Member Author

ahsonkhan commented Oct 26, 2017

It would be more fair to use the non-allocating variant.

I couldn't find any non-allocating public API for decoding (https://msdn.microsoft.com/en-us/library/system.convert_methods(v=vs.110).aspx). Both of these return a byte[]:

  • FromBase64CharArray
  • FromBase64String

@jkotas
Copy link
Member

jkotas commented Oct 26, 2017

The current .NET Core has TryFromBase64Chars

@ahsonkhan
Copy link
Member Author

The current .NET Core has TryFromBase64Chars

I updated the performance test. Although there wasn't much difference in the results.

@stephentoub
Copy link
Member

Decode is at least 4x faster (even when compared to the new non-allocating variant TryFromBase64Chars

Then can you also fix the baseline? We shouldn't be happy about such a difference. We should be happy that we've found ways to improve the existing one and then do it. I can believe there could be a small difference due to fundamental things like outputting 1 byte vs 2 bytes per encoded unit, but it shouldn't be anywhere near that big.

Success isn't making the new thing faster. Success is making the new thing fast and then making the existing one as fast or as close to as fast as is possible.

int destIndex = 0;
int result = 0;

if (destLength >= GetMaxEncodedToUtf8Length(srcLength))
Copy link
Member

@jkotas jkotas Oct 26, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will throw when the source is too big. Should it rather return OperationStatus.DestinationTooSmall ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered that. The issue with DestinationTooSmall is the user will then expect to enlarge the output buffer and try again, and for any buffer size up to int.MaxValue, they won't succeed for the current input length.

However, if they could continue to slice the input as it is partially consumed, then they would eventually succeed (or they could get unmanaged memory using AllocHGlobal). So, I will update the behavior as per your suggestion.

var encodedLength = GetMaxEncodedToUtf8Length(dataLength);
if (buffer.Length < encodedLength) goto FalseExit;

var leftover = dataLength - dataLength / 3 * 3; // how many bytes after packs of 3
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: It is not ok to use var here per corefx coding conventions.


[MethodImpl(MethodImplOptions.AggressiveInlining)]
private static int Encode(ref byte threeBytes)
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should be able to use same set of tricks here as you are using for Decode: cache the map in a local upfront, avoid bound checks for it and save some binary ops by operating on the whole byte. Something like:

private static int Encode(ref byte threeBytes, ref byte encodingMap)
{
    int i = (threeBytes << 16) | (Unsafe.Add(ref threeBytes, 1) << 8) | Unsafe.Add(ref threeBytes, 2);

    int i0 = Unsafe.Add(ref encodingMap, i >> 18)
    int i1 = Unsafe.Add(ref encodingMap, (i >> 12) & 0x3F);
    int i2 = Unsafe.Add(ref encodingMap, (i >> 6) & 0x3F);
    int i3 = Unsafe.Add(ref encodingMap, i & 0x3F);

    return i0 | (i1 << 8) | (i2 << 16) | (i3 << 24);
}

52, 53, 54, 55, 56, 57, 43, 47 //4..9, +, /
};

const byte s_encodingPad = (byte)'='; // '=', for padding
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: s_ is not correct prefix for consts per corefx coding conventions

[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static int GetMaxDecodedFromUtf8Length(int length)
{
Debug.Assert(length >= 0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we like debug assert here? Should we return 0 or throw?

[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static int GetMaxEncodedToUtf8Length(int length)
{
Debug.Assert(length >= 0);
Copy link
Member

@KrzysztofCwalina KrzysztofCwalina Oct 26, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same, i.e. I think we should throw for negative lengths. If you do decide to throw, throw from a helper to allow inlining.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. Public members need to either accept their inputs (no assert, no check) or properly validate (check and throw). Asserting on inputs is only valid for non-public members, to guard that the public member already did the validation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we care what exception type gets thrown? Personally I don't care if it's ArgumentException or OverflowException, or even if it's potentially both (negative vs. too large). It's not like the developer can catch and handle it anyway.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ArgumentOutOfRangeException seems to match the problem space to me. (Well, really, argument out of domain (since it's an input); but that's just my math degree talking)

@ahsonkhan
Copy link
Member Author

Then can you also fix the baseline? We shouldn't be happy about such a difference.

I agree. Filed an issue to look into this. Ty.

@@ -106,12 +106,12 @@ public static T ReadMachineEndian<T>(ReadOnlySpan<byte> buffer)
#else
if (RuntimeHelpers.IsReferenceOrContainsReferences<T>())
{
throw new ArgumentException(SR.Format(SR.Argument_InvalidTypeWithPointersNotSupported, typeof(T)));
ThrowHelper.ThrowArgumentException_InvalidTypeWithPointersNotSupported(typeof(T));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the reason for this change?

Copy link
Member Author

@ahsonkhan ahsonkhan Oct 26, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From @KrzysztofCwalina

If you do decide to throw, throw from a helper to allow inlining.

Previously, ThrowHelper.cs was not included for netcoreapp, but I updated the project file to always include ThrowHelper.cs. It is the same exception and message.

Copy link
Member

@jkotas jkotas Oct 26, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes in this file should have zero effect on the inlining or the native code that will be generated. I do not see a problem with, but I do not think they help or hurt anything.


ref byte encodingMap = ref s_encodingMap[0];

if (srcLength <= MaximumEncodeLength && destLength >= GetMaxEncodedToUtf8Length(srcLength))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to cap the srcLength to whatever fits instead of having two copies of the encoding loop? It would avoid the penalty for the case where the destination buffer does not fit, and make the code quite a bit smaller (note that you get pretty non-trivial code expansion here because of the use of the AggressiveInlining attributes).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same in the decoder.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great idea. I was tinkering with that but was previously struggling to get the "cap" set correctly. But I believe I have figured it out now!

}

// Pre-computing this table using a custom string(s_characters) and GenerateDecodingMapAndVerify (found in tests)
static readonly sbyte[] s_decodingMap = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

private. corefx coding style is to be explicit with visibility.

}

// Pre-computing this table using a custom string(s_characters) and GenerateEncodingMapAndVerify (found in tests)
static readonly byte[] s_encodingMap = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

private

@ahsonkhan
Copy link
Member Author

@dotnet-bot test OSX x64 Debug Build

@ahsonkhan ahsonkhan merged commit b7b3439 into dotnet:master Oct 27, 2017
@ahsonkhan ahsonkhan deleted the AddBase64APIs branch October 27, 2017 04:56
@karelz karelz added this to the 2.1.0 milestone Oct 28, 2017
pjanotti pushed a commit to pjanotti/corefx that referenced this pull request Oct 31, 2017
* Add Span<T> Base64 conversion APIs that support UTF-8.

* Optimize the encoding loop when there is plenty of available space

* Optimize EncodeInPlace and update DecodeBaseline perf test.

* Addressing PR feedback, encode optimization, throw for negative lengths

* Reenable commented out perf tests.

* Cap the amount of data to process based on how much that will fit.

* Being explicit with access modifiers to follow guidelines.
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
…24888)

* Add Span<T> Base64 conversion APIs that support UTF-8.

* Optimize the encoding loop when there is plenty of available space

* Optimize EncodeInPlace and update DecodeBaseline perf test.

* Addressing PR feedback, encode optimization, throw for negative lengths

* Reenable commented out perf tests.

* Cap the amount of data to process based on how much that will fit.

* Being explicit with access modifiers to follow guidelines.


Commit migrated from dotnet/corefx@b7b3439
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants