-
Notifications
You must be signed in to change notification settings - Fork 5.1k
Improved uninitialized object creation performance #98016
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Tagging subscribers to this area: @dotnet/area-system-reflection Issue Details
|
f5c7618
to
071f79d
Compare
Improving performance of this API has dubious value. This API exists for backward compatibility only. We are not interested in promote its use. When we implemented the performance improvements for Activator.CreateInstance some time ago, we intentionally left this API out. |
src/coreclr/System.Private.CoreLib/src/System/RuntimeType.ActivatorCache.cs
Outdated
Show resolved
Hide resolved
Would help with perf on binary serializer we have that is performance critical. We need to support creating types without default ctors, and it works better without calling default ctors at all for a couple different reasons. |
We have multi-year project to eradicate binary serializers from .NET ecosystem due to security issues that they introduce. |
I know, was a bad design that couldn't be fixed without breaking it. Not an issue with ours. |
Well, all binary serializers that create uninitialized objects are questionable design. |
Respectfully, that's an overly broad generalization that doesn't always apply. The only time it creates uninitialized objects is when the object itself is aware that it will be created as uninitialized during deserialization and it provides the logic to use for initializing itself in that case. |
Perhaps I should clarify this statement:
It works better without calling default ctors at all on objects that provide deserialization behavior, which when using our lib, is most DTOs. |
@jkotas I guess I'll pause putting any more effort into this for now. Please let me know if there is any point in me continuing down this path or not. I don't want to spend more time on a PR that isn't going to get merged. |
If the type is explicitly designed for binary serializer, part of the design can be a constructor that prepares the instance for binary serialization. Empty or near empty constructor costs nothing given the total costs involved in reflection-based serialization. If the serializer serializes external type that is not explicitly designed for it, skipping the constructor is dangerous. I have spent a lot of time over the years debugging obscure crashes caused by binary serializers creating external uninitialized objects. One example from many: #67491 (comment) |
Calling a parameterized constructor on a generic I'm aware of the alternatives in this case, but they all pose usability issues, especially with NRTs. It's a bit of a unique design that has lots of advantages and great usability. I'm not keen on sacrificing that. |
I know, that's why we don't do that :) |
Ah, right, I also remembered another reason why the constructor approach won't actually work here at all - it would need to take a |
Note, I could get around the var pfnCtor = typeof(Test).GetConstructors()[0].MethodHandle.GetFunctionPointer();
var ctor = (SpanCtor<Test>)Activator.CreateInstance(typeof(SpanCtor<Test>), null, pfnCtor)!;
var t = (Test)RuntimeHelpers.GetUninitializedObject(typeof(Test));
ctor(t, "Test");
t.Dump();
delegate void SpanCtor<TObject>(TObject uninitialized, ReadOnlySpan<char> p);
class Test
{
public string Value;
public Test(ReadOnlySpan<char> p)
{
Value = p.ToString();
}
} |
I did a search across github for usages of
You might not want to "encourage" people to use it because it is "unsafe", but it's a tool that is useful in some situations when used carefully, just like all the other unsafe low-level things the runtime provides. The work to make it fast is basically done here, aside from some minor exception type fixes. The benefit is 15x perf improvement, the cost is 1 extra field in the activator cache. |
If you had the |
I'd be happy with that and |
You can fix this problem by requiring the types that want to participate in your serialization scheme to provide a factory method with a specific signature. The factory method invocation can be invoked via cached delegate. It will be much faster than |
I guess I can live with it if it is a nice clean change. |
I do not think that it just minor exception type fixes. There are other subtle differences between |
Actually no, not in this case (though the use-case being discussed here isn't my only usage of
Just FYI, it's not reflection-based, quite the opposite. If you want an example: Examplepublic abstract class Page : IBinaryProcessable
{
// Header:
//
// Size Type Field Description
// ---- ---- ----- -----------
// 8 ulong Transaction ID The ID of the transaction that last modified this page
// 4 uint Page number Physical page number in the file
// 1 byte Struct type The type of structure this page is a part of
// 1 byte Data type The kind of data the structure this page belongs to holds
// 1 byte Page level The page level in the tree (zero for leaf level pages)
// 2 ushort Owner ID The ID of the schema object that owns this page
// 2 ushort Free space The amount of free space left on the page
// 4 uint Previous page Previous page number at the same level
// 4 uint Next page Next page number at the same level
// 2 ushort Active start offset Active page segment index offset from start
// 2 ushort Active end offset Active page segment index offset from end
// 33 n/a Reserved Reserved for future use
// ----
// 64
/// <summary>
/// Size of the page header.
/// </summary>
public const int HeaderSize = 64;
/// <summary>
/// The number of reserved bytes at the end of the header
/// </summary>
private const int HeaderReservedBytes = 33;
// Footer:
//
// Size TypeOf Field Description
// ---- ---- ----- -----------
// 4 int Checksum CRC32 Checksum to detect corruption
// 8 ulong Transaction ID Repeated transaction ID to detect tearing on writes
// ----
// 12
/// <summary>
/// Size of the page footer.
/// </summary>
public const int FooterSize = 12;
// Footer offsets are measured from the end of the page
private const int FooterChecksumOffset = 12;
private const int FooterTransactionIdOffset = 8;
// Persisted fields
private long _transactionId;
private int _pageNumber;
private PageStructType _structType;
private PageDataType _dataType;
private byte _level;
private short _ownerId;
private short _freeEntrySpace;
private int _previousPageNumber;
private int _nextPageNumber;
private short _activeStartOffset = short.MaxValue;
private short _activeEndOffset = short.MaxValue;
/// <summary>
/// Initializes a new page instance.
/// </summary>
protected Page(long transactionId, int pageNumber, PageStructType structType, PageDataType dataType, short ownerId, byte level)
{
_transactionId = transactionId;
_pageNumber = pageNumber;
_structType = structType;
_dataType = dataType;
_ownerId = ownerId;
_level = level;
}
void IBinaryProcessable.Process<TByteOrder>(ref BinaryProcessor<TByteOrder> processor)
{
ProcessHeader(ref processor);
ProcessEntries(ref processor);
ProcessChecksum(ref processor);
}
protected abstract void ProcessEntries<TByteOrder>(ref BinaryProcessor<TByteOrder> processor) where TByteOrder : IByteOrder;
/// <summary>
/// Processes the header and the verification transaction ID used to detect page tears.
/// The checksum is NOT processed - ProcessChecksum() must be called to process the
/// checksum.
/// </summary>
internal void ProcessHeader<TByteOrder>(ref BinaryProcessor<TByteOrder> processor) where TByteOrder : IByteOrder
{
processor.Position = 0;
// Process header fields
processor.Process(ref _transactionId);
processor.Process(ref _pageNumber);
processor.Process(ref _structType);
processor.Process(ref _dataType);
processor.Process(ref _level);
processor.Process(ref _ownerId);
processor.Process(ref _freeEntrySpace);
processor.Process(ref _previousPageNumber);
processor.Process(ref _nextPageNumber);
processor.Process(ref _activeStartOffset);
processor.Process(ref _activeEndOffset);
Debug.Assert(processor.Position == HeaderSize - HeaderReservedBytes, "Page header was not processed to the correct position.");
// Verify footer transaction ID
processor.Position = processor.Buffer.Length - FooterTransactionIdOffset;
if (!processor.Verify(_transactionId))
{
static void Throw() => throw new PageVerifyException(true, "Page is torn.");
Throw();
}
Debug.Assert(processor.Position == processor.Buffer.Length, "Page footer was not processed to the correct length.");
}
/// <summary>
/// Reads and verifies the checksum (for reading processers) or generates and writes the checksum (for writing processors) directly
/// from the processor buffer. Checksum verification can be done immediately after initialization. Checksum generation must be done
/// as the last processing step after the header and entries have been processed so that all the data has been written to the buffer.
/// </summary>
internal void ProcessChecksum<TByteOrder>(ref BinaryProcessor<TByteOrder> processor) where TByteOrder : IByteOrder
{
int checksum = Crc32Checksum.Calculate(processor.Buffer.Slice(0, processor.Buffer.Length - FooterSize - _freeEntrySpace));
processor.Position = processor.Buffer.Length - FooterChecksumOffset;
if (!processor.Verify(checksum))
{
static void Throw() => throw new PageVerifyException(false, "Page data is corrupt.");
Throw();
}
Debug.Assert(processor.Position == processor.Buffer.Length - FooterTransactionIdOffset, "Page checksum was not processed to the correct position.");
}
} |
If you use RuntimeHelpers.GetUninitializedObject, it is reflection-based, at least partially. |
Well...sure...but I wouldn't call a seriailizer "reflection-based" just because it uses |
I would be happy to refactor things so that GetUninitializedObject uses a different cached object on |
I don't believe my approach changes how that works, but if I missed something about that, let me know. |
63aa3ed
to
069618c
Compare
069618c
to
9a1dc54
Compare
If you are going to make all classes that participate in serialization implement an interface, you could also make them implement a https://gist.github.com/AustinWise/e6a7ef2b72eded2a947d261ec8ee6422 Obviously this gets more complicated when inheritance is involved and requires changing all existing implementations of the interface. This PR prompted me to reflect on some similar serialization systems I maintain and how newer C# features could be used to replaced |
@AustinWise One of the main motivations for our lib is eliminating the need to maintain separate serialize/deserialize logic - as per the example, there is one |
And yeah, inheritance, which we do use quite extensively in our DB engine, would get quite squirrely with a static factory method, interface or not. We have really nice inheritance usability right now. |
Just realized the boilerplate and all the inheritance issues can be avoided by splitting the instance creation from the deserialization and doing some CRGP (curiously recurring generic pattern). Hmmm... interface IBinaryProcessable<T> : IBinaryProcessable where T : IBinaryProcessable<T>
{
public static abstract T CreateInstance();
}
interface IBinaryProcessable
{
public void Process(ref BinaryProcessor processor);
}
ref struct BinaryProcessor
{
public void Process(ref int value)
{
value = 5;
}
public T Read<T>() where T : IBinaryProcessable<T>
{
T value = T.CreateInstance();
value .Process(ref this);
return value ;
}
public void Write<T>(T value) where T : IBinaryProcessable<T>
{
if (!typeof(T).IsValueType && value != null && value.GetType() != typeof(T))
throw new ArgumentException($"Cannot process subclasses of a processable object directly. Consider configuring a '{typeof(MultiTypeProcessor<,>)}'.");
value.Process(ref this);
}
} Then DTOs with inheritance can be implemented same as I have it now. Each type just needs to provide a factory that creates an empty instance to be used for deserialization. NRT warnings might still need to be suppressed if you want to provide an instance that doesn't have all the fields initialized (i.e. to avoid extraneous allocations that would need to be replaced during deserialization), but that's not terrible, at all, given the benefits. Subtypes can reuse the base type's processing logic without any fuss: public class BaseDto : IBinaryProcessable<BaseDto>
{
public int Value;
static BaseDto IBinaryProcessable<BaseDto>.CreateInstance() => new BaseDto();
void IBinaryProcessable.Process(ref BinaryProcessor processor) => Process(ref processor);
protected virtual void Process(ref BinaryProcessor processor)
{
processor.Process(ref Value);
}
}
public class DerivedDto : BaseDto, IBinaryProcessable<DerivedDto>
{
public int Value2;
static DerivedDto IBinaryProcessable<DerivedDto>.CreateInstance() => new DerivedDto();
protected override void Process(ref BinaryProcessor processor)
{
base.Process(ref processor);
processor.Process(ref Value2);
}
}
var binaryProcessor = new BinaryProcessor();
var baseDto = binaryProcessor.Read<BaseDto>();
var derivedDto = binaryProcessor.Read<DerivedDto>();
binaryProcessor.Write<BaseDto>(derivedDto); // Exception, as expected So I guess that's the new setup 😅 |
You can even do: public interface IBinaryProcessable<T> : IBinaryProcessable where T : IBinaryProcessable<T>
{
// Use this instead of catching MissingMethodException when trying to create an instance to avoid
// catching MissingMethodExceptions that the constructor might throw.
private static readonly bool _noDefaultCtor = !typeof(T).IsValueType && typeof(T).GetConstructor(Type.EmptyTypes) == null;
public static virtual T CreateInstance()
{
if (typeof(T).IsValueType)
return default!
if (_noDefaultCtor)
throw new BinaryProcessException($"Type '{typeof(T)}' does not have a public default constructor. Override '{typeof(IBinaryProcessable<T>)}.{nameof(CreateInstance)}()' to provide a factory.");
return Activator.CreateInstance<T>();
}
} To avoid even more boilerplate if you just want to use a default ctor. Love it. Thanks for the ideas that led to this revelation 🥳 |
...coreclr/System.Private.CoreLib/src/System/Runtime/CompilerServices/RuntimeHelpers.CoreCLR.cs
Outdated
Show resolved
Hide resolved
src/coreclr/System.Private.CoreLib/src/System/RuntimeType.CoreCLR.cs
Outdated
Show resolved
Hide resolved
src/coreclr/System.Private.CoreLib/src/System/RuntimeType.ActivatorCache.cs
Show resolved
Hide resolved
Co-authored-by: Jan Kotas <jkotas@microsoft.com>
d054d9c
to
ba9e7e4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks
@jkotas Somewhat related: |
Your library does not implement the exact same behavior as I can believe that the perf of reflection APIs can be improved, but it is likely a lot less than what you think because of compatibility. We know that you can build much faster reflection APIs if you can change the behavior. In .NET 8, we have done that by introducing |
I was just talking about generic |
Yeah, adding a try/catch/throw completely eliminates any benefit, haha. |
Didn't know about that before you mentioned it. Wow does that ever make a big difference on AOT. Before:
After:
Thanks ❤️ v2.1 release incoming just 24h after v2.0 😅 |
Added
CreateUninitializedCache
to speed up uninitialized object creation.