Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Event Hubs] Optimize partition resolver compute hash #27598

Merged
merged 16 commits into from
Mar 29, 2022

Conversation

danielmarbach
Copy link
Contributor

@danielmarbach danielmarbach commented Mar 16, 2022

Relates to #27030

Benchmark uses the same string of various sizes (see size column in the screenshot) as a partition key.

With SkipLocalsInit

image

Without SkipLocalsInit

image

Contributing to the Azure SDK

Please see our CONTRIBUTING.md if you are not familiar with contributing to this repository or have questions.

For specific information about pull request etiquette and best practices, see this section.

@ghost ghost added Event Hubs customer-reported Issues that are reported by GitHub users external to the Azure organization. labels Mar 16, 2022
@ghost
Copy link

ghost commented Mar 16, 2022

Thank you for your contribution @danielmarbach! We will review the pull request and get back to you soon.

@ghost ghost added the Community Contribution Community members are working on the issue label Mar 16, 2022
@danielmarbach
Copy link
Contributor Author

@jsquire I think these are great results. I'm sure we could squeeze out more by actually also doing something with the hash algorithm but I felt this might be too complex for now for me to dive into and like you mentioned we would need to be super extra careful not not break anything. With these changes we can get nice gains without really touching the hashing function that much.

@danielmarbach
Copy link
Contributor Author

SkipLocalsInit is removed. Last oustanding question is whether to ifdef or not. Otherwise I guess this is more or less ready to rumble.

@danielmarbach danielmarbach marked this pull request as ready for review March 17, 2022 21:37
Copy link
Member

@jsquire jsquire left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we think we're just about ready, I gave it the nitpick formatting pass for consistency with other Event Hubs assets.

By chance, did you run any tests comparing the output of the old and new implementation across a large set of random strings? If not, I'll do that before we merge for confidence that we didn't accidentally change behavior.

@danielmarbach
Copy link
Contributor Author

@jsquire

By chance, did you run any tests comparing the output of the old and new implementation across a large set of random strings? If not, I'll do that before we merge for confidence that we didn't accidentally change behavior.

I did not yet. Normally, what I would do is write an explicit test that copies the old implementation local into the test and then exercise the new implementation together with the old and verify that for the same input the same output occurs on both sides. If you are OK with that approach, I'm happy to push that.

@jsquire
Copy link
Member

jsquire commented Mar 21, 2022

I did not yet. Normally, what I would do is write an explicit test that copies the old implementation local into the test and then exercise the new implementation together with the old and verify that for the same input the same output occurs on both sides. If you are OK with that approach, I'm happy to push that.

I wasn't even thinking that level of rigor. I was going to copy the old + new into LinqPad and just run in a giant loop with a few tens of thousands of random inputs.

@danielmarbach
Copy link
Contributor Author

I'm happy to take your offer then. But I'm also OK to do it if needed since I don't want to leave the impression I'm not willing to take things that I started to completion

@jsquire
Copy link
Member

jsquire commented Mar 22, 2022

I'm happy to take your offer then. But I'm also OK to do it if needed since I don't want to leave the impression I'm not willing to take things that I started to completion

No impression left; you're doing us enough of a kindness. I'll need a day or so, but I'll run the side-by-side test.

@jsquire
Copy link
Member

jsquire commented Mar 25, 2022

We ended up with inconsistent results compared to the original implementation. I traced this back to BitConverter reading with the correct endian-ness for the host where the new approach is using BinaryPrimitives and assuming big endian reads.

I was able to work around this by aliasing the BinaryPrimitives reads:

private delegate uint ReadUInt32(ReadOnlySpan<byte> bytes);

private static ReadUInt32 ReadUInt32CorrectEndian = BitConverter.IsLittleEndian
    ? BinaryPrimitives.ReadUInt32LittleEndian
    : BinaryPrimitives.ReadUInt32BigEndian;
// ... [SNIP]

while (size > 12)
{
    a += ReadUInt32CorrectEndian(data.Slice(index));
    b += ReadUInt32CorrectEndian(data.Slice(index + 4));
    c += ReadUInt32CorrectEndian(data.Slice(index + 8));

// ... [SNIP]

I did not (yet) have a chance to run another benchmark pass.

@danielmarbach
Copy link
Contributor Author

danielmarbach commented Mar 25, 2022

Could we use a method that can be inlined that uses something like

BitConverter.IsLittleEndian ? BinaryPrimitives.ReverseEndianness(tempVal) : tempVal

or do we need to be CLS compliant?

I should have caught the endian difference while rewriting the code. Nice catch!

@jsquire
Copy link
Member

jsquire commented Mar 25, 2022

I'm open to it, sure. I don't think that CLS compliant comes into play unless we're doing something with the public API, no?

@danielmarbach
Copy link
Contributor Author

I can rerun the perf test later tonight with a proposal

@jsquire
Copy link
Member

jsquire commented Mar 25, 2022

I generated a test bed when I was doing the compat runs, if that helps. It's in a simple JSON array format.
partition-key-hashes.txt

@danielmarbach
Copy link
Contributor Author

Hmmm... It does do an endian check.

https://github.com/dotnet/runtime/blob/57bfe474518ab5b7cfe6bf7424a79ce3af9d6657/src/libraries/System.Private.CoreLib/src/System/Buffers/Binary/ReaderBigEndian.cs#L127

I need to have a closer look what's going on there

@jsquire
Copy link
Member

jsquire commented Mar 25, 2022

@jsquire I'm starting to wonder if this is not a bug in the original implementation. BitConverter.ToUInt32 does an unaligned read. This means the compute hash will depending on the endianness for the same input return different output. Was that intended? Wouldn't you want for the same partition key input always the same output?

Hrmm.... I hadn't stopped to question the reference version, but yeah, I can see where you're coming from. Yes, I agree. We should consistently read and interpret the bytes. Nice call!

@jsquire
Copy link
Member

jsquire commented Mar 28, 2022

The more that I thought about this over the weekend and played around, the more I'm convinced that your observation is spot on - we should assume a consistent endian-ness for the conversion. I think this means that your original version was right. I would ask that we assume little endian rather than big, though, since that will be natural to the majority of hosts.

Thoughts?

@danielmarbach
Copy link
Contributor Author

I would ask that we assume little endian rather than big, though, since that will be natural to the majority of hosts.

Done. Do we need to worry about backward compatibility?

@jsquire
Copy link
Member

jsquire commented Mar 28, 2022

Done. Do we need to worry about backward compatibility?

Thankfully, no. We're running with a slightly different version than the service currently uses, by request, and we've set expectations that the buffered producer will assign partitions differently than the producer. Once we GA, we'll need to avoid client-side differences, but we're still in beta.

Copy link
Member

@jsquire jsquire left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, @danielmarbach, as always, for your taking this on and your flexibility talking through the ambiguity. Pedantic formatting guy made one last pass, but I think the important parts are good to go.

Co-authored-by: Jesse Squire <jesse.squire@gmail.com>
@danielmarbach
Copy link
Contributor Author

@jsquire let me know what you think about the recent discussion point I brought up regarding following more the best practices of always stack allocating the max size and how in that scenario having SkipLocalsInit would actually improve things quite a bit.

On another note: Would you be able to run your fancy script against the latest changes?

This reverts commit 14912a3.

# Conflicts:
#	sdk/eventhub/Azure.Messaging.EventHubs/src/Core/PartitionResolver.cs
@jsquire
Copy link
Member

jsquire commented Mar 29, 2022

let me know what you think about the recent discussion point I brought up regarding following more the best practices of always stack allocating the max size and how in that scenario having SkipLocalsInit would actually improve things quite a bit.

I'm not opposed to bringing it back; I just question how much of an impact it will have with our netstandard2.0 target. If it's being evaluated by the jitter and would impact newer runtimes for applications using the SDK, I think it's a definite win.

@jsquire
Copy link
Member

jsquire commented Mar 29, 2022

On another note: Would you be able to run your fancy script against the latest changes?

Will do. I'll do another pass today.

@danielmarbach
Copy link
Contributor Author

let me know what you think about the recent discussion point I brought up regarding following more the best practices of always stack allocating the max size and how in that scenario having SkipLocalsInit would actually improve things quite a bit.

I'm not opposed to bringing it back; I just question how much of an impact it will have with our netstandard2.0 target. If it's being evaluated by the jitter and would impact newer runtimes for applications using the SDK, I think it's a definite win.

It instructs the compiler to not add the locals init. So, the important part is to compile the code that uses it with a roslyn compiler version that supports that. I guess that is given here right?

@jsquire
Copy link
Member

jsquire commented Mar 29, 2022

It instructs the compiler to not add the locals init. So, the important part is to compile the code that uses it with a roslyn compiler version that supports that. I guess that is given here right?

I'll be interesting to see if the net6 SDK compiler does the right thing for a netstandard2.0 target. We'll always be current with the SDK/compiler versions, it's the targets that will lag behind for compatibility.

@jsquire
Copy link
Member

jsquire commented Mar 29, 2022

On another note: Would you be able to run your fancy script against the latest changes?

Will do. I'll do another pass today.

Test pass is solid. 260,000+ random key variations with different patterns and results are consistent.

@jsquire
Copy link
Member

jsquire commented Mar 29, 2022

/azp run net - eventhub - tests

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@jsquire
Copy link
Member

jsquire commented Mar 29, 2022

@danielmarbach: Looks like we're all wrapped up and ready. Please let me know if you'd like me to merge or if you're still making tweaks.

@danielmarbach
Copy link
Contributor Author

Go ahead

@jsquire jsquire merged commit c1d1f81 into Azure:main Mar 29, 2022
@jsquire
Copy link
Member

jsquire commented Mar 29, 2022

Done! Thank you, again, @danielmarbach. Your contribution and efforts to keep improving our performance and efficiency are very much appreciated.

@danielmarbach danielmarbach deleted the partitionkey-resolver branch March 30, 2022 04:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Community Contribution Community members are working on the issue customer-reported Issues that are reported by GitHub users external to the Azure organization. Event Hubs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants