-
-
Notifications
You must be signed in to change notification settings - Fork 889
Text segment ReadOnlySpan<byte> initialization #1133
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
antonfirsov
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's prove that the new unsafe code does not open a security hole! Otherwise looks good.
| DebugGuard.MustBeLessThan(toReverse & 0xF, Bit4Reverse.Length, nameof(toReverse)); | ||
| DebugGuard.MustBeLessThan((toReverse >> 4) & 0xF, Bit4Reverse.Length, nameof(toReverse)); | ||
| DebugGuard.MustBeLessThan((toReverse >> 8) & 0xF, Bit4Reverse.Length, nameof(toReverse)); | ||
| DebugGuard.MustBeLessThan(toReverse >> 12, Bit4Reverse.Length, nameof(toReverse)); | ||
|
|
||
| ref byte bit4ReverseRef = ref MemoryMarshal.GetReference(Bit4Reverse); | ||
|
|
||
| return (short)(Unsafe.Add(ref bit4ReverseRef, toReverse & 0xF) << 12 | ||
| | Unsafe.Add(ref bit4ReverseRef, (toReverse >> 4) & 0xF) << 8 | ||
| | Unsafe.Add(ref bit4ReverseRef, (toReverse >> 8) & 0xF) << 4 | ||
| | Unsafe.Add(ref bit4ReverseRef, toReverse >> 12)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DebugGuard won't save us in production. Where does toReverse come from? Can a malicious input result in buffer overflow?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I mostly added those to validate the code path in our CI tests. I had not considered possible attempts to exploit this from a security standpoint, you're right 😅
What if we figured out the upper/lower bounds of those 4 indices, and validated just those? It would still result in just 2 conditional jumps, instead of 4 (one for each access).
Actually, now that I think about it:
- The first three indices are always computed with a final
& 0xF, which means the maximum possible value is0xF, so15, which is always in range. So those first 3 accesses should always be necessarily valid, without the need to test them - That last
>> 12could result in an out of bound index, either if the value is negative and below-1 << 12, or if it's a positive value greater than1 << 12. I guess we could just check this last one just to be extra sure, as I don't know where exactly the input is coming from here, as you mentioned.
That's still just a single branch compared to 4, so... Yay? 😁
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, let's do it like that!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome, done in 0577690. Also added a detailed comment for future reference 😊
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My comment on BitLengthOrder also applies here. Additionally, if we already on the microoptimization road, we should cache the value of toReverse >> 12 to a local, since we are using it twice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in 4b0dfd1.
Also I should really stress out how if we care about micro-optimizations, we should really consider refactoring our Guard APIs with that trick I suggested. Look at all that asm clutter due to that string.Format call we have here:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well this is just terrible, wish we have noticed that earlier. Guard.MustBeLessThanOrEqualTo will make this method actually slower than it was before your changes. Can you just use a manual check + ThrowHelper here?
If you have some time, feel free to file a PR in SixLabors/SharedInfrastructure. For the comparison methods I'd rather see a set of duplicate non-generic methods, than TValue : IComparable<TValue>, which also brings some extra overhead I believe.
src/ImageSharp/Processing/Processors/Quantization/OctreeFrameQuantizer{TPixel}.cs
Show resolved
Hide resolved
|
|
||
| // The lengths of the bit length codes are sent in order of decreasing | ||
| // probability, to avoid transmitting the lengths for unused bit length codes. | ||
| private static readonly int[] BitLengthOrder = { 16, 17, 18, 0, 8, 7, 9, 6, 10, 5, 11, 4, 12, 3, 13, 2, 14, 1, 15 }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might well be better off using the ReadOnlySpan<byte> here also and casting.
I know we do that in the jpeg encoder for uint and BitCountLut with, as I recall, a net gain in performance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in 5b7ac75.
Should I add some bounds check here, or are all those accesses guaranteed to be safe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I couldn't tell you for certain I'm afraid. That port was an act of desperation over understanding.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. Should I just switch those accesses to a direct Span<T>[int] access then? 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we lack understanding, we shall go for safety.
Btw if we use the data in Bit4Reverse for indexing, elements shouldn't be byte-s. It's more work for the CPU:
https://sharplab.io/#v2:C4LghgzgtgPgAgJgIwFgBQcAMACbckB0ASgK4B2wAllAKYEDCA9lAA6UA2NATgMrcBulAMY0IAbnTo4AZjwJs9bAG90uXKrUsulfmGA08SAGzYARgE992AJJkAJjQAet4NIQAKLjQBmZywZYAGmxKChD7JwBKDVwVNDUEvAB2bABVMggwbzoAQTs7Tx9sIPCHR0iJeLUAXxjsOq0dPQN8EwsrWzKAIX9C33aA4IHSqLq4xLU4FPTM7II8gq9fEtCyirratGqgA==
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get what you're saying now, yeah 👍
I'm not sure I understand where you're seeing that with Bit4Reverse though.
Here's where that is used:
ImageSharp/src/ImageSharp/Formats/Png/Zlib/DeflaterHuffman.cs
Lines 384 to 392 in 4b0dfd1
| int toReverseRightShiftBy12 = toReverse >> 12; | |
| Guard.MustBeLessThanOrEqualTo<uint>((uint)toReverseRightShiftBy12, 15, nameof(toReverse)); | |
| ref byte bit4ReverseRef = ref MemoryMarshal.GetReference(Bit4Reverse); | |
| return (short)(Unsafe.Add(ref bit4ReverseRef, toReverse & 0xF) << 12 | |
| | Unsafe.Add(ref bit4ReverseRef, (toReverse >> 4) & 0xF) << 8 | |
| | Unsafe.Add(ref bit4ReverseRef, (toReverse >> 8) & 0xF) << 4 | |
| | Unsafe.Add(ref bit4ReverseRef, toReverseRightShiftBy12)); |
All the indexing values there are int already 🤔
I mean, those are actually the same indices that were being used before as well. I'm not sure I'm following here, where is the byte offsetting peformed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh sorry, I was wrong the values for Bit4Reverse are not used for indexing into data. This is the data that gets indexed.
We should deal with Guard however to make sure we don't regress. Do you plan to fix it globally in SixLabors/SharedInfrastructure, or will you replace it in this PR with a manual check + ThrowHelper calls?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahahah no problem, glad we could figure this out, I was wondering what was I missing there 😄
As for Guard, I'd say that's an unrelated improvement, since it's not something that will impact the functionality in any way, it's just a matter of further optimizations. I guess the best thing would be to just make another PR and fix the entire Guard class, with all the included APIs. Sounds good?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you can do it, that would be the best!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, always happy to bring more of my discoveries and optimizations to ImageSharp! 😄
JimBobSquarePants
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got failing tests here.
https://github.com/SixLabors/ImageSharp/pull/1133/checks?check_run_id=473799814#step:11:195
|
@JimBobSquarePants Whoops, fixed in 68387a7, my bad 😅 |
Codecov Report
@@ Coverage Diff @@
## master #1133 +/- ##
==========================================
- Coverage 82.24% 82.23% -0.01%
==========================================
Files 678 678
Lines 29164 29192 +28
Branches 3284 3284
==========================================
+ Hits 23986 24007 +21
- Misses 4481 4488 +7
Partials 697 697
Continue to review full report at Codecov.
|
|
@Sergio0694 @antonfirsov Are we waiting for this to be completed and merged before finishing here? |
|
@JimBobSquarePants Not sure whether we necessarily have to wait for that. I mean, having that would reduce the overhead in those checks, but it wouldn't actually change anything semantically. We might want to wait for it though in case the changes in this PR alone have too much of a performance regression, but that shouldn't be the case (plus it'd be temporary anyways). Up to you guys 👍 |
|
@JimBobSquarePants I expect the aggregate result of the changes in On the other hand, I'm fine fixing the regression in a follow-up PR. |
|
@Sergio0694 I found the AccessViolation. https://github.com/SixLabors/ImageSharp/pull/1133/checks?check_run_id=484873388#step:11:698 It's not us it's the ImageMagick reference codec. @dlemstra You'll want to see this. |
|
@JimBobSquarePants That's awesome, great work! Props to your investigative skills 😄 |
|
Ok.... The submodule is now updated and I've added updated relevant tests.
Will merge once this finishes building. Spoke too soon... The templates don't automatically run on Unix of course because there's no Visual Studio. Will have another think. Update: |
| }); | ||
|
|
||
| Assert.StartsWith("Value cannot be null.", ex.Message); | ||
| Assert.StartsWith("Parameter \"source\" must be not null.", ex.Message); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we also do Assert.Throws<ArgumentNullException>("source", () => here instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought that was dealt with above?
Prerequisites
Description
This PR includes a couple of minor optimizations:
static readonly byte[]arrays with staticReadOnlySpan<byte>values that directly map over constant data in the.textsegment (see here).Encoding.ASCII.GetBytes, no longer necessaryStream.Writecalls to theReadOnlySpan<byte>overloads.No rush at all for this, just saw those
byte[]arrays and thought I'd have some fun optimizing them away. Pinging @antonfirsov for the reviews as he's the official performance guru. 😄