-
Notifications
You must be signed in to change notification settings - Fork 879
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
EVM Speed Improvements for MSTORE #2917
Conversation
More EVM Speed Improvements aimed at improving MLOAD times. * Move Memory from raw bytes to byte array * Add calls for 32 byte MSB aligned copies * Remove `incrementProgramCounter` and move data into OperationResult Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
48% improvement over all test cases, 266% improvement in mstore. Slowest operation is now SHA3/KECCAK |
Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
Casting an int out of Bytes can be either ArithmeticException or IllegalArgumentException, so treat all Runtime exceptions as invalid destinations. Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
Trim leading zeros on jump. Since it is unsigned we don't need to worry about sign-extended negative numbers. Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
@@ -184,7 +164,7 @@ public boolean isValidJumpDestination(final UInt256 destination, final Code code | |||
} | |||
} | |||
} | |||
long targetLong = validJumpDestinations[jumpDestination >> 6]; | |||
long targetLong = validJumpDestinations[jumpDestination >>> 6]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I stared at this for a bit before I saw the jumpDestination < 0 check on 154.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This LGTM, the only question I have is the potential performance tradeoff between byte[] for faster MSTORE and MutableBytes for read performance since @matkt made that optimization.
@@ -36,17 +37,17 @@ | |||
* overflow this. A byte array implementation limits us to 2 GiB. But that would cost over 51 | |||
* trillion gas. So this is likely a reasonable limitation, at least at first. | |||
*/ | |||
private MutableBytes data; | |||
private byte[] memBytes; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a mix of the two. This goes back to byte[]
but does not use the array copying code that was in place before. It directly slices the array into Byte
. The code segments that needed to copy bytes are still doing a copy on the resulting byte object. Those were introduced by MutableBytes and not removed, and are actually orthogonal to the backing data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The bigger issue is that doing right-aligned copying in MutableBytes efficiently requires APIs that don't exist on it.
final BigInteger value0 = new BigInteger(1, frame.popStackItem().toArrayUnsafe()); | ||
final BigInteger value1 = new BigInteger(1, frame.popStackItem().toArrayUnsafe()); | ||
|
||
final BigInteger result = value0.add(value1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we seeing better performance using BigInteger rather than UInt256?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. The use of UInt256 incurred a few more byte[]->int[]->byte[] loops. It is a more pronounced improvement in div, mul, and mod as there are less cycles spent moving bytes into new objects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The javadoc on Memory.set32BytesAligned could stand some clarification. It is confusing as written. I've included suggestions inline.
@@ -202,7 +202,7 @@ public Bytes getBytes(final long location, final long numBytes) { | |||
final int start = asByteIndex(location); | |||
|
|||
ensureCapacityForBytes(start, length); | |||
return data.slice(start, length); | |||
return Bytes.wrap(Arrays.copyOfRange(memBytes, start, start + length)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch. Tuweni .slice sounds mutable based on their docs, but i may be misreading, sounds ambiguous.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can be mutable if the underlying bytes are mutable. We got burned on that with #2908.
} | ||
|
||
/** | ||
* Copy the bytes from the provided number of bytes from the provided value to memory from the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be worded better. Suggest "Copy numBytes bytes from taintedValue to memory at location" or some variant with fewer froms.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just re-wrote it to my liking. Initially it was copy-paste from the left aligned variant.
* <p>Note that this method will extend memory to accommodate the location assigned and bytes | ||
* copied and so never fails. | ||
* | ||
* @param location the location in memory at which to start copying the bytes of {@code value}. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
adding "to" at the end of this description helps
* numBytes} (in other words, {@link #getActiveWords()} will return a value consistent with | ||
* having set {@code numBytes} bytes, even if less than that have been concretely set due to | ||
* {@code value} being smaller). | ||
* @param taintedValue the bytes to copy to memory from {@code location}. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggest s/from/at
if (length > srcSize) { | ||
final MutableBytes paddedAnswer = MutableBytes.create((int) length); | ||
if ((long) 0 < srcSize) { | ||
value.slice(0, (int) ((long) srcSize)).copyTo(paddedAnswer, (int) (length - srcSize)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is casting srcSize to long and then back to int doing when defining the length of the slice being copied?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure how neither IntelliJ nor Errorprone tagged it. Initially it was because memory was going to be long indexed, until I was satisfied that 33rd bit memory access was unlikely to ever be viable even in perverse circumstances.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed. Sonar flagged it.
@@ -28,6 +30,8 @@ | |||
|
|||
private final int length; | |||
|
|||
private final OperationResult pushResponse; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious, why pull this down from successResponse
in parent? readability?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Each push response has a different success, counting the PC to increment (2 for PUSH1, 33 for PUSH32). The parent successResponse increments the PC by 1.
sick. how do we get that spreadsheet generated as part of CI? I'd love to be alerted to performance regressions. |
Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
Bytes.wrap() presumes an immutable array, and only does a shallow copy on a perfect "slice." MutableBytes assumes the bytes could change and always does a deep copy when a copy is requested. Also, updated API to reflect mutable bytes type usage. Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
Kudos, SonarCloud Quality Gate passed! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚢 👍
More EVM Speed Improvements aimed at improving MSTORE times. * Move Memory from raw bytes to byte array * Add calls for right packed partial value copy * Remove `incrementProgramCounter` and move data into OperationResult Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
More EVM Speed Improvements aimed at improving MSTORE times. * Move Memory from raw bytes to byte array * Add calls for right packed partial value copy * Remove `incrementProgramCounter` and move data into OperationResult Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
PR Description
More EVM Speed Improvements aimed at improving MSTORE times.
incrementProgramCounter
and move data into OperationResultSigned-off-by: Danno Ferrin danno.ferrin@gmail.com
Changelog