-
Notifications
You must be signed in to change notification settings - Fork 20.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
core/vm, crypto/blake2b: add BLAKE2b compression func at 0x09 #19972
core/vm, crypto/blake2b: add BLAKE2b compression func at 0x09 #19972
Conversation
a428296
to
cd4f2e5
Compare
To see how it works, you may want to use this simple truffle project https://github.com/pdyraga/f-precompile-call and follow the steps from |
8af3cd2
to
e5bdb96
Compare
@@ -390,7 +390,7 @@ func New(code string) (*Tracer, error) { | |||
return 1 | |||
}) | |||
tracer.vm.PushGlobalGoFunction("isPrecompiled", func(ctx *duktape.Context) int { | |||
_, ok := vm.PrecompiledContractsByzantium[common.BytesToAddress(popSlice(ctx))] | |||
_, ok := vm.PrecompiledContractsIstanbul[common.BytesToAddress(popSlice(ctx))] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@karalabe I decided to switch it since we are adding a new precompile and we are almost-Istanbul but I am not sure about this change. Can you please confirm?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that is an ok change. This adds a helper-method to the js-environment, and it is obviously already not quite correct (if executed pre-byzantium). This change does not really make it a lot more incorrect.
A proper fix would be to look at the block number and chain config, but those are not accessible at this point, so would need a refactor which it out of scope for this PR, imo.
This is ready for another chance. |
I'm trying to pull in the SSE, AVX and AVX2 code from upstream too. They make quite a difference on the execution:
I'll keep hacking on it today, need to understand the assembly code first, then modify it. Will push those on top of this PR when I'm dong and will ask you to take a peek. |
If we can pull them in, that's great but if not or if it's really complex and we can't make it before Friday deadline I think it's not such a big deal. According to the EIP, one round cost 1 gas (benchmarks made on the code we have in this PR) and I think it's so far the cheapest precompile. For ZCash interoperability we'll probably need to execute 10 or 12 rounds. |
@holiman @pdyraga I've pushed the SSE, AVX and AVX2 code on top. I opted to keep the entire functionality of Blake2b (instead of gutting and just shipping Performance wise the Blake2B hashes with the extracted F methods are:
I've added an 8M gas test case to the precompile to have that as a benchmark too, then the runs on my laptop with the different instructions sets are:
|
With the AVX2 instructions, we can do about 47.5Mgas/sec with the standard 12 rounds, 8Mgas/sec with 1 round. |
The precompile at 0x09 wraps the BLAKE2b F compression function: https://tools.ietf.org/html/rfc7693#section-3.2 The precompile requires 6 inputs tightly encoded, taking exactly 213 bytes, as explained below. - `rounds` - the number of rounds - 32-bit unsigned big-endian word - `h` - the state vector - 8 unsigned 64-bit little-endian words - `m` - the message block vector - 16 unsigned 64-bit little-endian words - `t_0, t_1` - offset counters - 2 unsigned 64-bit little-endian words - `f` - the final block indicator flag - 8-bit word [4 bytes for rounds][64 bytes for h][128 bytes for m][8 bytes for t_0] [8 bytes for t_1][1 byte for f] The boolean `f` parameter is considered as `true` if set to `1`. The boolean `f` parameter is considered as `false` if set to `0`. All other values yield an invalid encoding of `f` error. The precompile should compute the F function as specified in the RFC (https://tools.ietf.org/html/rfc7693#section-3.2) and return the updated state vector `h` with unchanged encoding (little-endian). See EIP-152 for details.
7c174b3
to
1bccafe
Compare
what will be the gas cost? |
It's specified in the EIP. |
This PR looks good to me. My and @karalabe also did some fuzzing which did not find any discrepancies between the various flavours of assembly |
1/round AFAIK (+the cost of the contract call itself, which will outweigh
it anyway). A 12 round call (standard blake) takes about 250ns on our
implementation, so filling an 8M gas block would take 166ms.
…On Wed, Aug 21, 2019, 15:09 Tomasz Kajetan Stańczak < ***@***.***> wrote:
what will be the gas cost?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#19972?email_source=notifications&email_token=AAA7UGPUL6D55ICZN6KWK63QFUWBBA5CNFSM4IMHYTY2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD4ZODGI#issuecomment-523428249>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAA7UGNMXYYKPSJGC4X6Z4DQFUWBBANCNFSM4IMHYTYQ>
.
|
My marks:
So |
@karalabe are you sure you got the |
I just used whatever is in upstream. Btw, gccgo is an alternative Go
compiler based on the gcc toolkit. Originally it was an experiment for
faster Go binaries, but it was a one guy project and afaik cannot keep up.
My guess is that gccgo doesn't implement the assembly, since Go's ASM is
custom that needs compilation, not just embedding.
…On Wed, Aug 21, 2019, 15:36 Martin Holst Swende ***@***.***> wrote:
@karalabe <https://github.com/karalabe> are you sure you got the gccgo
right in the buld instructions? Seems like if you have gccgo enabled, it
will use the generic version, and vice versa
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#19972?email_source=notifications&email_token=AAA7UGNVF7RDTNBC2XY4E5TQFUZDJA5CNFSM4IMHYTY2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD4ZQHOA#issuecomment-523436984>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAA7UGIXKP7ZXDZABOB53BLQFUZDJANCNFSM4IMHYTYQ>
.
|
Re instruction set, that's decided runtime by the Blake code itself (the F
function looks at the CPU capabilities and switches dynamically). It uses
the highest SIMD version you have available. From Go's perspective, all 4
variants are plain amd64, the binary is not hard coded on one or another.
…On Wed, Aug 21, 2019, 15:29 Martin Holst Swende ***@***.***> wrote:
My marks:
goos: linux
goarch: amd64
pkg: github.com/ethereum/go-ethereum/core/vm
BenchmarkPrecompiledBlake2F/vector_4-Gas=0-6 <http://github.com/ethereum/go-ethereum/core/vmBenchmarkPrecompiledBlake2F/vector_4-Gas=0-6> 20000000 124 ns/op
BenchmarkPrecompiledBlake2F/vector_5-Gas=12-6 5000000 244 ns/op
BenchmarkPrecompiledBlake2F/vector_6-Gas=12-6 5000000 253 ns/op
BenchmarkPrecompiledBlake2F/vector_7-Gas=1-6 10000000 124 ns/op
BenchmarkPrecompiledBlake2F/vector_8-Gas=8000000-6 20 86059659 ns/op
So 86ms for the 8Mgas vector. I don't know what instruction set was
used... Looking into the build flags now, I'm not fully sure it's correct,
my IDE tells me I'm using the generic variant, but I have an amd64
processor...
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#19972?email_source=notifications&email_token=AAA7UGPZPIQ6BN6E52TFCYDQFUYKDA5CNFSM4IMHYTY2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD4ZPU2Q#issuecomment-523434602>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAA7UGMPEDYL5I7JVHM5WPDQFUYKDANCNFSM4IMHYTYQ>
.
|
Ok, my IDE seems confused. With some self-induced |
🎉 Having the precompile already implemented in |
The precompile at
0x09
wraps the BLAKE2b F compression function:https://tools.ietf.org/html/rfc7693#section-3.2
The precompile requires 6 inputs tightly encoded, taking exactly 213 bytes, as explained below.
rounds
- the number of rounds - 32-bit unsigned big-endian wordh
- the state vector - 8 unsigned 64-bit little-endian wordsm
- the message block vector - 16 unsigned 64-bit little-endian wordst_0, t_1
- offset counters - 2 unsigned 64-bit little-endian wordsf
- the final block indicator flag - 8-bit wordThe boolean
f
parameter is considered astrue
if set to1
.The boolean
f
parameter is considered asfalse
if set to0
.All other values yield an invalid encoding of
f
error.The precompile should compute the F function as specified in the RFC
(https://tools.ietf.org/html/rfc7693#section-3.2) and return the updated
state vector
h
with unchanged encoding (little-endian).See EIP-152 for details.