-
Notifications
You must be signed in to change notification settings - Fork 49
Optimize circuit Compilation #1400
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
✅ Code is now correctly formatted. |
Benchmark results Main vs HEAD.Base
Head
Base
Head
Base
Head
Base
Head
Base
Head
Base
Head
|
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1400 +/- ##
==========================================
+ Coverage 81.38% 81.46% +0.08%
==========================================
Files 105 105
Lines 25805 25885 +80
==========================================
+ Hits 21001 21087 +86
+ Misses 4804 4798 -6 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Benchmarking resultsBenchmark for program
|
Command | Mean [s] | Min [s] | Max [s] | Relative |
---|---|---|---|---|
Cairo-vm (Rust, Cairo 1) |
10.865 ± 0.051 | 10.780 | 10.944 | 4.37 ± 0.04 |
cairo-native (embedded AOT) |
2.486 ± 0.016 | 2.458 | 2.505 | 1.00 |
cairo-native (embedded JIT using LLVM's ORC Engine) |
2.605 ± 0.012 | 2.588 | 2.622 | 1.05 ± 0.01 |
Benchmark for program dict_snapshot
Open benchmarks
Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
---|---|---|---|---|
Cairo-vm (Rust, Cairo 1) |
531.8 ± 8.6 | 525.2 | 553.4 | 1.00 |
cairo-native (embedded AOT) |
2206.4 ± 38.2 | 2154.6 | 2294.3 | 4.15 ± 0.10 |
cairo-native (embedded JIT using LLVM's ORC Engine) |
2314.2 ± 26.2 | 2278.0 | 2355.0 | 4.35 ± 0.09 |
Benchmark for program factorial_2M
Open benchmarks
Command | Mean [s] | Min [s] | Max [s] | Relative |
---|---|---|---|---|
Cairo-vm (Rust, Cairo 1) |
4.773 ± 0.016 | 4.750 | 4.797 | 1.78 ± 0.02 |
cairo-native (embedded AOT) |
2.763 ± 0.074 | 2.712 | 2.959 | 1.03 ± 0.03 |
cairo-native (embedded JIT using LLVM's ORC Engine) |
2.680 ± 0.031 | 2.632 | 2.732 | 1.00 |
Benchmark for program fib_2M
Open benchmarks
Command | Mean [s] | Min [s] | Max [s] | Relative |
---|---|---|---|---|
Cairo-vm (Rust, Cairo 1) |
4.670 ± 0.011 | 4.648 | 4.686 | 2.22 ± 0.03 |
cairo-native (embedded AOT) |
2.103 ± 0.027 | 2.067 | 2.142 | 1.00 |
cairo-native (embedded JIT using LLVM's ORC Engine) |
2.151 ± 0.012 | 2.129 | 2.165 | 1.02 ± 0.01 |
Benchmark for program linear_search
Open benchmarks
Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
---|---|---|---|---|
Cairo-vm (Rust, Cairo 1) |
563.1 ± 5.2 | 555.9 | 575.2 | 1.00 |
cairo-native (embedded AOT) |
2181.3 ± 10.6 | 2161.8 | 2202.4 | 3.87 ± 0.04 |
cairo-native (embedded JIT using LLVM's ORC Engine) |
2339.6 ± 24.1 | 2303.8 | 2382.4 | 4.15 ± 0.06 |
Benchmark for program logistic_map
Open benchmarks
Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
---|---|---|---|---|
Cairo-vm (Rust, Cairo 1) |
388.4 ± 10.6 | 378.8 | 416.6 | 1.00 |
cairo-native (embedded AOT) |
2352.3 ± 52.7 | 2289.9 | 2468.8 | 6.06 ± 0.21 |
cairo-native (embedded JIT using LLVM's ORC Engine) |
2474.0 ± 38.1 | 2435.3 | 2559.5 | 6.37 ± 0.20 |
This reverts commit 54b1f68.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like solving the OP type on runtime. It makes the code more difficult to read/maintain. Can we have 3 different functions, each solving a different operation? I think that unless the performance difference is significative, having 3 different functions is much better.
Also, could you add some execution and compilation benchmark data to the PR description?
I'll add some benchmarks. About the change you suggested, separating the switch into 3 functions makes the compilation a ~15% slower, which I think is a little high. That's why I reverted the change. I guess, the other approach enables more optimizations. I agree it would be better for maintenance. If we are willing to sacrifice that 15% in favor of more readability, I can change it back. |
0e4e2c0
to
a96317f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Should we execute some blocks to ensure correctness?
I think so. I did it before with a range of 20000 blocks and there was no issue. But since these last commits changed the code a little, I think we should. There were no substantial changes, but just to be sure |
Optimize circuit Compilation.
This PR changes how we handle arithmetic operations with circuits. Currently, for every gate, we extend all the operands to avoid overflows (to either u385 o u768), we generate the corresponding
arith
operations and then trunc the result back to u384.In x86, arithmetic operations need to go through a legalization pass. This is because x86 allows up to 2 registers per instruction, while addition, subtraction and multiplication operations need 3 registers. A circuit with enough gates can make compilation quite longer due to this.
To avoid having all these operations inlined, this PR creates function which takes care of performing the arithmetic and return the result ready to be used. This allow to significantly reduce the compilation time since the go from thousands of
arith
operations to less than 10, and very little work for the legalization pass to perform.However, this comes with the downside of adding an extra indirection. Now every circuit operation reduces to almost one function call, with the exception of the inversion operation which was left as it is now since it already had an extra indirection. This generates a regression in execution time.
Benchmarks
In both cases, compilation and execution, classes were compiled using
--opt-level 2
. Some classes may have no execution benchmarks, this is because there were no transactions invoking them.Introduces Breaking Changes?
No.
starknet-blocks.yml
workflow to use these PRs.These PRs should be merged after this one right away, in that order.
Checklist