-
Notifications
You must be signed in to change notification settings - Fork 820
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve stability of CI tests #878
Conversation
* Set default nat mode to NONE instead of AUTO * Ignore the flakey rocksdb unit test. Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
* dump memory before shutting down cluster Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
* plumb option to CLI via -X command Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
In both the threaded nodes and the harness set the entroy gathering device URL to empty string so the default SecureRandom will be DRBG using non-blocking randomness. Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
… circleci Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
Oh, yea. That was me. Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
...c/main/java/org/hyperledger/besu/ethereum/p2p/discovery/internal/IndirectVertxTimerUtil.java
Outdated
Show resolved
Hide resolved
timers.put( | ||
id, | ||
secheduledExecutor.scheduleAtFixedRate( | ||
() -> vertx.executeBlocking(e -> handler.handle(), r -> {}), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One issue to investigate here is whether this will cause threading issues. IIRC, VertexTimerUtil
guarantees the timed callbacks happen in the calling thread, and I think we need that as the code is currently written ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would be the event loop or io thread in this case. Doing cryptography in the event loop or io thread could be a limiter to the number of concurrent peers we can service, especially when it could block on randomness.
While I haven't proven it yet I think this (too much processing, like walking a 4000 entry hash table) is what's causing our eth/65 issues, so getting stuff off of the io thread looks to be a long term goal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense. We just need to make sure that the code we're executing with this timer is thread-safe.
...c/main/java/org/hyperledger/besu/ethereum/p2p/discovery/internal/IndirectVertxTimerUtil.java
Outdated
Show resolved
Hide resolved
Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
flakiness-master failures: 118/25 avg:4.72
refs/pull/878/head failures: 36/14 avg:2.5714285714285716
The burn-in testing on this looks good. When you drill into the logs I attached, you'll see that there's only one kind of failure in this branch and it's due to something @shemnon said he can roll back.
878-burn-in-results.zip
Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
Because it's just a single test that is consistently flakey I |
Ok, in that case let's add the "fixes ..." for all the https://github.com/hyperledger/besu/issues?q=is%3Aopen+is%3Aissue+label%3Aflake |
* Use non-blocking randomness for acceptance tests This addresses entropy draining during unit tests. * wait 30 seconds instead of 2 when killing AT processes * mark NodeSmartContractPermissioningIbftStallAcceptanceTest as @ignore since it has become reliably and specifically flakey. Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com> Signed-off-by: Sally MacFarlane <sally.macfarlane@consensys.net>
Signed-off-by: Danno Ferrin danno.ferrin@gmail.com