Skip to content
This repository has been archived by the owner on Jul 1, 2021. It is now read-only.

tests: update to match py-evm: don't skip Constantinople, add Petersburg, use --fork; break out long-running categories #608

Merged
merged 8 commits into from
May 21, 2019

Conversation

veox
Copy link
Contributor

@veox veox commented May 14, 2019

What was wrong?

Tests for network=='Constantinople' are being skipped.

The pinned git submodule is for https://github.com/ethereum/tests/tree/420f443477caa8516f1f9ee8122fafc3415c0f34 (same as tag v6.0.0-beta.2, which is not even latest), whereas py-evm has https://github.com/ethereum/tests/tree/6b85703b568f4456582a00665d8a3e5c3b20b484 (untagged commit slightly after v6.0.0-beta.3, which itself does not include ConstantinopleFix - which why we're using something "slightly after").

How was it fixed?

  • Skip network=='Istanbul' (leave placeholder).
  • Use same fixtures commit as py-evm.
  • Add petersburg jobs to CI.
  • Use pytest --fork arg (copy-paste! from py-evm).
  • Address test failures (the usual disparity on "synthetic state").
  • Address long CI run times:
    • break out categories into their own CI jobs:
      • zero-knowledge;
      • SSTORE;
    • switch these break-out categories to use Python 3.7 (a tiny bit more performant than 3.6).

To-Do

  • Clean up commit history

Cute Animal Picture

Felicia the ferret slides out of a wave guide

Source: Felicia the ferret of Fermilab; via Atlas Obscura

@veox
Copy link
Contributor Author

veox commented May 14, 2019

Failure is in the usual "incorrect upstream" disagreement.

@veox veox force-pushed the dont-skip-constantinople-tests branch from dfd036b to 4addf50 Compare May 14, 2019 14:58
@veox
Copy link
Contributor Author

veox commented May 14, 2019

Enabled ConstantinopleFix, too, out of curiosity; but no actual CI job configured for that, yet).


EDIT: Oh, wait; we're pinned to a version of fixtures that doesn't have ConstantinopleFix yet. 😆

@veox veox force-pushed the dont-skip-constantinople-tests branch from 4addf50 to c7cc127 Compare May 14, 2019 15:13
@@ -115,6 +115,24 @@
# The result is in conflict with the yellow-paper:
# * https://github.com/ethereum/py-evm/pull/1224#issuecomment-418800369
('GeneralStateTests/stRevertTest/RevertInCreateInInit_d0g0v0.json', 'RevertInCreateInInit_d0g0v0_Byzantium'), # noqa: E501
('GeneralStateTests/stRevertTest/RevertInCreateInInit_d0g0v0.json', 'RevertInCreateInInit_d0g0v0_Constantinople'), # noqa: E501
Copy link
Contributor Author

@veox veox May 14, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This INCORRECT_UPSTREAM_TESTS blob is the same as the one in py-evm. It matches 1:1 the particular py-evm version that trinity is pinned to.

Wonder if we should make it more accessible in py-evm so it can be imported here instead of copy-pasting.


Added a TODO comment so this is not forgotten.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you up for making the changes and/or making an issue in py-evm to expose this API?

Copy link
Contributor Author

@veox veox May 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll make issues, as you outlined in:

Yeah, to avoid committing the TODO's to master, how about an issue in trinity that links to an issue in py-evm which lists the methods to expose in the API.


@carver
Copy link
Contributor

carver commented May 14, 2019

Skip network=='ConstantinopleFix' instead.

Wait, shouldn't we be running Constantinople(Fix) tests? I thought everything was passing now.

@veox
Copy link
Contributor Author

veox commented May 15, 2019

We've not even pinned the git submodule for fixtures that has ConstantinopleFix. That happened in py-evm only. So it is being tested, just not in this repo.

The latest state of this PR skips network=='Istanbul', which is equally unrepresented in the fixtures purely as a placeholder.

Updated PR summary with current state of affairs.

@veox veox changed the title [WIP] tests: don't skip fixtures for network 'Constantinople' [WIP] tests: update to match py-evm (don't skip fixtures for Constantinople, add fixtures for Petersburg) May 15, 2019
@veox veox force-pushed the dont-skip-constantinople-tests branch from c49dfe1 to 56ea99f Compare May 15, 2019 14:11
tox.ini Outdated Show resolved Hide resolved
@veox veox force-pushed the dont-skip-constantinople-tests branch from 845cd46 to eb5220a Compare May 16, 2019 11:01
@veox
Copy link
Contributor Author

veox commented May 16, 2019

Constantinople and Petersburg CI runs take >30 minutes on my free CircleCI account. Adding entries to SLOW_TESTS...

'Call1024PreCalls_d0g0v0_EIP150',
'Call1024PreCalls_d0g0v0_Byzantium',
'Call1024PreCalls_d0g0v0_EIP150',
'Call1024PreCalls_d0g0v0_EIP158',
Copy link
Contributor Author

@veox veox May 16, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The above six lines are duplicates of the first three lines in the list: therefore removed.

@veox veox force-pushed the dont-skip-constantinople-tests branch 2 times, most recently from 260613c to 75aa66c Compare May 16, 2019 13:30
'ForkStressTest_Byzantium',
'Create2Recursive_d0g0v0',
'Create2Recursive_d0g1v0',
'ForkStressTest',
Copy link
Contributor Author

@veox veox May 16, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is slow across all forks - however, removal of fork qualifier means that any fixture that has this string will be marked SLOW.

Is this acceptable?

@veox
Copy link
Contributor Author

veox commented May 16, 2019

PR ready for review, I guess. (Test runs might still take very long, though.)

@veox veox changed the title [WIP] tests: update to match py-evm (don't skip fixtures for Constantinople, add fixtures for Petersburg) tests: update to match py-evm (don't skip fixtures for Constantinople, add fixtures for Petersburg) May 16, 2019
@carver
Copy link
Contributor

carver commented May 16, 2019

Hm, yeah, 30min is pretty long. Byzantium only takes 6min:

============ 4698 passed, 4751 skipped, 1 xfailed in 373.23 seconds ============

Constantinople:

========== 10323 passed, 10384 skipped, 5 xfailed in 1920.96 seconds ===========

I just checked the native runs in py-evm, and the total test counts look about right. (wow 2x more tests!) Interestingly, Constantinople is less than 2x slower in native, but >5x slower over RPC.

There aren't anymore really long individual tests left, so I don't see an obvious way to quickly shorten the long run. :( It should help a bit when we release & import the latest py-evm. But since most of the time seems to be RPC overhead, I guess it won't help that much.

Copy link
Contributor

@carver carver left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, to avoid committing the TODO's to master, how about an issue in trinity that links to an issue in py-evm which lists the methods to expose in the API.

30+ minute test jobs make me sad, but I'm not sure where to start on optimizing those yet. My first guess is that it's somewhere in lahja, since I don't remember RPC adding that much overhead before it was handled by events. cc @cburgdorf @lithp

@@ -115,6 +115,24 @@
# The result is in conflict with the yellow-paper:
# * https://github.com/ethereum/py-evm/pull/1224#issuecomment-418800369
('GeneralStateTests/stRevertTest/RevertInCreateInInit_d0g0v0.json', 'RevertInCreateInInit_d0g0v0_Byzantium'), # noqa: E501
('GeneralStateTests/stRevertTest/RevertInCreateInInit_d0g0v0.json', 'RevertInCreateInInit_d0g0v0_Constantinople'), # noqa: E501
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you up for making the changes and/or making an issue in py-evm to expose this API?

@cburgdorf
Copy link
Contributor

cburgdorf commented May 16, 2019

My first guess is that it's somewhere in lahja, since I don't remember RPC adding that much overhead before it was handled by events. cc @cburgdorf @lithp

@carver If I'm not mistaken, the only thing that the JSON-RPC server uses the event bus for is to make requests to the PeerPool (e.g. to get the current peer count) but that shouldn't come into play here I guess. Worth taking a closer look into this.

@carver
Copy link
Contributor

carver commented May 16, 2019

Yeah, you're right I was conflating two things: the event bus, and rpc running as an isolated plugin. I should have said the latter. So the overhead would be about sending the chain requests over multiprocessing.

@cburgdorf
Copy link
Contributor

So the overhead would be about sending the chain requests over multiprocessing

Mmh..that should not have changed the tests. It's always been that way that the RPC tests would connect via IPC and make the request. In fact, the RPC tests do not even know about the plugin. They just spin up the IPCServer directly (nothing changed about that).

trinity/tests/conftest.py

Lines 298 to 320 in a37f26c

@pytest.mark.asyncio
@pytest.fixture
async def ipc_server(
monkeypatch,
event_bus,
jsonrpc_ipc_pipe_path,
event_loop,
chain_with_block_validation):
"""
This fixture runs a single RPC server over IPC over
the course of all tests. It yields the IPC server only for monkeypatching purposes
"""
rpc = RPCServer(
initialize_eth1_modules(chain_with_block_validation, event_bus)
)
ipc_server = IPCServer(rpc, jsonrpc_ipc_pipe_path, loop=event_loop)
asyncio.ensure_future(ipc_server.run(), loop=event_loop)
try:
yield ipc_server
finally:
await ipc_server.cancel()

@carver
Copy link
Contributor

carver commented May 16, 2019

So the overhead would be about sending the chain requests over multiprocessing

Mmh..that should not have changed the tests. It's always been that way that the RPC tests would connect via IPC and make the request. In fact, the RPC tests do not even know about the plugin. They just spin up the IPCServer directly (nothing changed about that).

trinity/tests/conftest.py

Lines 298 to 320 in a37f26c

@pytest.mark.asyncio
@pytest.fixture
async def ipc_server(
monkeypatch,
event_bus,
jsonrpc_ipc_pipe_path,
event_loop,
chain_with_block_validation):
"""
This fixture runs a single RPC server over IPC over
the course of all tests. It yields the IPC server only for monkeypatching purposes
"""
rpc = RPCServer(
initialize_eth1_modules(chain_with_block_validation, event_bus)
)
ipc_server = IPCServer(rpc, jsonrpc_ipc_pipe_path, loop=event_loop)
asyncio.ensure_future(ipc_server.run(), loop=event_loop)
try:
yield ipc_server
finally:
await ipc_server.cancel()

I'm talking about the time before RPCServer was moved to a BaseIsolatedPlugin and just ran in the main process. Quite a while ago, now.

Edit: last Aug according to ethereum/py-evm@1006923

Edit 2: Okay, so not as big effect as I expected from ethereum/py-evm#1212 -- the previous PR to merge was ethereum/py-evm#1297

  • embedded
    • py36-native-state-byzantium: 1:31
    • py36-rpc-state-byzantium: 4:32 (~3x)
  • isolated plugin
    • py36-native-state-byzantium: 2:00
    • py36-rpc-state-byzantium: 6:40 (~3.3x)

So something is still missing about why the rpc gap widened from 3x to 5x since then...

@cburgdorf
Copy link
Contributor

I'm talking about the time before RPCServer was moved to a BaseIsolatedPlugin and just ran in the main process. Quite a while ago, now.

Yes, but what I mean is: From the perspective of these tests, nothing has changed when the RPCServer moved into an BaseIsolatedPlugin. Afaik, the tests do not spin up a Trinity instance but instead just manually instantiate the RPCServer. In fact, I think the RPCServer in the tests just runs in the same process all the other tests run.

@carver
Copy link
Contributor

carver commented May 16, 2019

I'm talking about the time before RPCServer was moved to a BaseIsolatedPlugin and just ran in the main process. Quite a while ago, now.

Yes, but what I mean is: From the perspective of these tests, nothing has changed when the RPCServer moved into an BaseIsolatedPlugin. Afaik, the tests do not spin up a Trinity instance but instead just manually instantiate the RPCServer. In fact, I think the RPCServer in the tests just runs in the same process all the other tests run.

Ah right, which also explains why there wasn't much of a hit from ethereum/py-evm#1212

@veox
Copy link
Contributor Author

veox commented May 17, 2019

30+ minute test jobs make me sad, but I'm not sure where to start on optimizing those yet.

We could switch those CI jobs to run on Python 3.7 instead of 3.6...

That said, it might not be that big of a deal for someone not using CircleCI's free tier.


Another approach is to run some classes of tests separately, similar to how stQuadraticComplexityTest are already done; candidates:

  • ZeroKnowledge (>1300 items, many last 0.15 seconds each; easily 200 seconds total);
  • recursive and stack exhaustion (~100 (orly?) items, up to a few seconds each; easily 120 seconds total).

I've made a separate branch to test out the latter hypothesis.

EDIT:

So, a matching reduction from the previous 30+ minutes. It's a viable strategy...


A yet another approach would be to avoid MiningChain in pytest, as these tests have to be set up so they can be marked skip (and therefore have to be torn down afterwards). Haven't measured how much of an effect this has on the overall run duration, though.

BTW, could it be that MiningChain is only a remnant of the testing that had to be done in py-evm, and it's a leftover in trinity? Or is this something that is used in Hive?..

Testing hypothesis as separate PR #623.

@carver
Copy link
Contributor

carver commented May 17, 2019

30+ minute test jobs make me sad, but I'm not sure where to start on optimizing those yet.

We could switch those CI jobs to run on Python 3.7 instead of 3.6...

👍

That said, it might not be that big of a deal for someone not using CircleCI's free tier.

Eh, max(len(job) for job in ci_jobs) still matters, even with high concurrency. So I like your next approach...

Another approach is to run some classes of tests separately, similar to how stQuadraticComplexityTest are already done; candidates:

  • ZeroKnowledge (>1300 items, many last 0.15 seconds each; easily 200 seconds total);
  • recursive and stack exhaustion (~100 (orly?) items, up to a few seconds each; easily 120 seconds total).

I've made a separate branch to test out the latter hypothesis.

EDIT:

So, a matching reduction from the previous 30+ minutes. It's a viable strategy...

Yeah, I'm definitely in favor of this direction.

@veox veox changed the title tests: update to match py-evm (don't skip fixtures for Constantinople, add fixtures for Petersburg) [WIP] tests: update to match py-evm (don't skip Constantinople, add Petersburg) May 19, 2019
@veox veox force-pushed the dont-skip-constantinople-tests branch 2 times, most recently from 0c27da1 to f516371 Compare May 19, 2019 13:16
@veox
Copy link
Contributor Author

veox commented May 19, 2019

Identified stSStoreTest as the largest offender. On a check-everything local pytest tests run, fixtures from that dir made up 5722 of 10000 slowest test runs.

Breaking them out further reduced Petersburg job run time to 04:10 (4:41 job total).

... But now the sstore CI job takes 19:20 to run (20:03 job total): it runs the tests for all forks. :/

PR #623 is extremely likely to help with this: setting up the tests also takes very long:

========================== slowest 50 test durations ===========================
1.31s setup    tests/json-fixtures-over-rpc/test_rpc_fixtures.py::test_rpc_against_fixtures[/home/circleci/repo/fixtures/BlockchainTests/GeneralStateTests/stSStoreTest/sstore_combinations_initial2_d656g0v0.json:sstore_combinations_initial2_d656g0v0_Constantinople:Constantinople-MiningChain]
1.28s setup    tests/json-fixtures-over-rpc/test_rpc_fixtures.py::test_rpc_against_fixtures[/home/circleci/repo/fixtures/BlockchainTests/GeneralStateTests/stSStoreTest/sstore_combinations_initial2_d384g0v0.json:sstore_combinations_initial2_d384g0v0_Constantinople:Constantinople-Chain]
1.28s setup    tests/json-fixtures-over-rpc/test_rpc_fixtures.py::test_rpc_against_fixtures[/home/circleci/repo/fixtures/BlockchainTests/GeneralStateTests/stSStoreTest/sstore_combinations_initial2_d680g0v0.json:sstore_combinations_initial2_d680g0v0_ConstantinopleFix:ConstantinopleFix-MiningChain]
1.28s setup    tests/json-fixtures-over-rpc/test_rpc_fixtures.py::test_rpc_against_fixtures[/home/circleci/repo/fixtures/BlockchainTests/GeneralStateTests/stSStoreTest/sstore_combinations_initial2_d466g0v0.json:sstore_combinations_initial2_d466g0v0_Constantinople:Constantinople-MiningChain]
1.27s setup    tests/json-fixtures-over-rpc/test_rpc_fixtures.py::test_rpc_against_fixtures[/home/circleci/repo/fixtures/BlockchainTests/GeneralStateTests/stSStoreTest/sstore_combinations_initial2_d976g0v0.json:sstore_combinations_initial2_d976g0v0_ConstantinopleFix:ConstantinopleFix-MiningChain]
1.25s setup    tests/json-fixtures-over-rpc/test_rpc_fixtures.py::test_rpc_against_fixtures[/home/circleci/repo/fixtures/BlockchainTests/GeneralStateTests/stSStoreTest/sstore_combinations_initial2_d643g0v0.json:sstore_combinations_initial2_d643g0v0_ConstantinopleFix:ConstantinopleFix-Chain]

Didn't check (yet) the test fixtures themselves as to why.

@veox veox force-pushed the dont-skip-constantinople-tests branch from 265723b to 3f5514d Compare May 20, 2019 20:18
@veox
Copy link
Contributor Author

veox commented May 21, 2019

Rebased on master to get PR #623; sstore job now takes 8-9 minutes (jobs 2675, 2684, 2685):

================== 11253 passed, 7 xfailed in 525.26 seconds ===================
================== 11253 passed, 7 xfailed in 506.88 seconds ===================
================== 11253 passed, 7 xfailed in 516.59 seconds ===================

After switching break-out categories to Python 3.7, sstore job takes 7-8 minutes (jobs 2716, 2718, 2719):

================== 11253 passed, 7 xfailed in 466.06 seconds ===================
================== 11253 passed, 7 xfailed in 436.75 seconds ===================
================== 11253 passed, 7 xfailed in 487.96 seconds ===================

I'm out of good ideas, but maybe that's fast enough?..

EDIT: py3{6,7}-eth2-fixtures takes that long, too, so - yeah, we're good.

@veox veox force-pushed the dont-skip-constantinople-tests branch from 6eec26c to 61a25fc Compare May 21, 2019 11:56
@veox
Copy link
Contributor Author

veox commented May 21, 2019

Cleaned up commits; unmarking WIP; ready for review.

@veox veox changed the title [WIP] tests: update to match py-evm (don't skip Constantinople, add Petersburg) [WIP] tests: update to match py-evm: don't skip Constantinople, add Petersburg, use --fork, break out long-running categories May 21, 2019
@veox veox changed the title [WIP] tests: update to match py-evm: don't skip Constantinople, add Petersburg, use --fork, break out long-running categories tests: update to match py-evm: don't skip Constantinople, add Petersburg, use --fork, break out long-running categories May 21, 2019
@veox veox changed the title tests: update to match py-evm: don't skip Constantinople, add Petersburg, use --fork, break out long-running categories tests: update to match py-evm: don't skip Constantinople, add Petersburg, use --fork; break out long-running categories May 21, 2019
Copy link
Member

@pipermerriam pipermerriam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 to :shipit:

Thanks for the cleanup

@@ -188,13 +205,33 @@ def blockchain_fixture_mark_fn(fixture_path, fixture_name):
return pytest.mark.xfail(reason="Listed in INCORRECT_UPSTREAM_TESTS.")


def generate_ignore_fn_for_fork(passed_fork):
if passed_fork:
passed_fork = passed_fork.lower()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit we'd normally do this as normalized_fork = passed_fork.lower() to avoid destroying the initial data.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll admit we aren't always perfectly consistent about these things. Thanks for the cleanup.

@pipermerriam
Copy link
Member

will merge when green. Thanks a bunch.

@pipermerriam pipermerriam merged commit c3778a5 into ethereum:master May 21, 2019
carver added a commit that referenced this pull request May 21, 2019
@veox veox deleted the dont-skip-constantinople-tests branch May 21, 2019 19:42
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants