feat(test-fill): speed up filling #2079

fselmo · 2026-01-27T00:30:56Z

🗒️ Description

Build the index incrementally on the fly, across runners, and merge at the end. This removes the bottleneck at the end of filling, where all files were re-read in a single main thread.

Defer index model validation until merge, append-only:

Claude summary is worth it here:

Before: Each write_partial_index() call (which happens per module scope) had to:
  1. Acquire FileLock
  2. Read the entire existing JSON file
  3. Parse it (json.loads)
  4. Append new entries to the list
  5. Serialize the entire thing back (json.dumps)
  6. Write the whole file

  Every write re-reads and re-writes everything written so far. For a worker that processes 500 modules, the 500th write re-serializes all entries from the previous 499.

After: Each call just:
  1. Acquire FileLock
  2. Open file in append mode ("a")
  3. Write new lines

  No reading, no parsing, no re-serialization of existing data. O(new entries) per write instead of O(all entries so far).
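For readers who want to see the before/after side by side, here is a minimal sketch of the two write strategies. It is not the code from this PR: the function names, the entry shape, and the file names are hypothetical, and the FileLock step is left out to keep the example stdlib-only.

```python
import json
import tempfile
from pathlib import Path


def write_by_rewriting(path: Path, new_entries: list[dict]) -> None:
    """Before: read, parse, extend and re-serialize everything written so far."""
    existing = json.loads(path.read_text()) if path.exists() else []
    existing.extend(new_entries)
    path.write_text(json.dumps(existing))


def write_by_appending(path: Path, new_entries: list[dict]) -> None:
    """After: append one JSON document per line; existing data is never read."""
    with path.open("a", encoding="utf-8") as f:
        for entry in new_entries:
            f.write(json.dumps(entry) + "\n")


def read_appended(path: Path) -> list[dict]:
    """Parse the JSON-lines file once, for example at merge time."""
    with path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def demo() -> tuple[list[dict], list[dict]]:
    """Write the same (hypothetical) entries both ways and return both results."""
    with tempfile.TemporaryDirectory() as tmp:
        old, new = Path(tmp) / "index.json", Path(tmp) / "index.jsonl"
        for module in range(3):
            batch = [{"module": module, "test": t} for t in range(2)]
            write_by_rewriting(old, batch)
            write_by_appending(new, batch)
        return json.loads(old.read_text()), read_appended(new)
```

Both strategies end up with the same entries; the difference is that the append version only touches the new lines on each call, so validation and parsing can be deferred to a single pass at merge time.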

Use a trie-building approach for HashableItem.from_index_entries() and add tests to ensure the same hashes are produced, going from O(n^2) to O(n):
- Every folder in the hierarchy used to scan all files twice. From Claude: Each entry is now inserted once by walking its path components (bounded depth, typically 3-5 levels). The conversion visits each trie node exactly once. Total: O(n). No redundant scanning, no relative_to(), no ValueError try/except.
Use --no-html for releases: we don't publish the HTML in the artifacts, so it is not needed (small win).
Validate once at IndexFile, not at every single file (one validation instead of potentially 100k+).
Distribute slow-marked tests to workers first so we don't hang at the end of test runs (no bricked slow tail).
Also build fixtures in parts via workers and merge at the end. This significantly reduces the test teardown phase (seeing some improvements from ~80s to ~1s in some longer tests).
Turn on --durations=50 for the py3 run so we can see the slowest 50 durations. Anything in the ~200s and above range was marked as a slow test in this PR.
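To illustrate the trie idea from the first item, here is a rough sketch under stated assumptions. It is not HashableItem.from_index_entries() itself: the helper names `build_trie` and `hash_node` are hypothetical, entries are assumed to be a mapping of POSIX-style relative paths to per-file hashes, and the folder hash (SHA-256 over child hashes in sorted name order) is an assumed scheme chosen only for the example.

```python
import hashlib
from pathlib import PurePosixPath


def build_trie(entries: dict[str, bytes]) -> dict:
    """Insert each (path, hash) pair once by walking its path components."""
    root: dict = {}
    for path, leaf_hash in entries.items():
        node = root
        *folders, filename = PurePosixPath(path).parts
        for folder in folders:
            # Create the folder node on first sight, reuse it afterwards.
            node = node.setdefault(folder, {})
        node[filename] = leaf_hash  # leaves hold bytes, folders hold dicts
    return root


def hash_node(node) -> bytes:
    """Hash a trie node bottom-up; every node is visited exactly once."""
    if isinstance(node, bytes):
        return node
    digest = hashlib.sha256()
    for name in sorted(node):  # sorted for a deterministic result
        digest.update(hash_node(node[name]))
    return digest.digest()
```

Because each entry only walks its own path components and each node is hashed once, no folder ever rescans the full file list, which is where the quadratic cost came from. Sorting the child names makes the result independent of insertion order, which is the kind of property the "same hashes" tests would want to pin down.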

✅ Checklist

All: Ran fast tox checks to avoid unnecessary CI fails, see also Code Standards and Enabling Pre-commit Checks:
```
uvx tox -e static
```
All: PR title adheres to the repo standard - it will be used as the squash commit message and should start type(scope):.
All: Considered updating the online docs in the ./docs/ directory.
All: Set appropriate labels for the changes (only maintainers can apply labels).

Cute Animal Picture

codecov · 2026-01-27T01:23:07Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 86.07%. Comparing base (7c8ec4f) to head (9e11e4e).
⚠️ Report is 2 commits behind head on forks/amsterdam.

Additional details and impacted files

@@                 Coverage Diff                 @@
##           forks/amsterdam    #2079      +/-   ##
===================================================
- Coverage            86.14%   86.07%   -0.07%     
===================================================
  Files                  599      599              
  Lines                39472    39472              
  Branches              3780     3780              
===================================================
- Hits                 34002    33977      -25     
- Misses                4848     4862      +14     
- Partials               622      633      +11

| Flag | Coverage Δ | |
|---|---|---|
| unittests | `86.07% <ø> (-0.07%)` | ⬇️ |

marioevz

Just a few comments but overall looks excellent, thanks!

packages/testing/src/execution_testing/fixtures/collector.py

packages/testing/src/execution_testing/cli/gen_index.py

fselmo · 2026-01-27T18:49:14Z

Thanks @marioevz! I believe I added all your suggestions, if you can double check the last commit 👀

fselmo · 2026-01-28T01:03:14Z

Still looking into some things after rebasing with #2080, will re-mark as ready when it's ready to go.

fselmo · 2026-01-28T17:37:09Z

@marioevz ok this is ready for review again!

- xdist workers were writing to the same fixture JSON file causing O(n²) work... each worker had to read, parse, and rewrite all previous entries.
- now workers write to their own partial JSONL file (append-only, O(1))
  - test_blob_txs.partial.gw0.jsonl
  - test_blob_txs.partial.gw1.jsonl
  - etc.
- ... and at session end, ``merge_partial_fixture_files()`` combines all partials into the final JSON file

Test teardown on some tests dropped from ~80s to ~1s
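A sketch of how per-worker partials and the session-end merge could look, assuming each JSONL line holds one `{"key": ..., "value": ...}` record. The record shape and the helper names `append_fixture` and `merge_partials` are hypothetical; the real ``merge_partial_fixture_files()`` may differ in signature and behavior.

```python
import json
import tempfile
from pathlib import Path


def append_fixture(
    output_dir: Path, stem: str, worker_id: str, test_id: str, fixture: dict
) -> None:
    """Append one fixture to this worker's own partial file (no shared file)."""
    partial = output_dir / f"{stem}.partial.{worker_id}.jsonl"
    with partial.open("a", encoding="utf-8") as f:
        f.write(json.dumps({"key": test_id, "value": fixture}) + "\n")


def merge_partials(output_dir: Path, stem: str) -> Path:
    """Combine all partial files for `stem` into one JSON file, then remove them."""
    merged: dict[str, dict] = {}
    for partial in sorted(output_dir.glob(f"{stem}.partial.*.jsonl")):
        with partial.open(encoding="utf-8") as f:
            for line in f:
                if line.strip():
                    record = json.loads(line)
                    merged[record["key"]] = record["value"]
        partial.unlink()  # the partial is no longer needed once merged
    final = output_dir / f"{stem}.json"
    final.write_text(json.dumps(merged, indent=4, sort_keys=True), encoding="utf-8")
    return final
```

Since every worker appends only to its own file, no worker has to read or rewrite another worker's output, and the single read-and-serialize pass happens once at the end.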

fselmo · 2026-01-29T00:12:17Z

@marioevz or anyone reviewing... the last commit here essentially removes the smaller gains from two commits before it (here) with pre-serialization (holding in memory).

I only did this to help our good friend, the pypy3 CI run ::dies inside::, as this was happening again - which is a similar thing we ran into when attempting to turn on --until=Amsterdam recently. It seems the runners can't handle memory independently and workers start crashing and brick the run.

If we find another solution to help pypy, that commit took 80s test teardowns to 60s, which the following commit for partial fixture building across workers (here) brought down to ~1s. Removing this small gain, these same tests are still at ~5s teardown so it's still a huge gain. It's quite unfortunate to have to do this just for pypy backend... but let's see if it even helps. I will wait until this CI runs and see what we can do here. Will put this back to draft for now.

fselmo · 2026-01-29T03:29:28Z

Putting some py3 CI performance comparisons here so we can compare cost / benefit:

* Before PR [feat(test-cli): add `hasher compare` subcommand #2080](https://github.com/ethereum/execution-specs/pull/2080) ---> [0:59:20](https://github.com/ethereum/execution-specs/actions/runs/21374446126/job/61527159243#step:5:5476)

* After PR [feat(test-cli): add `hasher compare` subcommand #2080](https://github.com/ethereum/execution-specs/pull/2080) was merged ---> ([0:54:15](https://github.com/ethereum/execution-specs/actions/runs/21410127576/job/61644855811#step:5:5476))

* This PR before this last commit that removed the pre-serialization ---> ([0:36:57](https://github.com/ethereum/execution-specs/actions/runs/21451210455/job/61780369075?pr=2079#step:5:2519))

* With the last commit (removes the small pre-serialization performance win) ---> [0:38:40](https://github.com/ethereum/execution-specs/actions/runs/21460114169/job/61810524445#step:5:2519)

Seems like there's less than a 2min difference between keeping the pre-serialization and dropping it, and dropping it seems to be better for pypy3. I feel this is a decent enough compromise to get this PR through, but if we feel like keeping it since pypy3 runs will improve, I'll leave this to the reviewer. This is ready for review again 😅.

marioevz

LGTM, testing this locally it really helps out to speed up the process!

fselmo added the C-feat (Category: an improvement or new feature) and A-test-fill (Area: execution_testing.cli.pytest_commands.plugins.filler) labels Jan 27, 2026

marioevz self-requested a review January 27, 2026 14:15

fselmo marked this pull request as ready for review January 27, 2026 17:45

marioevz reviewed Jan 27, 2026

fselmo added a commit to fselmo/execution-specs that referenced this pull request Jan 27, 2026

refactor: changes from comments on PR ethereum#2079

2e5ae21

fselmo added a commit to fselmo/execution-specs that referenced this pull request Jan 27, 2026

refactor: changes from comments on PR ethereum#2079

c47b5af

fselmo force-pushed the feat/speed-up-filling branch from 2e5ae21 to c47b5af Compare January 27, 2026 18:57

fselmo added 6 commits January 27, 2026 14:20

feat(test-fill): Build index incrementally, not at the end of fill

d53a9e3

perf(test): defer index model validation

696faef

perf(tests): O(n²) to O(n) trie-building approach for perf

6405410

perf(test-fill): Use --no-html; releases don't publish these (needless overhead)

fc81785

refactor: cleanup; add more hasher compat tests

b93d2ed

refactor(perf): validate at IndexFile once, not at every single file

1ce1776

fselmo added a commit to fselmo/execution-specs that referenced this pull request Jan 27, 2026

refactor: changes from comments on PR ethereum#2079

9e23f59

fselmo force-pushed the feat/speed-up-filling branch from c47b5af to 9e23f59 Compare January 27, 2026 21:30

fselmo marked this pull request as draft January 28, 2026 01:02

fselmo added a commit to fselmo/execution-specs that referenced this pull request Jan 28, 2026

refactor: changes from comments on PR ethereum#2079

c46143e

fselmo force-pushed the feat/speed-up-filling branch 8 times, most recently from a7a1d61 to 90bfd72 Compare January 28, 2026 16:25

fselmo added a commit to fselmo/execution-specs that referenced this pull request Jan 28, 2026

refactor: changes from comments on PR ethereum#2079

64e7399

fselmo force-pushed the feat/speed-up-filling branch from 90bfd72 to fa66382 Compare January 28, 2026 17:36

fselmo marked this pull request as ready for review January 28, 2026 17:37

fselmo requested a review from marioevz January 28, 2026 17:37

fselmo added 5 commits January 28, 2026 11:46

feat(perf): distribute slow-marked tests early to runners; avoid long tail - mark more slow tests

d6e008a

refactor: changes from comments on PR ethereum#2079

bd5c21b

chore(ci): show the slowest 50 tests on every py3 run

0d2437f

refactor(perf): pre-serialize fixture JSON while workers parallelize

d4e3f5a

fselmo force-pushed the feat/speed-up-filling branch from fa66382 to 00a835c Compare January 28, 2026 18:47

fix: remove pre-serialization small win to help pypy runs

9e11e4e

fselmo marked this pull request as draft January 29, 2026 00:12

fselmo marked this pull request as ready for review January 29, 2026 03:29

danceratopz self-requested a review January 29, 2026 13:47

marioevz approved these changes Jan 29, 2026

fselmo merged commit b68a532 into ethereum:forks/amsterdam Jan 29, 2026
22 of 23 checks passed

fselmo deleted the feat/speed-up-filling branch January 29, 2026 15:56
