Skip to content

Conversation

@fselmo
Copy link
Contributor

@fselmo fselmo commented Jan 27, 2026

🗒️ Description

  • Build index incrementally on the fly, across runners, and merge at the end. Remove the bottleneck at the end of filling, re-reading all files in a single main thread.
  • Defer index model validation until merge, append-only:
    • Claude summary is worth it here:
      Before: Each write_partial_index() call (which happens per module scope) had to:                                                                                                                                                                                                                                          
        1. Acquire FileLock                                                                                                                                                                                                                                                                                                       
        2. Read the entire existing JSON file                                                                                                                                                                                                                                                                                     
        3. Parse it (json.loads)                                                                                                                                                                                                                                                                                                  
        4. Append new entries to the list                                                                                                                                                                                                                                                                                         
        5. Serialize the entire thing back (json.dumps)                                                                                                                                                                                                                                                                           
        6. Write the whole file                                                                                                                                                                                                                                                                                                   
                                                                                                                                                                                                                                                                                                                            
        Every write re-reads and re-writes everything written so far. For a worker that processes 500 modules, the 500th write re-serializes all entries from the previous 499.                                                                                                                                                   
                                                                                                                                                                                                                                                                                                                            
      After: Each call just:                                                                                                                                                                                                                                                                                                    
        1. Acquire FileLock                                                                                                                                                                                                                                                                                                       
        2. Open file in append mode ("a")                                                                                                                                                                                                                                                                                         
        3. Write new lines                                                                                                                                                                                                                                                                                                        
                                                                                                                                                                                                                                                                                                                            
        No reading, no parsing, no re-serialization of existing data. O(new entries) per write instead of O(all entries so far).```
      
  • Use a trie-building approach for HashableItem.from_index_entries() and add tests to ensure same hashes are produced - from O(n^2) to O(n):
    • Every folder in the hierarchy used to scan all files twice. From Claude: Each entry is now inserted once by walking its path components (bounded depth, typically 3-5 levels). The conversion visits each trie node exactly once. Total: O(n). No redundant scanning, no relative_to(), no ValueError try/except.
  • Use --no-html for releases since we don't publish the html in the artifacts, this is not needed (small win)
  • Validate at IndexFile once, not at every single file (just one validation instead of potentially +100k times)
  • Distribute slow-marked tests to workers first so we don't hang at the end of test runs (no bricked slow tail)
  • Also build fixtures in parts via workers and merge at the end... this reduces test teardown phase significantly (seeing some improvements from ~80s to ~1` in some longer tests).
  • Turns on --durations=50 for py3 run so we can see the slowest 50 durations. Anything in the ~200s and above range was marked as a slow test in this PR.

✅ Checklist

  • All: Ran fast tox checks to avoid unnecessary CI fails, see also Code Standards and Enabling Pre-commit Checks:
    uvx tox -e static
  • All: PR title adheres to the repo standard - it will be used as the squash commit message and should start type(scope):.
  • All: Considered updating the online docs in the ./docs/ directory.
  • All: Set appropriate labels for the changes (only maintainers can apply labels).

Cute Animal Picture

Screenshot 2026-01-27 at 10 46 09

@fselmo fselmo added C-feat Category: an improvement or new feature A-test-fill Area: execution_testing.cli.pytest_commands.plugins.filler labels Jan 27, 2026
@codecov
Copy link

codecov bot commented Jan 27, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 86.07%. Comparing base (7c8ec4f) to head (9e11e4e).
⚠️ Report is 2 commits behind head on forks/amsterdam.

Additional details and impacted files
@@                 Coverage Diff                 @@
##           forks/amsterdam    #2079      +/-   ##
===================================================
- Coverage            86.14%   86.07%   -0.07%     
===================================================
  Files                  599      599              
  Lines                39472    39472              
  Branches              3780     3780              
===================================================
- Hits                 34002    33977      -25     
- Misses                4848     4862      +14     
- Partials               622      633      +11     
Flag Coverage Δ
unittests 86.07% <ø> (-0.07%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@marioevz marioevz self-requested a review January 27, 2026 14:15
@fselmo fselmo marked this pull request as ready for review January 27, 2026 17:45
Copy link
Member

@marioevz marioevz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few comments but overall looks excellent, thanks!

fselmo added a commit to fselmo/execution-specs that referenced this pull request Jan 27, 2026
@fselmo
Copy link
Contributor Author

fselmo commented Jan 27, 2026

Thanks @marioevz! I believe I added all your suggestions, if you can double check the last commit 👀

fselmo added a commit to fselmo/execution-specs that referenced this pull request Jan 27, 2026
@fselmo fselmo force-pushed the feat/speed-up-filling branch from 2e5ae21 to c47b5af Compare January 27, 2026 18:57
fselmo added a commit to fselmo/execution-specs that referenced this pull request Jan 27, 2026
@fselmo fselmo force-pushed the feat/speed-up-filling branch from c47b5af to 9e23f59 Compare January 27, 2026 21:30
@fselmo fselmo marked this pull request as draft January 28, 2026 01:02
@fselmo
Copy link
Contributor Author

fselmo commented Jan 28, 2026

Still looking into some things after rebasing with #2080, will re-mark as ready when it's ready to go.

fselmo added a commit to fselmo/execution-specs that referenced this pull request Jan 28, 2026
@fselmo fselmo force-pushed the feat/speed-up-filling branch 8 times, most recently from a7a1d61 to 90bfd72 Compare January 28, 2026 16:25
fselmo added a commit to fselmo/execution-specs that referenced this pull request Jan 28, 2026
@fselmo fselmo force-pushed the feat/speed-up-filling branch from 90bfd72 to fa66382 Compare January 28, 2026 17:36
@fselmo fselmo marked this pull request as ready for review January 28, 2026 17:37
@fselmo
Copy link
Contributor Author

fselmo commented Jan 28, 2026

@marioevz ok this is ready for review again!

@fselmo fselmo requested a review from marioevz January 28, 2026 17:37
- xdist workers were writing to the same fixture JSON file causing O(n²) work...
  each worker had to read, parse, and rewrite all previous entries.

- now workers write to their own partial JSONL file (append-only, O(1))
  - test_blob_txs.partial.gw0.jsonl
  - test_blob_txs.partial.gw1.jsonl
  - etc.
  .. and at session end, ``merge_partial_fixture_files()`` combines all partials
  into the final JSON file

Test teardown on some tests dropped from ~80s to ~1s
@fselmo fselmo force-pushed the feat/speed-up-filling branch from fa66382 to 00a835c Compare January 28, 2026 18:47
@fselmo fselmo marked this pull request as draft January 29, 2026 00:12
@fselmo
Copy link
Contributor Author

fselmo commented Jan 29, 2026

@marioevz or anyone reviewing... the last commit here essentially removes the smaller gains from two commits before it (here) with pre-serialization (holding in memory).

I only did this to help our good friend, the pypy3 CI run ::dies inside::, as this was happening again - which is a similar thing we ran into when attempting to turn on --until=Amsterdam recently. It seems the runners can't handle memory independently and workers start crashing and brick the run.

If we find another solution to help pypy, that commit took 80s test teardowns to 60s, which the following commit for partial fixture building across workers (here) brought down to ~1s. Removing this small gain, these same tests are still at ~5s teardown so it's still a huge gain. It's quite unfortunate to have to do this just for pypy backend... but let's see if it even helps. I will wait until this CI runs and see what we can do here. Will put this back to draft for now.


Putting some py3 CI performance comparisons here so we can compare cost / benefit:

@fselmo
Copy link
Contributor Author

fselmo commented Jan 29, 2026

Putting some py3 CI performance comparisons here so we can compare cost / benefit:

* Before PR [feat(test-cli): add `hasher compare` subcommand #2080](https://github.com/ethereum/execution-specs/pull/2080) ---> [0:59:20](https://github.com/ethereum/execution-specs/actions/runs/21374446126/job/61527159243#step:5:5476)

* After PR [feat(test-cli): add `hasher compare` subcommand #2080](https://github.com/ethereum/execution-specs/pull/2080) was merged ---> ([0:54:15](https://github.com/ethereum/execution-specs/actions/runs/21410127576/job/61644855811#step:5:5476))

* This PR before this last commit that removed the pre-serialization ---> ([0:36:57](https://github.com/ethereum/execution-specs/actions/runs/21451210455/job/61780369075?pr=2079#step:5:2519))

* With the last commit (removes the small pre-serialization performance win) ---> [0:38:40](https://github.com/ethereum/execution-specs/actions/runs/21460114169/job/61810524445#step:5:2519)

Seems like there's less than a 2min difference between dropping the pre-serialization and not and it seems to be better for pypy3. I feel this is a decent enough compromise to get this PR through but if we feel like keeping it since pypy3 runs will improve, I'll leave this to the reviewer. This is ready for review again 😅.

@fselmo fselmo marked this pull request as ready for review January 29, 2026 03:29
@danceratopz danceratopz self-requested a review January 29, 2026 13:47
Copy link
Member

@marioevz marioevz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, testing this locally it really helps out to speed up the process!

@fselmo fselmo merged commit b68a532 into ethereum:forks/amsterdam Jan 29, 2026
22 of 23 checks passed
@fselmo fselmo deleted the feat/speed-up-filling branch January 29, 2026 15:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-test-fill Area: execution_testing.cli.pytest_commands.plugins.filler C-feat Category: an improvement or new feature

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants