Skip to content

Conversation

Mirko-von-Leipzig
Copy link
Collaborator

@Mirko-von-Leipzig Mirko-von-Leipzig commented Sep 18, 2025

This PR replaces the existing mempool implementation in an effort to enable user submitted batches #1112.

This PR leaves the mempool in an incomplete state, and as such some follow-up PRs will be made. This is already enough (or too much) to review at once. I'll stack the follow-up PRs on this one before we merge it all.

Status Quo

The original implementation operated on the assumption that all state could be represented as a DAG of transactions. Unfortunately this assumption no longer holds true with user submitted batches, invalidating the original design.

New design

The mempool is represented as a DAG of nodes, where a node can represent an unbatched transaction, a batch or a block respectively. The nodes reflect the active state of the system e.g. if a batch is selected, all the transaction nodes are replaced by the batch node. The new node retains the replaced nodes to (a) know what its state impact is, and (b) to allow reverting/re-inserting.

Edges in the DAG are stored by the inflight state (unimplemented). Each piece of state stores the node ID that created and consumed it. A node can therefore lookup its children by looking up the consumers of the data it created. And vice versa for parent lookup. This layer of indirection makes it easy to replace nodes in the graph e.g. replacing a set of transaction nodes with a single batch is simply updating the node ID associated with the node.

The original goal was decoupling the current transaction and batch graphs (#595), however this new design (imo) creates a more sensible single graph layout.

What's missing

The inflight state is not yet implemented; but shouldn't be overly complicated - I've implemented several similar on my way here already.

The user submitted batches aren't implemented yet; though I hope you can see it won't be difficult to squeeze into the current design. I've purposefully left this out to allow comparison with the current requirements.

Reverting isn't handled yet; this will be complicated a bit by user batches so I'm waiting until then to re-add that.

Mempool tests are disabled for now.

Mempool subscription will also need an overhaul to support user batches -- its still unclear how this should look.


Closes #595, closes #537, closes #594

Supersedes #1210.

@Mirko-von-Leipzig Mirko-von-Leipzig added no changelog This PR does not require an entry in the `CHANGELOG.md` file block-producer Related to the block producer component labels Sep 18, 2025
@Mirko-von-Leipzig Mirko-von-Leipzig marked this pull request as ready for review September 18, 2025 16:02
Copy link
Contributor

@PhilippGackstatter PhilippGackstatter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

I took a first look, but will have to take another one later to wrap my head around this.

Copy link
Contributor

@PhilippGackstatter PhilippGackstatter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

The overall structure makes sense to me and the code is well written, imo. I think I mainly left minor comments and questions. I'll have to take another look to think through some other parts, given that this has a large surface area.

Copy link
Contributor

@bobbinth bobbinth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Thank you for addressing all the previous comments. Still not a full review, but I left a few more questions/comments inline.

Also, is there anything else you are planning to add to this PR? Or should we merge it once the reviews are done, and the follow-up work is being done in another PR?

@Mirko-von-Leipzig
Copy link
Collaborator Author

Mirko-von-Leipzig commented Sep 25, 2025

Also, is there anything else you are planning to add to this PR? Or should we merge it once the reviews are done, and the follow-up work is being done in another PR?

I'm happy to merge this as is; outstanding work is

  • user batch support
  • re-adding the telemetry stuff
  • figuring out mempool events
  • tests

The main fixes would come from whatever issues the tests pick up; but those should be mostly minor.

Copy link
Contributor

@bobbinth bobbinth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Thank you! I left a couple more questions inline. Overall, I didn't review everything too deeply - so, another set of eyes (or two) looking through this would be appreciated.

We can merge this PR into a new branch (e.g., mempool-refactoring) and then merge subsequent PRs into that branch until it is ready to be merged into next.

As discussed offline, I think the next steps could be:

  1. Re-enabled mempool events.
  2. Re-add telemetry.
  3. Implement tests - at this point we should be able to merge mempool-refactoring into next.
  4. Add user batch support (which may require some refactoring of mempool events).

Also, not for this PR, but it would be cool to add some mempool related metrics to the BlockProducerStatus protobuf message. For example:

  • Number of unbatched transactions.
  • Number of batches.
  • Maybe total number of transactions (batched or unbatched).

Let's create an issue for this (unless we already have one).

@Mirko-von-Leipzig Mirko-von-Leipzig changed the base branch from next to refactor/mempool September 26, 2025 10:42
@Mirko-von-Leipzig
Copy link
Collaborator Author

Looks good! Thank you! I left a couple more questions inline. Overall, I didn't review everything too deeply - so, another set of eyes (or two) looking through this would be appreciated.

@igamigo @drahnr @PhilippGackstatter I would appreciate another look if you have the time. I'll leave this PR open until early next week but since its merging into a feature branch reviews after the fact can still be addressed.

Also, not for this PR, but it would be cool to add some mempool related metrics to the BlockProducerStatus protobuf message. <...>

Created issue #1244

Copy link
Contributor

@PhilippGackstatter PhilippGackstatter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!

I tried to trace through some cases I thought were interesting and I didn't find any issues, but I also definitely didn't think through all of them. I think a test suite is necessary to cover everything here. In any case, the general abstractions over txs, batches and blocks looks great and I found the structure reasonably (given the complexity involved) easy to follow.

Copy link
Collaborator

@igamigo igamigo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not the most in-depth review, but overall looks solid and fairly easy to follow; might take a deeper look later. Left a comment on what might be a bug.

@Mirko-von-Leipzig Mirko-von-Leipzig merged commit 3560573 into refactor/mempool Sep 28, 2025
15 checks passed
@Mirko-von-Leipzig Mirko-von-Leipzig deleted the mirko/mempool/state-dag branch September 28, 2025 09:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

block-producer Related to the block producer component no changelog This PR does not require an entry in the `CHANGELOG.md` file

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Investigate graph decoupling Improve tx reverting strategy [block-producer]: improve InflightState error handling

6 participants