Skip to content

Easy parallelizability #648

@vbuterin

Description

@vbuterin

Parameters

  • MAX_THREADS: 8

Specification

Introduce a new type of transaction (following same general format as #232):

[2, network_id, startgas, to, data, ranges]

Where ranges is an RLP list, where each value in the list is an address prefix, representing the range of all addresses with that prefix. That is, for example:

  • \x01 refers to the set of addresses from 0x0100000000000000000000000000000000000000 to 0x01ffffffffffffffffffffffffffffffffffffff
  • \x02\x48\xab refers to the set of addresses from 0x0248ab0000000000000000000000000000000000 to 0x0248abffffffffffffffffffffffffffffffffff
  • 'l#\xfa\xce\x01O \xb3\xeb\xb6Z\xe9m\r\x7f\xf3*\xb9L\x17' refers to the one-element set containing only the address 0x6c23face014f20b3ebb65ae96d0d7ff32ab94c17
  • The empty string refers to the set of all addresses

Prefixes longer than 20 bytes are illegal (ie. transactions containing such prefixes are invalid). We call the "range of a transaction" the union of all ranges specified in that transaction if the transaction is of this new type, and the set of all addresses otherwise. For example, if a transaction's ranges field is of the form [\x01, \x02\x03, \xff\x8c\x45], its range is the set of all addresses that start with either 0x01, or 0x0203, or 0xff8c45.

We keep track of a starting_gas_height and finishing_gas_height for each transaction. We define the starting_gas_height of a transaction T to be the maximum of all finishing_gas_heights of all transactions T' before T such that either (i) T' is more than MAX_THREADS behind T (ie. txindex_of(T) - txindex_of(T') > MAX_THREADS) or (ii) the ranges of T and T' intersect. We define the finishing_gas_height of T to be the starting_gas_height of T plus the amount of gas consumed while executing T.

The current rule that a transaction is invalid if its start_gas plus the current total_gas_used exceeds the block gas_limit is removed, and replaced with a rule that a transaction T is invalid if: T.starting_gas_height + T.start_gas > gas_limit. Notice that in the case where all transactions use the entire address space as their range, this is exactly equivalent to the current status quo, but in the case where transactions use disjoint ranges, this increases capacity.

Any CALL, CREATE, CALLCODE, DELEGATECALL or STATICCALL that attempts to access an account outside a transaction's range fails; any EXTCODESIZE, EXTCODECOPY, BALANCE or SELFDESTRUCT that attempts to access an account outside a transaction's range immediately throws an exception. Addresses 0...127 have one exception: any account can STATICCALL them. We add a binding norm on protocol development that no address may be introduced into that range which has logic such that it may return a differing value depending on transactions that take place within a block (though changes between blocks are acceptable).

Rationale

This allows transactions in the EVM to be processed in parallel much more easily, by specifying statically what addresses they can access; it also appropriately incentivizes making transactions that are easy to parallelize. It does this in a maximally backwards-compatible and non-intrusive way where old-style transactions can continue to work.

The exception for addresses 0...127 is introduced to allow access to precompiles and to system contracts like BLOCKHASH.

Note that this EIP does change the role of the block gas limit from being a limit on total gas consumed to being a limit on gas height. This means that the theoretical max capacity of the blockchain would increase by a factor of NUM_THREADS, though if we want to avoid increasing uncle rates it does require the assumption that miners are running on hardware that has NUM_THREADS CPU cores. For clarity, it may be a good idea to rename the block gas_limit to gas_height_limit.

Client implementation

A client can follow the following algorithm:

  1. Let TXINDEX = 0
  2. Initialize txgrab as an empty list.
  3. WHILE (i) the transaction in the block at index TXINDEX does not intersect with any transaction in txgrab, (ii) len(txgrab) < NUM_THREADS and (iii) TXINDEX < len(block.transactions), add the transaction in the block at index TXINDEX to txgrab and set TXINDEX += 1.
  4. Execute all transactions in txgrab in parallel.
  5. If TXINDEX == len(block.transactions), exit. Otherwise, go back to step 2.

Specification, v2

Modify the new transaction type to be as follows:

[2, network_id, startgas, to, data, read_ranges, write_ranges]

The read range of a transaction is the union of all read_ranges, and the write range of a transaction is the union of all write_ranges. Replace "the ranges of T and T' intersect" above with "the read or write range of T intersects with the write range of T' " (where T is the current transaction and T' is a previous transaction). Read-only operations are allowed to access the read or write ranges, write-only operations are allowed to access the write range only.

This adds a further optimization where if two transactions' read ranges intersect at data that is not written to, then they can still be processed in parallel.

The exception for addresses 0...127 can be removed, as any contracts that wish to use the precompiles can simply include the addresses in their read range, and even if all transactions include those addresses in their read range parallelizability will not be significantly impacted as almost no one wants to write to those addresses.

Proposed amendment 1 (thanks @Arachnid for suggestion)

Instead of each range being an RLP list, it is simply a byte array, where each byte represents a byte prefix (eg. the byte array \x03\x35\xfe means "the set of all addresses starting with 0x03, 0x35 or 0xfe"). An empty byte array represents the set of all addresses. This simplifies implementation at some cost to granularity, though the value of that level of granularity is arguably low.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions