Time spent: 3 evenings (about 6 hours)
kresolver runs as a one-shot CLI, and internally implements an API for resolving ledgers. The implementation is
in-memory, collect-free, clone-free, and streaming I/O on both the read and write steps. The top-level of the API is
the Book
, which maintains state of accounts and has a fixed-size LRU cache for references to past transactions.
Regarding this LRU cache, a decision was made w.r.t. balancing (1) transaction amount accuracy, (2) keeping a perfect record of past transactions, and (3) performance -- kresolver assumes that progress on any disputed transaction will be made before 1Mn other records are processed. Otherwise, dispute-related records may be skipped as invalid. This design optimises for performance and accuracy of transaction amounts.
In hindsight, using big-decimals for transaction amounts is overkill - the all-time total volume of all ethereum transactions could fit into one such big-decimal number. On the other hand, using a smaller representation would not solve the problem of dealing with billions of transaction IDs in-memory.
Note for clarity: nothing in this readme section is implemented.
Further to the current implementation discussed above, the design could be extended in a number of ways:
- The cache miss behavior could trigger a read from disk as backup storage for the transaction cache. sqlite would be a good choice for this.
- (easier) One could just use
swap
and increase the cache size to 5Bn, which would negatively impact performance but easily fit all possible 32bit transactions onto a single disk (~70GB). - (faster) A creative use of encoding could compress the transaction cache to hold it in physical memory. For example, you could periodically shift out blocks of 1000 transactions from the cache into a secondary memory storage where each block is brotli-compressed.
Further, if we relax the specification a little by only guaranteeing in-order processing of deposits and withdrawals, we could instead split the dispute resolution and transaction storage layer into separate services, and then process dispute-related records asynchronously. This would be more elegant, faster in mainline usage, and decompose into a more maintainable architecture -- but the trade-off would be to break correctness (per the specification) due to the out-of-order processing of dispute resolution steps.
- I use unit tests to check the main use-cases.
- I use the type system to balance correctness, performance, error-handling, and API elegance rather than optimising
just one of these.
- e.g.: I prefer to exit if there's a malformed CSV record rather than risk parsing the CSV incorrectly.
- e.g.: the
Book
API exposes awrite<T: Write>(write: T)
instead of acollect()
orto_vec()
method. The minor inconvenience of managing a mut buffer in test cases affords better performance and flexibility for writing large books. - e.g.: there's an assert to ensure that the account specified in dispute-related records matches the account on the disputed transaction -- the application will panic if you try to dispute a transaction on behalf of an innocent account.
- The main shortcoming w.r.t. correctness is inability to handle dispute resolution for very non-recent transactions. I explain this further above.
If you wanted to run this as a long-lived service with active TCP connections, some new challenges/goals are surfaced:
- It's no longer desirable to panic, barring extreme circumstances - instead, it would be better to fail individual ledger resolutions and "400" the problematic requests.
- If it's desireable to merge separate ledgers and account states, you'd need to internally queue requests or block concurrent writes on a shared storage layer. You'd also want to carefully ensure that transactionality is achieved if it's possible to panic or fail a ledger resolution.
- You'd want observability -- starting with metrics and event logs.
- You'd want to QoS and reconnect clients as needed.
On my WSL VM with SSD, kresolver processes about 500k records per second.
$ hyperfine './target/release/kresolver tests/350k_transactions.csv'
Benchmark 1: ./target/release/kresolver tests/350k_transactions.csv
Time (mean ± σ): 689.6 ms ± 24.0 ms [User: 308.0 ms, System: 50.4 ms]
Range (min … max): 657.5 ms … 727.7 ms 10 runs