Wallet TX Count and time indexing #888

Open
wants to merge 30 commits into
base: master
from

Conversation

nodech
Contributor

@nodech nodech commented Mar 4, 2024

This is a port of the bcoin-org/bcoin#605. It includes fixes mentioned in the PR.

This resolves CPU and memory exhaustion issues when requesting transaction history for a wallet with many transactions. Transactions can now also be queried by time, using the median-time-past (MTP) derived from chain data.

Backport:

Issues:

Related:

Tasks:

  • WDB Migration from v3 to v4.

Minor changes

  • Adds get median time to the node http and hs-client node.
  • Adds get entries to the node http and hs-client node.

Wallet changes

  • Add getMedianTime - to get median time past for the height, where height is already stored in the db.
  • Add getMedianTimeTip - to get median time past for the yet to be committed block.
  • Add block time by height cache.
  • Wallet now has additional methods for querying history:
    • listUnconfirmed(acc, { limit, reverse }) - Get first or last limit unconfirmed transactions.
    • listUnconfirmedAfter(acc, { hash, limit, reverse }) - Get first or last limit unconfirmed transactions after/before tx with hash: hash.
    • listUnconfirmedFrom(acc, { hash, limit, reverse }) - Get first or last limit unconfirmed transactions after/before tx with hash hash, inclusive.
    • listUnconfirmedByTime(acc, { time, limit, reverse }) - Get first or last limit unconfirmed transactions after/before time, inclusive.
    • listHistory(acc, { limit, reverse }) - Get first or last limit unconfirmed/confirmed transactions.
    • listHistoryAfter(acc, { hash, limit, reverse }) - Get first or last limit unconfirmed/confirmed transactions after/before tx with hash hash.
    • listHistoryFrom(acc, { hash, limit, reverse }) - Get first or last limit confirmed/unconfirmed transactions after/before tx with hash hash, inclusive.
    • listHistoryByTime(acc, { time, limit, reverse }) - Get first or last limit confirmed/unconfirmed transactions after/before time, inclusive.
    • NOTE: Default is ascending order, from the oldest.
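The difference between the After and From variants (exclusive vs. inclusive of the anchor tx) and the effect of reverse can be illustrated with a small in-memory model. This is only a sketch: the `history` array and the `listFrom` helper are hypothetical stand-ins for the sorted count index, not the wallet implementation.

```javascript
'use strict';

// Hypothetical stand-in for the count index: hashes sorted oldest first.
const history = ['tx1', 'tx2', 'tx3', 'tx4', 'tx5'];

/**
 * Return up to `limit` entries relative to `hash`.
 * inclusive=false mimics the *After methods, inclusive=true the *From methods.
 */
function listFrom(sorted, {hash, limit, reverse = false, inclusive = false}) {
  const pos = sorted.indexOf(hash);

  if (pos === -1)
    throw new Error('Unknown tx hash.');

  if (reverse) {
    // Walk towards the oldest entry, newest result first.
    const end = inclusive ? pos + 1 : pos;
    return sorted.slice(0, end).reverse().slice(0, limit);
  }

  const start = inclusive ? pos : pos + 1;
  return sorted.slice(start, start + limit);
}

console.log(listFrom(history, {hash: 'tx2', limit: 2}));
console.log(listFrom(history, {hash: 'tx2', limit: 2, inclusive: true}));
console.log(listFrom(history, {hash: 'tx4', limit: 2, reverse: true}));
```

In this model the anchor itself only appears in the result when `inclusive` is set, which is the behaviour the From methods describe.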

Median time past is used by TX Pagination for time indexes. See details below.

Wallet HTTP

  • GET /wallet/:id/tx/history - The params are now time, after, limit, and reverse.
  • GET /wallet/:id/tx/unconfirmed - The params are the same as above.

Deprecated and removed:

  • GET /wallet/:id/tx/range - Instead use the time param for the history and unconfirmed endpoints.
  • GET /wallet/:id/tx/last - Instead use reverse param for the history and unconfirmed endpoints.

Wallet HTTP Client

  • getHistory and Wallet.getHistory no longer accept account; instead they accept an object with properties: account, time, after, limit, and reverse.
  • getPending and Wallet.getPending have the same changes as getHistory above.

Deprecate and remove:

  • getLast and Wallet.getLast, see Wallet HTTP note.
  • getRange and Wallet.getRange, see Wallet HTTP note.
Examples
GET /wallet/:id/tx/history?after=<txid>&limit=50&reverse=false
GET /wallet/:id/tx/history?after=<txid>&limit=50&reverse=true

By using after=<txid> we can anchor pages so that results will not shift
when new blocks and transactions arrive. With reverse=true the transactions
are returned in the opposite order, from latest to genesis. The
limit=<number> specifies the maximum number of transactions to return
in the result.

GET /wallet/:id/tx/history?time=<median-time-past>&limit=50&reverse=false
GET /wallet/:id/tx/history?time=<median-time-past>&limit=50&reverse=true

The param time is in epoch seconds and indexed based on median-time-past
(MTP) and date is ISO 8601 format. Because multiple transactions can share
the same time, this can function as an initial query, and then switch to the
above after format for the following pages.
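A client following this pattern would issue the first request with time and every later request with after set to the last txid of the previous page. Below is a minimal sketch of building those request paths. The `historyPath` helper, the wallet id `primary`, and the example values are hypothetical; only the endpoint and parameter names come from the description above.

```javascript
'use strict';

/**
 * Build a request path for the history endpoint.
 * Only parameters that are set end up in the query string.
 */
function historyPath(walletId, {time, after, limit, reverse} = {}) {
  const query = new URLSearchParams();

  if (time != null)
    query.set('time', String(time));

  if (after != null)
    query.set('after', after);

  if (limit != null)
    query.set('limit', String(limit));

  if (reverse != null)
    query.set('reverse', String(reverse));

  const qs = query.toString();
  const base = `/wallet/${encodeURIComponent(walletId)}/tx/history`;

  return qs ? `${base}?${qs}` : base;
}

// First page: anchor by time (epoch seconds, placeholder value).
const first = historyPath('primary', {time: 1709510400, limit: 50});

// Following pages: anchor by the last txid of the previous page (placeholder).
const next = historyPath('primary', {after: 'abcd', limit: 50});

console.log(first);
console.log(next);
```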

GET /wallet/:id/tx/unconfirmed?after=<txid>&limit=50&reverse=false
GET /wallet/:id/tx/unconfirmed?after=<txid>&limit=50&reverse=true
GET /wallet/:id/tx/unconfirmed?time=<time-received>&limit=50&reverse=false

The same will apply to unconfirmed transactions. The time is in epoch
seconds and indexed based on when the transaction was added to the wallet.

Wallet RPC

The following new methods have been added:

  • listhistory - List history with a limit and in reverse order.
  • listhistoryafter - List history after a txid (subsequent pages).
  • listhistorybytime - List history by giving a timestamp in epoch seconds (block median time past).
  • listunconfirmed - List unconfirmed transactions with a limit and in reverse order.
  • listunconfirmedafter - List unconfirmed transactions after a txid (subsequent pages).
  • listunconfirmedbytime - List unconfirmed transactions by time they were added.

The following methods have been deprecated:

  • listtransactions - Use listhistory and the related methods and the after argument for results that do not shift when new blocks arrive.

Wallet CLI (hsw-cli)

  • history now accepts new args on top of --account: --reverse, --limit, --after, --time.
  • pending now accepts new args, same as above.

TXDB Change

  • layout.h - will now also store time for the block, instead of just block hash. This allows us to calculate median time past on the wallet side.
  • New entries:
    • layout.I - Latest unconfirmed index. Makes sure every transaction gets assigned a new index.
    • z[height][index] -> tx hash (tx by count) - index/query transactions by height and txindex (in a received array). TXIndex for unconfirmed transactions is an ever-increasing entry in layout.I.
    • Z[account][height][index] -> tx hash (tx by count + account) - same as above, just indexed separately for each account.
    • y[hash] -> count (count for tx) - Look up the count for a tx using its hash.
    • x[hash] -> undo count (unconfirmed count for tx) - Stores the unconfirmed index for confirmed transactions in case they get unconfirmed. (Unconfirmed transactions recover their old index and time.)
    • g[time][height][index][hash] -> dummy (tx by time) - Query confirmed transactions by time, ensures they replicate the same sorting as the count index.
    • G[account][time][height][index][hash] -> dummy (tx by time + account) - Same as above, just indexed separately for each account.
    • w[time][count][hash] -> dummy (tx by time) - Stores unconfirmed transactions by time.
    • W[account][time][count][hash] -> dummy (tx by time + account) - Same as above, just indexed separately for each account.
    • e[hash] -> undo time (unconfirmed time for tx) - Time when transaction was first seen, unconfirmed/confirmed. Confirmed transactions will recover time index using this.
  • Removed entries:
    • layout.m and layout.M are no longer used.

TX Pagination

Pagination introduces sorted indexes in the database that are consistently sorted throughout the tx history, whether it's confirmed or unconfirmed. It uses Count indexes to achieve the sorted behaviour. Note that this is not strictly based on tx creation. Instead, confirmed transactions use height + index in the block. Even if one tx was created before another, they will be sorted based on their height and position in the block.
The time index for confirmed transactions is based on the "median time past". It ensures that time increases monotonically with each block. Time index entries give us a pointer into the Count index.
There's also the "next" page, which gives us the next results based on the last hash of the previous response. This is another pointer into the Count index, indexed separately from time.
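To make the "median time past" idea concrete, here is a minimal sketch of the calculation. It assumes an 11-block window, as used by Bitcoin-derived chains; the `medianTimePast` helper and the sample timestamps are illustrative and are not taken from the wallet code.

```javascript
'use strict';

// Assumption: the median is taken over the last 11 blocks.
const MEDIAN_TIMESPAN = 11;

/**
 * times: block timestamps ordered by height (index 0 = height 0).
 * Returns the median of the last (up to) 11 timestamps ending at `height`.
 */
function medianTimePast(times, height) {
  const start = Math.max(0, height - MEDIAN_TIMESPAN + 1);
  const recent = times.slice(start, height + 1).sort((a, b) => a - b);

  return recent[recent.length >>> 1];
}

// Raw block times may go backwards (height 1 is older than height 0),
// but the median over the window does not.
const times = [100, 90, 120, 110, 130];
const mtps = times.map((_, height) => medianTimePast(times, height));

console.log(mtps);
```

Because the median ignores outliers inside the window, a single block with an out-of-order timestamp does not reorder the time index.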

Count index

This can look like this, if we have 3 transactions in block 200100:

  • z[200100][0] - tx1
  • z[200100][1] - tx2
  • z[200100][2] - tx3

This means our wallet has 3 transactions in block 200100 and their relative indexes are 0, 1, 2. If there is a reorg and another transaction gets introduced between tx2 and tx3 in the new block, all old entries will get removed and the new state will look like this:

  • z[200100][0] - tx1
  • z[200100][1] - tx2
  • z[200100][2] - new tx3
  • z[200100][3] - old tx3

If you factor in block heights for our transactions, you can see a sorted list emerge across different blocks:

  • z[200100][0] - tx1
  • z[200100][1] - tx2
  • z[200101][0] - tx3 (new block)
  • z[200101][1] - tx4 (new block)

LevelDB sorts entries by key, so we can query them with ranges: results come back sorted, and reverse and limit are supported as well.
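The reason a range scan yields the right order is that fixed-width big-endian integers compare bytewise in the same order as their numeric values. The sketch below emulates this with plain Buffers. The `countKey` encoder and the `range` helper are hypothetical and only approximate how a key-sorted store behaves; they are not the bdb API.

```javascript
'use strict';

// Hypothetical encoder for z[height][index]:
// one prefix byte followed by two big-endian uint32 values.
function countKey(height, index) {
  const key = Buffer.alloc(9);
  key.write('z', 0, 'ascii');
  key.writeUInt32BE(height, 1);
  key.writeUInt32BE(index, 5);
  return key;
}

// Toy store: entries kept in arbitrary insertion order.
const entries = [
  [countKey(200101, 1), 'tx4'],
  [countKey(200100, 1), 'tx2'],
  [countKey(200101, 0), 'tx3'],
  [countKey(200100, 0), 'tx1']
];

/** Emulate a sorted range scan with gte/lte bounds, limit and reverse. */
function range(store, {gte, lte, limit = Infinity, reverse = false}) {
  const hits = store
    .filter(([k]) => Buffer.compare(k, gte) >= 0 && Buffer.compare(k, lte) <= 0)
    .sort(([a], [b]) => Buffer.compare(a, b));

  if (reverse)
    hits.reverse();

  return hits.slice(0, limit).map(([, value]) => value);
}

const all = {gte: countKey(0, 0), lte: countKey(0xffffffff, 0xffffffff)};

console.log(range(entries, all));
console.log(range(entries, {...all, limit: 2, reverse: true}));
```

Restricting the bounds to a single height, for example from `countKey(200100, 0)` to `countKey(200100, 0xffffffff)`, returns only the transactions of that block.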

Unconfirmed transactions follow the same scheme, but their block height is set to the maximum possible value for the height: 0xffffffff. On top of that, there is a counter for the number of unconfirmed transactions that have occurred, and it is used to index new unconfirmed transactions. This count never decreases, which allows previous indexes to be filled again if confirmed transactions get disconnected. Confirmed transactions store their previous value, so they can be properly recovered in the count index history.
This can look like:

  • z[0xffffffff][0] - first ever unconfirmed tx.
  • z[0xffffffff][..] - ...
  • z[0xffffffff][10000] - 10kth tx.

Once they are confirmed, they move to the proper block height count index. Disconnected transactions will fill the gaps.
layout.Z is used for account based count index.
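The interplay between the ever-increasing unconfirmed counter and the undo count can be modelled in a few lines. This is a toy model under stated assumptions: the `CountIndex` class is hypothetical, its fields only loosely mirror layout.I, layout.y and layout.x, and it assumes that a tx first seen inside a block still reserves an unconfirmed slot.

```javascript
'use strict';

const UNCONFIRMED_HEIGHT = 0xffffffff;

// Toy model of the count index (not the TXDB implementation).
class CountIndex {
  constructor() {
    this.latestUnconfirmed = 0; // plays the role of layout.I
    this.count = new Map();     // hash -> {height, index}, like layout.y
    this.undo = new Map();      // hash -> unconfirmed index, like layout.x
  }

  addUnconfirmed(hash) {
    const index = this.latestUnconfirmed++;
    this.count.set(hash, {height: UNCONFIRMED_HEIGHT, index});
  }

  // Assumes `hash` is currently unconfirmed or has never been seen.
  confirm(hash, height, index) {
    const prev = this.count.get(hash);
    // Assumption: a tx first seen in a block still reserves a slot.
    const slot = prev ? prev.index : this.latestUnconfirmed++;
    this.undo.set(hash, slot);
    this.count.set(hash, {height, index});
  }

  // Block disconnected: the tx returns to its old unconfirmed slot.
  disconnect(hash) {
    const index = this.undo.get(hash);
    this.undo.delete(hash);
    this.count.set(hash, {height: UNCONFIRMED_HEIGHT, index});
  }

  sorted() {
    return [...this.count.entries()]
      .sort(([, a], [, b]) => a.height - b.height || a.index - b.index)
      .map(([hash]) => hash);
  }
}

const idx = new CountIndex();
idx.addUnconfirmed('a');
idx.addUnconfirmed('b');
idx.addUnconfirmed('c');

idx.confirm('b', 200100, 0);
const afterConfirm = idx.sorted();

idx.disconnect('b');
const afterDisconnect = idx.sorted();

console.log(afterConfirm, afterDisconnect);
```

After the disconnect, 'b' fills the gap it left between 'a' and 'c', while the counter keeps growing for any new unconfirmed tx.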

@coveralls

coveralls commented Mar 4, 2024

Coverage Status

coverage: 71.256% (+1.2%) from 70.033%
when pulling 904756e on nodech:wallet-pagination
into 5294be7 on handshake-org:master.

@nodech nodech mentioned this pull request Mar 18, 2024
@nodech nodech added advanced review difficulty - advanced wallet-db part of the codebase breaking-major Backwards incompatible - Release version labels Mar 18, 2024
nodech added a commit to nodech/hsd that referenced this pull request Mar 20, 2024
HSD: handshake-org#888
BCOIN: bcoin-org/bcoin#605

Co-authored-by: Braydon Fuller <courier@braydon.com>
@nodech nodech force-pushed the wallet-pagination branch 4 times, most recently from 030aa42 to 0fda018 Compare April 1, 2024 16:16
@nodech nodech marked this pull request as ready for review April 1, 2024 17:51
@nodech
Contributor Author

nodech commented Apr 11, 2024

TODO

@nodech nodech force-pushed the wallet-pagination branch 2 times, most recently from 26e2f02 to d4ac794 Compare April 26, 2024 10:34
@nodech nodech requested a review from rithvikvibhu May 10, 2024 08:50
@nodech nodech added this to the hsd 7.0.0 milestone Jun 5, 2024
@nodech nodech marked this pull request as draft July 1, 2024 15:01
nodech and others added 10 commits August 29, 2024 13:30
This requires full wdb block entry wipe and rescan. That is handled by
PR handshake-org#889. `layout.h` is looked up by height, so only missing data was
time. Now we can implement walletdb only median time past calculation.
HSD: handshake-org#888
BCOIN: bcoin-org/bcoin#605

Co-authored-by: Braydon Fuller <courier@braydon.com>
Co-authored-by: Braydon Fuller <courier@braydon.com>
test: Add tests for the wallet.zap.
pending.

hsw-cli:
  - hsw-cli: `history` now accepts new args on top of `--account`: `--reverse`,
    `--limit`, `--after`, `--time`.
  - hsw-cli: `pending` now accepts new args, same as above.

wallet-http:
  - Deprecate and remove: `GET /wallet/:id/tx/range`
  - Deprecate and remove: `GET /wallet/:id/tx/last`
wallet-rpc: The following new methods have been added:
  - `listhistory` - List history with a limit and in reverse order.
  - `listhistoryafter` - List history after a txid _(subsequent pages)_.
  - `listhistorybytime` - List history by giving a timestamp in epoch seconds
    _(block median time past)_.
  - `listunconfirmed` - List unconfirmed transactions with a limit and in
    reverse order.
  - `listunconfirmedafter` - List unconfirmed transactions after a txid
    _(subsequent pages)_.
  - `listunconfirmedbytime` - List unconfirmed transactions by time they
    were added.

wallet-rpc: The following methods have been deprecated:
  - `listtransactions` - Use `listhistory` and the related methods and the
    `after` argument for results that do not shift when new blocks arrive.

wallet: Remove getHistory and related methods from wallet and txdb.
test: fix NodeContext usage and update tests.
@nodech
Contributor Author

nodech commented Aug 29, 2024

Rebased on current master (5294be7)

@nodech nodech marked this pull request as ready for review September 22, 2024 12:25
@nodech
Contributor Author

nodech commented Nov 7, 2024

BDB v1.6 supports multi-byte keys (instead of defining them as prefixes). This
allows us to group related keys together without going through prefix defining
trouble.

I think it will be appropriate to group all time and count related indexes
together in TXDB layout (File: lib/wallet/layout.js).
Available prefixes to use for all these are the following:

  D F G I J K L M N O Q S V W X Y Z
  a e f g j k l m n q u w x y z

If we decide to use O (o is taken by names) for example, this is what the
final result will look like:

/**
 * TXDB Database Layout:
 *   ...
 *
 *   Time and Count Index
 *   --------------------
 *   Ol -> Latest Unconfirmed Index
 *   Oc[hash] -> count (Count for tx)
 *   Ou[hash] -> undo count (Unconfirmed count for tx)
 *   Ox[height][index] -> tx hash (tX by count)
 *   OX[account][height][index] -> tx hash (tX by count + account)
 *
 *   Time and Count Index Confirmed
 *   ------------------------------
 *   Oi[time][height][index][hash] -> dummy (tx by tIme)
 *   OI[account][time][height][index][hash] -> dummy (tx by tIme + account)
 *
 *   Time and Count Index Unconfirmed
 *   ------------------------------
 *   Om[time][count][hash] -> dummy (tx by tiMe)
 *   OM[account][time][count][hash] -> dummy (tx by tiMe + account)
 *   Oe[hash] -> undo time (unconfirmed timE for tx)
 *
 *   ...
 */
exports.txdb = {
  prefix: bdb.key('t', ['uint32']),
  // ...

  // Count and Time Index
  Ol: bdb.key('Ol'),
  Oc: bdb.key('Oc', ['hash256']),
  Ou: bdb.key('Ou', ['hash256']),
  Ox: bdb.key('Ox', ['uint32', 'uint32']),
  OX: bdb.key('OX', ['uint32', 'uint32', 'uint32']),

  // Count and Time Index Confirmed
  Oi: bdb.key('Oi', ['uint32', 'uint32', 'uint32', 'hash256']),
  OI: bdb.key('OI', ['uint32', 'uint32', 'uint32', 'uint32', 'hash256']),

  // Count and Time Index Unconfirmed
  Om: bdb.key('Om', ['uint32', 'uint32', 'hash256']),
  OM: bdb.key('OM', ['uint32', 'uint32', 'uint32', 'hash256']),
  Oe: bdb.key('Oe', ['hash256']),
  // ...
};

Using 2-byte prefixes doesn't seem excessive.

Suggestions for the prefix are welcome! (O - from cOunt, e from timE) ...

Development

Successfully merging this pull request may close these issues.

[WIP] Pagination for HTTP endpoints
2 participants