BookKeeper Table Service is a contrib module for BookKeeper. It provides a table (aka key/value) service as part of BookKeeper's stream storage.
$ mvn clean install -DskipTests
$ bin/streamstorage standalone
CLI is available at bin/streamstorage-cli
Stream is the management unit. A Table is a materialized view of a Stream.
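To make the Stream/Table relationship concrete, here is a minimal Python sketch that replays a stream of updates into a table. It is an in-memory illustration only, not the table service's implementation: the `Update` record and the `materialize` function are hypothetical names, and treating a missing key as 0 on increment is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Union

Value = Union[bytes, int]


@dataclass(frozen=True)
class Update:
    """One record in the stream (hypothetical shape, for illustration only)."""
    op: str                        # "put", "delete", or "incr"
    key: str
    value: Optional[bytes] = None  # payload for "put"
    amount: int = 0                # delta for "incr"


def materialize(stream: Iterable[Update]) -> Dict[str, Value]:
    """Replay the stream in order and return the resulting table."""
    table: Dict[str, Value] = {}
    for update in stream:
        if update.op == "put":
            table[update.key] = update.value
        elif update.op == "delete":
            table.pop(update.key, None)
        elif update.op == "incr":
            # Assumption: a missing key counts from 0.
            current = table.get(update.key, 0)
            if not isinstance(current, int):
                raise TypeError(f"cannot increment non-counter key: {update.key}")
            table[update.key] = current + update.amount
        else:
            raise ValueError(f"unknown op: {update.op}")
    return table


stream = [
    Update("put", "test-key", value=b"v1"),
    Update("incr", "counter-1", amount=100),
    Update("incr", "counter-1", amount=100),
    Update("put", "test-key", value=b"v2"),
    Update("delete", "test-key"),
]
table = materialize(stream)
print(table)  # {'counter-1': 200}
```

Because the table is derived purely from the ordered stream, replaying the same stream always yields the same table.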
bin/streamstorage-cli -s 127.0.0.1:4181 stream create --stream test_dist_counter
bin/streamstorage-cli -s 127.0.0.1:4181 table get -t test_dist_counter -k "counter-1" --watch
bin/streamstorage-cli -s 127.0.0.1:4181 table incr -t test_dist_counter -k "counter-1" -a 100
Use the table service as a normal k/v store for storing metadata
bin/streamstorage-cli -s 127.0.0.1:4181 stream create --stream test_kv_store
bin/streamstorage-cli -s 127.0.0.1:4181 table get -t test_kv_store -k "test-key" --watch
bin/streamstorage-cli -s 127.0.0.1:4181 table put -t test_kv_store -k "test-key" -v "test-value-`date`"
bin/streamstorage-cli -s 127.0.0.1:4181 table get -t test_kv_store -k "test-counter-key" --watch
bin/streamstorage-cli -s 127.0.0.1:4181 table incr -t test_kv_store -k "test-counter-key" -a 200
- API Model: support PTable & Table
  - PTable: short for partitioned table. The table is modeled as <pKey, lKey> -> value, and kv pairs are partitioned based on pKey. Range operations over a single pKey are supported.
    - put/get/delete on a single <pKey, lKey>
    - range/deleteRange on a single <pKey>
    - txn on a single <pKey>
    - increment on a single <pKey, lKey>
  - Table: the table is modeled as <key> -> value. KV pairs are also partitioned, based on key. Range operations are not supported. Single-key txn (aka CAS operation) is supported.
- Persistence
  - The source of truth of a table is its journal. The journal is a stream composed of multiple log segments (aka ledgers).
  - RocksDB serves as the materialized index for each range partition. The RocksDB instance can be ephemeral, because it can be restored from checkpoints that are also persisted as log streams.
  - RocksDB can spill in-memory data to disk. Additionally, the RocksDB files are periodically and incrementally checkpointed to BookKeeper for fast recovery.
- Clients
  - gRPC & protobuf based
  - Thick client with client-side request routing
  - Java implementation
  - Thin client with server-side request routing
  - Multiple language clients: C++, Python, Go
- Deployment
  - Run the table service as a BookKeeper lifecycle component in the bookie server
- Stream <-> Table exchangeable (for metadata store)
  - Can retrieve a Stream of updates from a key, a range of keys, or a Table.
  - A Stream can be materialized and viewed as a Table.
- TTL or Lease: for supporting membership (for metadata store)
- Auto Scale: split and merge ranges. This should apply to both Stream and Table.
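
The PTable and Table models listed under API Model can be sketched in a few lines of Python. This is an in-memory illustration under stated assumptions, not the service's client API: the class and method names (`PTable`, `Table`, `delete_range`, `compare_and_set`) are hypothetical, range bounds are assumed to be half-open, and a missing key is assumed to count from 0 on increment. Transactions on a single pKey are omitted for brevity.

```python
from typing import Dict, List, Optional, Tuple


class PTable:
    """Toy model of a partitioned table: <pKey, lKey> -> value."""

    def __init__(self) -> None:
        # One dict per pKey: entries sharing a pKey are kept together,
        # so range operations stay within a single pKey.
        self._partitions: Dict[str, Dict[str, object]] = {}

    def put(self, pkey: str, lkey: str, value: object) -> None:
        self._partitions.setdefault(pkey, {})[lkey] = value

    def get(self, pkey: str, lkey: str) -> Optional[object]:
        return self._partitions.get(pkey, {}).get(lkey)

    def delete(self, pkey: str, lkey: str) -> None:
        self._partitions.get(pkey, {}).pop(lkey, None)

    def increment(self, pkey: str, lkey: str, amount: int) -> int:
        part = self._partitions.setdefault(pkey, {})
        # Assumption: a missing key counts from 0.
        part[lkey] = part.get(lkey, 0) + amount
        return part[lkey]

    def range(self, pkey: str, start: str, end: str) -> List[Tuple[str, object]]:
        # Assumption: bounds are half-open, [start, end).
        part = self._partitions.get(pkey, {})
        return [(k, part[k]) for k in sorted(part) if start <= k < end]

    def delete_range(self, pkey: str, start: str, end: str) -> int:
        part = self._partitions.get(pkey, {})
        doomed = [k for k in part if start <= k < end]
        for k in doomed:
            del part[k]
        return len(doomed)


class Table:
    """Toy model of a plain table: <key> -> value, with single-key CAS."""

    def __init__(self) -> None:
        self._data: Dict[str, object] = {}

    def put(self, key: str, value: object) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[object]:
        return self._data.get(key)

    def delete(self, key: str) -> None:
        self._data.pop(key, None)

    def compare_and_set(self, key: str, expected: object, new: object) -> bool:
        # Single-key txn: write only if the current value matches `expected`.
        if self._data.get(key) != expected:
            return False
        self._data[key] = new
        return True


ptable = PTable()
ptable.put("user-1", "a", b"1")
ptable.put("user-1", "b", b"2")
ptable.put("user-2", "a", b"3")
print(ptable.range("user-1", "a", "c"))           # [('a', b'1'), ('b', b'2')]
print(ptable.increment("user-1", "visits", 100))  # 100

table = Table()
table.put("test-key", b"v1")
print(table.compare_and_set("test-key", b"v1", b"v2"))  # True
print(table.compare_and_set("test-key", b"v1", b"v3"))  # False
```

Keeping one dict per pKey mirrors the idea that range operations stay within a single pKey, while the plain Table model exposes only point operations and a compare-and-set.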