Support user provided timestamps #4942

Closed
sagar0 wants to merge 1 commit into facebook:master from sagar0:timestamp

sagar0 (Contributor) commented Feb 2, 2019

A basic implementation to support user specified timestamps in RocksDB.
This allows fetching older versions of keys from RocksDB. RocksDB already provides MVCC support via snapshots and internal sequence numbers, but letting RocksDB users plug in their own timestamps goes one step further: it allows users to build their own MVCC-aware services on top of RocksDB.

See the tests added in db_basic_test.cc: DBBasicTest.Timestamp1 and DBBasicTest.Timestamp2. Here's a simplified version:

WriteOptions wo;
wo.timestamp = 100;
Put(wo, "k1", "v1");
// overwrite k1
wo.timestamp = 200;
Put(wo, "k1", "v2");

ReadOptions ro;
ro.timestamp = 0; // this is the default anyway
ASSERT_EQ("v2", Get(ro, "k1"));  // gets the latest version

// Request a version older than or equal to timestamp 150
ro.timestamp = 150;
ASSERT_EQ("v1", Get(ro, "k1"));
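
The read semantics in the example above can be modeled outside RocksDB. The following is a minimal in-memory sketch, not the code from this PR: `VersionedStore` and its methods are hypothetical names, and it assumes, as in the example, that a read timestamp of 0 means "latest version" and that any other read timestamp returns the newest version written at or before it.

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <map>
#include <optional>
#include <string>

// Toy in-memory model of timestamped reads (hypothetical; not RocksDB code).
class VersionedStore {
 public:
  // Record `value` as the version of `key` written at timestamp `ts`.
  void Put(const std::string& key, uint64_t ts, const std::string& value) {
    versions_[key][ts] = value;
  }

  // read_ts == 0 means "latest version" (assumed to mirror the default in
  // the example). Otherwise return the newest version whose timestamp is
  // <= read_ts, or std::nullopt if no such version exists.
  std::optional<std::string> Get(const std::string& key,
                                 uint64_t read_ts = 0) const {
    auto it = versions_.find(key);
    if (it == versions_.end() || it->second.empty()) return std::nullopt;
    const auto& by_ts = it->second;
    if (read_ts == 0) return by_ts.rbegin()->second;
    auto ub = by_ts.upper_bound(read_ts);  // first version with ts > read_ts
    if (ub == by_ts.begin()) return std::nullopt;
    return std::prev(ub)->second;
  }

 private:
  // key -> (timestamp -> value), timestamps kept in ascending order.
  std::map<std::string, std::map<uint64_t, std::string>> versions_;
};
```

With versions written at timestamps 100 and 200, a read at 150 resolves to the version written at 100, which matches the behavior the test snippet asserts.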

Sending this out so that we can start a discussion on the API. This is not ready to commit yet.

In the current implementation:

  • a new option timestamp is added to WriteOptions and ReadOptions.
  • the timestamp is of type uint64_t, but we could make it a user-pluggable type if a more generic design is desired.
  • the timestamp is added after the sequence number at the end of the internal key.
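
To make the last bullet concrete, here is a rough sketch of what such a key layout could look like. It assumes the usual internal-key trailer (sequence number and value type packed into 8 little-endian bytes) followed by an 8-byte timestamp; the function names and the exact byte layout are illustrative assumptions, not taken from the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Append `v` to `dst` as 8 little-endian bytes.
inline void PutFixed64LE(std::string* dst, uint64_t v) {
  for (int i = 0; i < 8; ++i) {
    dst->push_back(static_cast<char>((v >> (8 * i)) & 0xff));
  }
}

// Decode 8 little-endian bytes starting at `p`.
inline uint64_t DecodeFixed64LE(const char* p) {
  uint64_t v = 0;
  for (int i = 0; i < 8; ++i) {
    v |= static_cast<uint64_t>(static_cast<unsigned char>(p[i])) << (8 * i);
  }
  return v;
}

// Hypothetical layout: user_key | (seq << 8 | type) | timestamp
inline std::string MakeKeyWithTimestamp(const std::string& user_key,
                                        uint64_t seq, uint8_t type,
                                        uint64_t ts) {
  assert(seq < (1ull << 56));  // the sequence number must fit in 56 bits
  std::string out = user_key;
  PutFixed64LE(&out, (seq << 8) | type);
  PutFixed64LE(&out, ts);  // timestamp goes after the sequence number
  return out;
}

// The timestamp is the last 8 bytes of the encoded key.
inline uint64_t ExtractTimestamp(const std::string& ikey) {
  assert(ikey.size() >= 16);
  return DecodeFixed64LE(ikey.data() + ikey.size() - 8);
}

// The user key is everything before the two 8-byte trailers.
inline std::string ExtractUserKey(const std::string& ikey) {
  assert(ikey.size() >= 16);
  return ikey.substr(0, ikey.size() - 16);
}
```

One consequence of a layout like this is that every key grows by 8 bytes, which is part of why zeroing out or dropping the timestamp at the bottom-most level is worth considering.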

Still need to think about / implement:

  • The timestamp could be zeroed out similar to the sequence number at the bottom-most level, or based on some condition.
  • A user callback function that tells RocksDB when a key older than some timestamp can be dropped.
  • Would it be cleaner to add a new ValueType?
  • Should we add new Get / Put overloads that take the timestamp as an argument instead of putting it in ReadOptions / WriteOptions?
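
As one possible shape for the drop callback mentioned in the list above, the sketch below applies a user-supplied predicate to the versions of a single key. Everything here is hypothetical (`Versions`, `CanDropFn`, `DropOldVersions`); it only illustrates the idea and assumes the newest version of a key is always retained.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <iterator>
#include <map>
#include <string>

using Versions = std::map<uint64_t, std::string>;  // timestamp -> value
using CanDropFn = std::function<bool(uint64_t ts)>;  // user-supplied policy

// Erase every version the policy allows dropping, except the newest one,
// which is always kept. Returns the number of versions removed.
inline std::size_t DropOldVersions(Versions* versions,
                                   const CanDropFn& can_drop) {
  if (versions->empty()) return 0;
  std::size_t dropped = 0;
  auto newest = std::prev(versions->end());
  for (auto it = versions->begin(); it != newest;) {
    if (can_drop(it->first)) {
      it = versions->erase(it);  // erase returns the next valid iterator
      ++dropped;
    } else {
      ++it;
    }
  }
  return dropped;
}
```

A policy such as `[](uint64_t ts) { return ts < cutoff; }` would discard history older than a cutoff. Reads at timestamps below the oldest retained version would then find nothing, so the policy has to be chosen with the oldest live reader in mind.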

sagar0 added the WIP (Work in progress) label Feb 2, 2019
adamretter (Collaborator) commented

@sagar this looks very interesting to me. Could you comment on how you see this being used?

sagar commented Feb 2, 2019 via email

adamretter (Collaborator) commented Feb 2, 2019

@sagar so would this allow users to implement MVCC-aware structures on top of RocksDB? For example, if I wanted to cache some key/value pairs in memory, I would need to know their version (or timestamp) so that my transaction reads the correct version.

sagar0 changed the title from "Support user provided timestamps (versions)" to "Support user provided timestamps" Feb 4, 2019

sagar0 (Contributor, Author) commented May 29, 2019

This PR was a prototype to show how a user-provided timestamps feature could be implemented in RocksDB.
@riversand963 is working on user-provided timestamps in #5079, so I am closing this now.

sagar0 closed this May 29, 2019
facebook-github-bot pushed a commit that referenced this pull request Jun 6, 2019
Summary:
It's useful to be able to (optionally) associate key-value pairs with user-provided timestamps. This PR is an early effort towards this goal and continues the work of #4942. A suite of new unit tests exists in DBBasicTestWithTimestampWithParam. Timestamp support requires the user to provide the timestamp as a slice in `ReadOptions` and `WriteOptions`. All timestamps in the same database must share the same length and format, and the user is responsible for providing a comparator (Comparator) to order the <key, timestamp> tuples. Once the database is created, the format and length of the timestamp cannot change (at least for now).
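
For illustration, an ordering over <key, timestamp> tuples might look like the sketch below. It is a standalone function rather than an implementation of the RocksDB `Comparator` interface, and it makes several assumptions that the summary does not state: the timestamp is a fixed 8-byte little-endian suffix of the key, user keys are compared bytewise, and for equal user keys the newer timestamp sorts first. `kTsSize`, `EncodeTs`, `DecodeTs`, and `CompareKeyWithTs` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>

constexpr std::size_t kTsSize = 8;  // assumed fixed timestamp length

// Encode a timestamp as kTsSize little-endian bytes.
inline std::string EncodeTs(uint64_t ts) {
  std::string out;
  for (std::size_t i = 0; i < kTsSize; ++i) {
    out.push_back(static_cast<char>((ts >> (8 * i)) & 0xff));
  }
  return out;
}

// Decode a kTsSize-byte little-endian timestamp.
inline uint64_t DecodeTs(std::string_view ts) {
  assert(ts.size() == kTsSize);
  uint64_t v = 0;
  for (std::size_t i = 0; i < kTsSize; ++i) {
    v |= static_cast<uint64_t>(static_cast<unsigned char>(ts[i])) << (8 * i);
  }
  return v;
}

// Compare two keys of the form user_key | timestamp.
// Returns <0, 0, or >0. User keys are ordered bytewise ascending; for equal
// user keys, the larger (newer) timestamp sorts first.
inline int CompareKeyWithTs(std::string_view a, std::string_view b) {
  assert(a.size() >= kTsSize && b.size() >= kTsSize);
  std::string_view key_a = a.substr(0, a.size() - kTsSize);
  std::string_view key_b = b.substr(0, b.size() - kTsSize);
  int c = key_a.compare(key_b);
  if (c != 0) return c;
  uint64_t ts_a = DecodeTs(a.substr(a.size() - kTsSize));
  uint64_t ts_b = DecodeTs(b.substr(b.size() - kTsSize));
  if (ts_a > ts_b) return -1;
  if (ts_a < ts_b) return 1;
  return 0;
}
```

Sorting newer timestamps first means a forward scan reaches the most recent version of a key before older ones, which is one reason a fixed timestamp length and format make the comparator simple to write.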

Test plan (on devserver):
```
$ COMPILE_WITH_ASAN=1 make -j32 all
$ ./db_basic_test --gtest_filter=Timestamp/DBBasicTestWithTimestampWithParam.PutAndGet/*
$ make check
```
All tests must pass.

We also run the following db_bench tests to check for regressions in Get/Put when timestamps are not enabled.
```
$ TEST_TMPDIR=/dev/shm ./db_bench -benchmarks=fillseq,readrandom -num=1000000
$ TEST_TMPDIR=/dev/shm ./db_bench -benchmarks=fillrandom -num=1000000
```
Repeat 6 times for both versions.

Results are as follows:
```
|        | readrandom | fillrandom |
| master | 16.77 MB/s | 47.05 MB/s |
| PR5079 | 16.44 MB/s | 47.03 MB/s |
```
Pull Request resolved: #5079

Differential Revision: D15132946

Pulled By: riversand963

fbshipit-source-id: 833a0d657eac21182f0f206c910a6438154c742c
vagogte pushed a commit to vagogte/rocksdb that referenced this pull request Jun 18, 2019