RocksDB

William Zhang edited this page Nov 24, 2017 · 3 revisions

Common

https://github.com/facebook/rocksdb/INSTALL.md
https://github.com/facebook/rocksdb/wiki/RocksJava-Basics

Check out a stable branch or tag, since the latest version (master) is not stable.

$ git checkout -b v4.11.2 v4.11.2

Windows

Run the following in the VS2013 x64 Native Tools Command Prompt:
D:\> mkdir build
D:\> cd build
D:\> cmake .. -G "Visual Studio 12 2013 Win64" -DJNI=1
D:\> msbuild /m rocksdb.sln

Linux

$ scl enable devtoolset-3 bash # Shell with the devtoolset-3 toolchain (Software Collections)
$ cmake .
$ make -j clean
$ make -j check
$ make -j rocksdbjava # For rocksdbjni

macOS

  1. Remove "set(SYSTEM_LIBS ${CMAKE_THREAD_LIBS_INIT} rt)" from CMakeLists.txt if it exists.
  2. Clone gflags 2.0, then configure, build, and "make install" it. "brew install gflags" alone is not enough for development. gflags is needed by the tools and utilities.
  3. brew install zlib lz4 snappy zstd
  4. See CMakeLists.txt and thirdparty.inc for optional features such as zstd. Enable them either by changing OFF to ON in CMakeLists.txt, or by passing them through CXXFLAGS and LDFLAGS as below.
$ rm -rf CMakeCache.txt CMakeFiles/
# Minimal configuration (gflags only):
$ CXXFLAGS="-DGFLAGS=google" LDFLAGS="-lgflags" cmake .
# Or, with jemalloc and all compressors enabled:
$ CXXFLAGS="-DGFLAGS=google -DJEMALLOC -DZLIB -DSNAPPY -DLZ4 -DZSTD" LDFLAGS="-lgflags -ljemalloc -lz -lsnappy -llz4 -lzstd" cmake .
$ make VERBOSE=1 -j

jdb_bench.sh (Sample code for macOS.)

$ make -j rocksdbjava
$ cd java
$ make db_bench
$ ./jdb_bench.sh
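  1. target/ can contain more than one rocksdbjni jar, so the find in jdb_bench.sh may match several files. Pin the jar explicitly in jdb_bench.sh:
    -ROCKS_JAR=`find target -name rocksdbjni*.jar`
    +ROCKS_JAR=target/rocksdbjni-5.4.0-osx.jar
        
  2. The compression libraries must also be present in target/, for example as a symlink: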
    libsnappy.dylib -> /usr/local/Cellar/snappy/1.1.4/lib/libsnappy.1.dylib
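
The jar selection in step 1 can be sketched as follows. This is only an illustration of why a glob over target/ is ambiguous: the helper name pick_platform_jar and the second file name in the list are assumptions, not something taken from the RocksDB build.

```python
from typing import List


def pick_platform_jar(jar_paths: List[str], platform_suffix: str = "-osx.jar") -> str:
    """Return the single jar whose name ends with the platform suffix.

    Raises ValueError when zero or several jars match, instead of silently
    passing a multi-line value on as the classpath.
    """
    matches = [p for p in jar_paths if p.endswith(platform_suffix)]
    if len(matches) != 1:
        raise ValueError(f"expected exactly one {platform_suffix} jar, found {matches}")
    return matches[0]


# Hypothetical contents of target/ (the second name is an assumption).
candidates = [
    "target/rocksdbjni-5.4.0-osx.jar",
    "target/rocksdbjni-5.4.0-sources.jar",
]

ROCKS_JAR = pick_platform_jar(candidates)
print(ROCKS_JAR)
```

Hard-coding the jar path, as in the diff above, is the simpler fix when only one platform is involved.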
        

Utility

$ brew install rocksdb
# Assuming there is a db with <k, v> = <long in big endian, int in little endian>
$ rocksdb_ldb dump --db=/tmp/long-int/ --hex >x
$ head x
0x0000000000000000 ==> 0x00000000
0x0000000000000001 ==> 0x01000000
0x0000000000000002 ==> 0x02000000
0x0000000000000003 ==> 0x03000000
0x0000000000000004 ==> 0x04000000
0x0000000000000005 ==> 0x05000000
0x0000000000000006 ==> 0x06000000
0x0000000000000007 ==> 0x07000000
0x0000000000000008 ==> 0x08000000
0x0000000000000009 ==> 0x09000000

$ rocksdb_dump --db_path=long-int/ --dump_location=/tmp/dump.out
$ rocksdb_undump --db_path=long-int-new/ --dump_location=/tmp/dump.out
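
To make the <k, v> layout concrete, here is a minimal sketch that reproduces the hex lines printed by rocksdb_ldb dump above using only the Python standard library. It assumes the key is an 8-byte signed long and the value a 4-byte signed int equal to the key, which is what the dump suggests. The function names are made up for illustration.

```python
import struct


def encode_key(k: int) -> bytes:
    """8-byte signed integer, big endian (most significant byte first)."""
    return struct.pack(">q", k)


def encode_value(v: int) -> bytes:
    """4-byte signed integer, little endian (least significant byte first)."""
    return struct.pack("<i", v)


def dump_line(k: int, v: int) -> str:
    """Format one pair the way the hex dump lines above look."""
    return f"0x{encode_key(k).hex().upper()} ==> 0x{encode_value(v).hex().upper()}"


for i in range(3):
    print(dump_line(i, i))
```

Reading the value 0x01000000 as little endian gives 1, while the same bytes read as big endian would give 16777216, so it matters which byte order each side of the pair uses.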

Flush

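4-byte integer keys (with 4-byte integer values) were stored in big-endian and in little-endian form while data was inserted continuously. Two things stood out:

  1. With big-endian keys, the L0 files written are small, about 1M each. With little-endian keys, the L0 files are larger, about 4M each.
  2. With big-endian keys there are many files, all of roughly the same size, as if no compaction had happened. With little-endian keys there are noticeably fewer files.

From the RocksDB log: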

[default] [JOB 2] Flushing memtable with next log file: 6
EVENT_LOG_v1 {"time_micros": 2495769399809332, "job": 2, "event": "flush_started", "num_memtables": 1, "num_entries": 117247, "num_deletes": 0, "memory_usage": 4065400}

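In both cases the memtable's memory_usage is about 4M, which matches the configured write_buffer_size, yet the flushed files differ markedly in size. The exact cause still has to be tracked down in the code, starting from Status FlushJob::WriteLevel0Table().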
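
As a quick check of the numbers in that log entry, the sketch below parses the EVENT_LOG_v1 payload and derives the memtable size in MiB and the average memory per entry. It assumes the payload is plain JSON with ASCII quotes, as in a real LOG file. parse_event is a hypothetical helper, not part of any RocksDB tooling.

```python
import json

# The flush_started event from the log above.
LOG_LINE = (
    'EVENT_LOG_v1 {"time_micros": 2495769399809332, "job": 2, '
    '"event": "flush_started", "num_memtables": 1, "num_entries": 117247, '
    '"num_deletes": 0, "memory_usage": 4065400}'
)


def parse_event(line: str) -> dict:
    """Return the JSON object that follows the EVENT_LOG_v1 marker."""
    marker = "EVENT_LOG_v1 "
    payload = line[line.index(marker) + len(marker):]
    return json.loads(payload)


event = parse_event(LOG_LINE)
mib = event["memory_usage"] / (1024 * 1024)
bytes_per_entry = event["memory_usage"] / event["num_entries"]
print(f"{mib:.2f} MiB in memtable, {bytes_per_entry:.1f} bytes per entry")
```

Running the same calculation on the flush events of both runs would show whether the memtables really hold comparable amounts of data before they are written out.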

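With big-endian keys the records are written in essentially sorted order, so different L0 files do not overlap in key range. When they are merged into L1 and the levels below, the files keep their original size (they are copied over directly).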
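
The non-overlap argument can be illustrated with a small simulation. It assumes the keys are consecutive unsigned 4-byte integers inserted in increasing order, and that each flush covers a fixed number of inserts. Both are simplifications. Python's bytes comparison is lexicographic, like RocksDB's default bytewise comparator. The function names and the sizes used are made up for the example.

```python
import struct
from typing import List, Tuple

KeyRange = Tuple[bytes, bytes]


def key_ranges(fmt: str, total: int, per_file: int) -> List[KeyRange]:
    """Simulate flushes: every file holds per_file consecutively inserted
    integers; return the (smallest, largest) encoded key of each file."""
    ranges = []
    for start in range(0, total, per_file):
        stop = min(start + per_file, total)
        keys = sorted(struct.pack(fmt, i) for i in range(start, stop))
        ranges.append((keys[0], keys[-1]))
    return ranges


def count_overlaps(ranges: List[KeyRange]) -> int:
    """Count pairs of files whose [smallest, largest] key ranges intersect."""
    overlaps = 0
    for i in range(len(ranges)):
        for j in range(i + 1, len(ranges)):
            a_lo, a_hi = ranges[i]
            b_lo, b_hi = ranges[j]
            if a_lo <= b_hi and b_lo <= a_hi:
                overlaps += 1
    return overlaps


big = count_overlaps(key_ranges(">I", 4000, 1000))     # big-endian keys
little = count_overlaps(key_ranges("<I", 4000, 1000))  # little-endian keys
print("overlapping file pairs: big endian =", big, ", little endian =", little)
```

With big-endian encoding the byte order matches the numeric order, so each simulated file covers its own disjoint key range. With little-endian encoding the low byte comes first, so every file spans almost the whole key space and all files overlap each other.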
