Skip to content

Commit

Permalink
Problem: (memiavl) offset table is not used to reduce node size
Browse files Browse the repository at this point in the history
Solution:
- store offset table inside keys/values file, so we can reference it with leaf index in nodes.
- leverage the property of post-order traversal to further reduce some offsets.
- specialize `Get` operation for persisted node to improve query performance.
- add native-endian implementation to improve performance.

in the end, the on-disk IAVL tree query performance is on-par with tidwall/btree library
which is used for cache store.
  • Loading branch information
yihuang committed Feb 24, 2023
1 parent 05cff30 commit 9b03c6c
Show file tree
Hide file tree
Showing 19 changed files with 835 additions and 170 deletions.
4 changes: 4 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,10 @@

- [#833](https://github.com/crypto-org-chain/cronos/pull/833) Fix rollback command.

### Improvements

- [#890](https://github.com/crypto-org-chain/cronos/pull/890) reduce node size in memiavl and improve performance

*Feb 09, 2022*

## v1.0.4
Expand Down
3 changes: 3 additions & 0 deletions go.mod
Original file line number Diff line number Diff line change
Expand Up @@ -42,6 +42,7 @@ require (
github.com/99designs/go-keychain v0.0.0-20191008050251-8e49817e8af4 // indirect
github.com/99designs/keyring v1.2.1 // indirect
github.com/ChainSafe/go-schnorrkel v0.0.0-20200405005733-88cbf1b4c40d // indirect
github.com/RoaringBitmap/roaring v1.2.2 // indirect
github.com/StackExchange/wmi v1.2.1 // indirect
github.com/VictoriaMetrics/fastcache v1.6.0 // indirect
github.com/Workiva/go-datastructures v1.0.53 // indirect
Expand All @@ -51,6 +52,7 @@ require (
github.com/beorn7/perks v1.0.1 // indirect
github.com/bgentry/go-netrc v0.0.0-20140422174119-9fd32a8b3d3d // indirect
github.com/bgentry/speakeasy v0.1.1-0.20220910012023-760eaf8b6816 // indirect
github.com/bits-and-blooms/bitset v1.2.2 // indirect
github.com/btcsuite/btcd v0.23.4 // indirect
github.com/btcsuite/btcd/btcec/v2 v2.3.2 // indirect
github.com/btcsuite/btcd/btcutil v1.1.3 // indirect
Expand Down Expand Up @@ -142,6 +144,7 @@ require (
github.com/mitchellh/go-homedir v1.1.0 // indirect
github.com/mitchellh/go-testing-interface v1.14.1 // indirect
github.com/mitchellh/mapstructure v1.5.0 // indirect
github.com/mschoch/smat v0.2.0 // indirect
github.com/mtibben/percent v0.2.1 // indirect
github.com/olekukonko/tablewriter v0.0.5 // indirect
github.com/pelletier/go-toml v1.9.5 // indirect
Expand Down
15 changes: 15 additions & 0 deletions go.sum
Original file line number Diff line number Diff line change
Expand Up @@ -220,13 +220,16 @@ github.com/Microsoft/go-winio v0.6.0 h1:slsWYD/zyx7lCXoZVlvQrj0hPTM1HI4+v1sIda2y
github.com/Nvveen/Gotty v0.0.0-20120604004816-cd527374f1e5 h1:TngWCqHvy9oXAN6lEVMRuU21PR1EtLVZJmdB18Gu3Rw=
github.com/OneOfOne/xxhash v1.2.2 h1:KMrpdQIwFcEqXDklaen+P1axHaj9BSKzvpUUfnHldSE=
github.com/OneOfOne/xxhash v1.2.2/go.mod h1:HSdplMjZKSmBqAxg5vPj2TmRDmfkzw+cTzAElWljhcU=
github.com/RoaringBitmap/roaring v1.2.2 h1:RT+1qfb7a8rkOIxPnyJdvU4G8Ynmhc2YYP6MvzqEtwk=
github.com/RoaringBitmap/roaring v1.2.2/go.mod h1:plvDsJQpxOC5bw8LRteu/MLWHsHez/3y6cubLI4/1yE=
github.com/Shopify/sarama v1.19.0/go.mod h1:FVkBWblsNy7DGZRfXLU0O9RCGt5g3g3yEuWXgklEdEo=
github.com/Shopify/toxiproxy v2.1.4+incompatible/go.mod h1:OXgGpZ6Cli1/URJOF1DMxUHB2q5Ap20/P/eIdh4G0pI=
github.com/StackExchange/wmi v0.0.0-20180116203802-5d049714c4a6/go.mod h1:3eOhrUMpNV+6aFIbp5/iudMxNCF27Vw2OZgy4xEx0Fg=
github.com/StackExchange/wmi v1.2.1 h1:VIkavFPXSjcnS+O8yTq7NI32k0R5Aj+v39y29VYDOSA=
github.com/StackExchange/wmi v1.2.1/go.mod h1:rcmrprowKIVzvc+NUiLncP2uuArMWLCbu9SBzvHz7e8=
github.com/VictoriaMetrics/fastcache v1.6.0 h1:C/3Oi3EiBCqufydp1neRZkqcwmEiuRT9c3fqvvgKm5o=
github.com/VictoriaMetrics/fastcache v1.6.0/go.mod h1:0qHz5QP0GMX4pfmMA/zt5RgfNuXJrTP0zS7DqpHGGTw=
github.com/VictoriaMetrics/metrics v1.23.1 h1:/j8DzeJBxSpL2qSIdqnRFLvQQhbJyJbbEi22yMm7oL0=
github.com/VividCortex/gohistogram v1.0.0 h1:6+hBz+qvs0JOrrNhhmR7lFxo5sINxBCGXrdtl/UvroE=
github.com/VividCortex/gohistogram v1.0.0/go.mod h1:Pf5mBqqDxYaXu3hDrrU+w6nw50o/4+TcAqDqk/vUH7g=
github.com/Workiva/go-datastructures v1.0.53 h1:J6Y/52yX10Xc5JjXmGtWoSSxs3mZnGSaq37xZZh7Yig=
Expand Down Expand Up @@ -281,6 +284,9 @@ github.com/bgentry/go-netrc v0.0.0-20140422174119-9fd32a8b3d3d/go.mod h1:6QX/PXZ
github.com/bgentry/speakeasy v0.1.0/go.mod h1:+zsyZBPWlz7T6j88CTgSN5bM796AkVf0kBD4zp0CCIs=
github.com/bgentry/speakeasy v0.1.1-0.20220910012023-760eaf8b6816 h1:41iFGWnSlI2gVpmOtVTJZNodLdLQLn/KsJqFvXwnd/s=
github.com/bgentry/speakeasy v0.1.1-0.20220910012023-760eaf8b6816/go.mod h1:+zsyZBPWlz7T6j88CTgSN5bM796AkVf0kBD4zp0CCIs=
github.com/bits-and-blooms/bitset v1.2.0/go.mod h1:gIdJ4wp64HaoK2YrL1Q5/N7Y16edYb8uY+O0FJTyyDA=
github.com/bits-and-blooms/bitset v1.2.2 h1:J5gbX05GpMdBjCvQ9MteIg2KKDExr7DrgK+Yc15FvIk=
github.com/bits-and-blooms/bitset v1.2.2/go.mod h1:gIdJ4wp64HaoK2YrL1Q5/N7Y16edYb8uY+O0FJTyyDA=
github.com/bmizerany/pat v0.0.0-20170815010413-6226ea591a40/go.mod h1:8rLXio+WjiTceGBHIoTvn60HIbs7Hm7bcHjyrSqYB9c=
github.com/boltdb/bolt v1.3.1/go.mod h1:clJnj/oiGkjum5o1McbSZDSLxVThjynRyGBgiAx27Ps=
github.com/btcsuite/btcd v0.0.0-20190315201642-aa6e0f35703c/go.mod h1:DrZx5ec/dmnfpw9KyYoQyYo7d0KEvTkk/5M/vbZjAr8=
Expand Down Expand Up @@ -317,6 +323,7 @@ github.com/btcsuite/websocket v0.0.0-20150119174127-31079b680792/go.mod h1:ghJtE
github.com/btcsuite/winsvc v1.0.0/go.mod h1:jsenWakMcC0zFBFurPLEAyrnc/teJEM1O46fmI40EZs=
github.com/bwesterb/go-ristretto v1.2.0/go.mod h1:fUIoIZaG73pV5biE2Blr2xEzDoMj7NFEuV9ekS419A0=
github.com/c-bata/go-prompt v0.2.2/go.mod h1:VzqtzE2ksDBcdln8G7mk2RX9QyGjH+OVqOCSiVIqS34=
github.com/c2h5oh/datasize v0.0.0-20220606134207-859f65c6625b h1:6+ZFm0flnudZzdSE0JxlhR2hKnGPcNB35BjQf4RYQDY=
github.com/casbin/casbin/v2 v2.1.2/go.mod h1:YcPU1XXisHhLzuxH9coDNf2FbKpjGlbCg3n9yuLkIJQ=
github.com/cenkalti/backoff v2.2.1+incompatible h1:tNowT99t7UNflLxfYYSlKYsBpXdEet03Pg2g16Swow4=
github.com/cenkalti/backoff v2.2.1+incompatible/go.mod h1:90ReRw6GdpyfrHakVjL/QHaoyV4aDUVVkXQJJJ3NXXM=
Expand Down Expand Up @@ -840,6 +847,7 @@ github.com/labstack/gommon v0.3.0/go.mod h1:MULnywXg0yavhxWKc+lOruYdAhDwPK9wf0OL
github.com/leanovate/gopter v0.2.9/go.mod h1:U2L/78B+KVFIx2VmW6onHJQzXtFb+p5y3y2Sh+Jxxv8=
github.com/ledgerwatch/erigon-lib v0.0.0-20230210071639-db0e7ed11263 h1:LGEzZvf33Y1NhuP5+jI/ni9l1TFS6oYPDilgy74NusM=
github.com/ledgerwatch/erigon-lib v0.0.0-20230210071639-db0e7ed11263/go.mod h1:OXgMDuUo2lZ3NpH29ZvMYbk+LxFd5ffDl2Z2mGMuY/I=
github.com/ledgerwatch/log/v3 v3.7.0 h1:aFPEZdwZx4jzA3+/Pf8wNDN5tCI0cIolq/kfvgcM+og=
github.com/leodido/go-urn v1.2.0 h1:hpXL4XnriNwQ/ABnpepYM/1vCLWNDfUNts8dX3xTG6Y=
github.com/leodido/go-urn v1.2.0/go.mod h1:+8+nEpDfqqsY+g338gtMEUOtuK+4dEMhiQEgxpxOKII=
github.com/lib/pq v1.0.0/go.mod h1:5WUZQaWbwv1U+lTReE5YruASi9Al49XbQIvNi/34Woo=
Expand Down Expand Up @@ -920,6 +928,8 @@ github.com/modern-go/reflect2 v1.0.2 h1:xBagoLtFs94CBntxluKeaWgTMpvLxC4ur3nMaC9G
github.com/modern-go/reflect2 v1.0.2/go.mod h1:yWuevngMOJpCy52FWWMvUC8ws7m/LJsjYzDa0/r8luk=
github.com/modocache/gover v0.0.0-20171022184752-b58185e213c5/go.mod h1:caMODM3PzxT8aQXRPkAt8xlV/e7d7w8GM5g0fa5F0D8=
github.com/mschoch/smat v0.0.0-20160514031455-90eadee771ae/go.mod h1:qAyveg+e4CE+eKJXWVjKXM4ck2QobLqTDytGJbLLhJg=
github.com/mschoch/smat v0.2.0 h1:8imxQsjDm8yFEAVBe7azKmKSgzSkZXDuKkSq9374khM=
github.com/mschoch/smat v0.2.0/go.mod h1:kc9mz7DoBKqDyiRL7VZN8KvXQMWeTaVnttLRXOlotKw=
github.com/mtibben/percent v0.2.1 h1:5gssi8Nqo8QU/r2pynCm+hBQHpkB/uNK7BJCFogWdzs=
github.com/mtibben/percent v0.2.1/go.mod h1:KG9uO+SZkUp+VkRHsCdYQV3XSZrrSpR3O9ibNBTZrns=
github.com/mwitkow/go-conntrack v0.0.0-20161129095857-cc309e4a2223/go.mod h1:qRWi+5nqEBWmkhHvq77mSJWrCKwh8bxhgT7d/eI7P4U=
Expand Down Expand Up @@ -976,6 +986,7 @@ github.com/pascaldekloe/goe v0.0.0-20180627143212-57f6aae5913c/go.mod h1:lzWF7FI
github.com/pascaldekloe/goe v0.1.0 h1:cBOtyMzM9HTpWjXfbbunk26uA6nG3a8n06Wieeh0MwY=
github.com/pascaldekloe/goe v0.1.0/go.mod h1:lzWF7FIEvWOWxwDKqyGYQf6ZUaNfKdP144TG7ZOy1lc=
github.com/paulbellamy/ratecounter v0.2.0/go.mod h1:Hfx1hDpSGoqxkVVpBi/IlYD7kChlfo5C6hzIHwPqfFE=
github.com/pbnjay/memory v0.0.0-20210728143218-7b4eea64cf58 h1:onHthvaw9LFnH4t2DcNVpwGmV9E1BkGknEliJkfwQj0=
github.com/pborman/uuid v1.2.0/go.mod h1:X/NO0urCmaxf9VXbdlT7C2Yzkj2IKimNn4k+gtPdI/k=
github.com/pelletier/go-toml v1.2.0/go.mod h1:5z9KED0ma1S8pY6P1sdut58dfprrGBbd/94hg7ilaic=
github.com/pelletier/go-toml v1.9.5 h1:4yBQzkHv+7BHq2PQUZF3Mx0IYxG7LsP222s7Agd3ve8=
Expand Down Expand Up @@ -1157,6 +1168,7 @@ github.com/tklauser/numcpus v0.2.2/go.mod h1:x3qojaO3uyYt0i56EW/VUYs7uBvdl2fkfZF
github.com/tklauser/numcpus v0.4.0 h1:E53Dm1HjH1/R2/aoCtXtPgzmElmn51aOkhCFSuZq//o=
github.com/tklauser/numcpus v0.4.0/go.mod h1:1+UI3pD8NW14VMwdgJNJ1ESk2UnwhAnz5hMwiKKqXCQ=
github.com/tmc/grpc-websocket-proxy v0.0.0-20170815181823-89b8d40f7ca8/go.mod h1:ncp9v5uamzpCO7NfCPTXjqaC+bZgJeR0sMTm6dMHP7U=
github.com/torquem-ch/mdbx-go v0.27.5 h1:bbhXQGFCmoxbRDXKYEJwxSOOTeBKwoD4pFBUpK9+V1g=
github.com/ttacon/chalk v0.0.0-20160626202418-22c06c80ed31/go.mod h1:onvgF043R+lC5RZ8IT9rBXDaEDnpnw/Cl+HFiw+v/7Q=
github.com/tv42/httpunix v0.0.0-20150427012821-b75d8614f926/go.mod h1:9ESjWnEqriFuLhtthL60Sar/7RFoluCcXsuvEwTV5KM=
github.com/tyler-smith/go-bip39 v1.0.1-0.20181017060643-dbb3b84ba2ef/go.mod h1:sJ5fKU0s6JVwZjjcUEX2zFOnvq0ASQ2K9Zr6cf67kNs=
Expand All @@ -1174,8 +1186,10 @@ github.com/urfave/cli v1.20.0/go.mod h1:70zkFmudgCuE/ngEzBv17Jvp/497gISqfk5gWijb
github.com/urfave/cli v1.22.1/go.mod h1:Gos4lmkARVdJ6EkW0WaNv/tZAAMe9V7XWyB60NtXRu0=
github.com/urfave/cli/v2 v2.3.0/go.mod h1:LJmUH05zAU44vOAcrfzZQKsZbVcdbOG8rtL3/XcUArI=
github.com/valyala/bytebufferpool v1.0.0/go.mod h1:6bBcMArwyJ5K/AmCkWv1jt77kVWyCJ6HpOuEn7z0Csc=
github.com/valyala/fastrand v1.1.0 h1:f+5HkLW4rsgzdNoleUOB69hyT9IlD2ZQh9GyDMfb5G8=
github.com/valyala/fasttemplate v1.0.1/go.mod h1:UQGH1tvbgY+Nz5t2n7tXsz52dQxojPUpymEIMZ47gx8=
github.com/valyala/fasttemplate v1.2.1/go.mod h1:KHLXt3tVN2HBp8eijSv/kGJopbvo7S+qRAEEKiv+SiQ=
github.com/valyala/histogram v1.2.0 h1:wyYGAZZt3CpwUiIb9AU/Zbllg1llXyrtApRS815OLoQ=
github.com/vmihailenco/msgpack/v5 v5.3.5/go.mod h1:7xyJ9e+0+9SaZT0Wt1RGleJXzli6Q/V5KbhBonMG9jc=
github.com/vmihailenco/tagparser/v2 v2.0.0/go.mod h1:Wri+At7QHww0WTrCBeu4J6bNtoV6mEfg5OIWRZA9qds=
github.com/willf/bitset v1.1.3/go.mod h1:RjeCKbqT1RxIR/KWY6phxZiaY1IyutSBfGjNPySAYV4=
Expand Down Expand Up @@ -1213,6 +1227,7 @@ go.opentelemetry.io/proto/otlp v0.7.0/go.mod h1:PqfVotwruBrMGOCsRd/89rSnXhoiJIqe
go.uber.org/atomic v1.3.2/go.mod h1:gD2HeocX3+yG+ygLZcrzQJaqmWj9AIm7n08wl/qW/PE=
go.uber.org/atomic v1.4.0/go.mod h1:gD2HeocX3+yG+ygLZcrzQJaqmWj9AIm7n08wl/qW/PE=
go.uber.org/atomic v1.5.0/go.mod h1:sABNBOSYdrvTF6hTgEIbc7YasKWGhgEQZyfxyTvoXHQ=
go.uber.org/atomic v1.10.0 h1:9qC72Qh0+3MqyJbAn8YU5xVq1frD8bn3JtD2oXtafVQ=
go.uber.org/multierr v1.1.0/go.mod h1:wR5kodmAFQ0UK8QlbwjlSNy0Z68gJhDJUG5sjR94q/0=
go.uber.org/multierr v1.3.0/go.mod h1:VgVr7evmIr6uPjLBxg28wmKNXyqE9akIJ5XnfpiKl+4=
go.uber.org/tools v0.0.0-20190618225709-2cfd321de3ee/go.mod h1:vJERXedbb3MVM5f9Ejo0C68/HhF8uaILCdgjnY+goOA=
Expand Down
21 changes: 9 additions & 12 deletions gomod2nix.toml
Original file line number Diff line number Diff line change
Expand Up @@ -34,6 +34,9 @@ schema = 3
[mod."github.com/ChainSafe/go-schnorrkel"]
version = "v0.0.0-20200405005733-88cbf1b4c40d"
hash = "sha256-i8RXZemJGlSjBT35oPm0SawFiBoIU5Pkq5xp4n/rzCY="
[mod."github.com/RoaringBitmap/roaring"]
version = "v1.2.2"
hash = "sha256-fU7+98nt+xznngUBczAVb0eZosu1qGbFlXROKQbCOmE="
[mod."github.com/StackExchange/wmi"]
version = "v1.2.1"
hash = "sha256-1BoEeWAWyebH+1mMuyPhWZut8nWHb6r73MgcqlGuUEY="
Expand Down Expand Up @@ -64,6 +67,9 @@ schema = 3
[mod."github.com/bgentry/speakeasy"]
version = "v0.1.1-0.20220910012023-760eaf8b6816"
hash = "sha256-Tx3sPuhsoVwrCfJdIwf4ipn7pD92OQNYvpCxl1Z9Wt0="
[mod."github.com/bits-and-blooms/bitset"]
version = "v1.2.2"
hash = "sha256-pKeW8JiAygNzSWozURhmVmbUj0VvOjAxjW1xTCdKL1w="
[mod."github.com/btcsuite/btcd"]
version = "v0.23.4"
hash = "sha256-xP7TLBdOoUIjg5Q3MOjbT5P9tkCWjsd4bWgZLp539Wo="
Expand Down Expand Up @@ -337,18 +343,6 @@ schema = 3
version = "v1.7.15-0.20230222024938-b61261a9193b"
hash = "sha256-Hry5mpO8WqCuYZ0zCnOt6kopDGprc7/nI318A2D+Kk0="
replaced = "github.com/linxGnu/grocksdb"
[mod."github.com/lispad/go-generics-tools"]
version = "v1.1.0"
hash = "sha256-MPeqIrDVGv6d/mBK4o48eaRc8Qga5l9i1tpUyrgMoII="
[mod."github.com/lucasjones/reggen"]
version = "v0.0.0-20180717132126-cdb49ff09d77"
hash = "sha256-gX55KvGzJkyghKIhmQFnwCI9mXYt/FtDgI/sQ4xNfrE="
[mod."github.com/lufeee/execinquery"]
version = "v1.2.1"
hash = "sha256-Hg+/0StXgoflSxwiw96IYhYybYy26o1QA66a5pMsswo="
[mod."github.com/lyft/protoc-gen-validate"]
version = "v0.0.13"
hash = "sha256-JhFMmEaP1amtJJBLWFVqjjHeHuAHRP0qwLMMFX2b3FM="
[mod."github.com/magiconair/properties"]
version = "v1.8.6"
hash = "sha256-xToSfpuePctkTdhJtsuKIEkXwfMZbnkFT98ahIfd4wY="
Expand Down Expand Up @@ -382,6 +376,9 @@ schema = 3
[mod."github.com/mitchellh/mapstructure"]
version = "v1.5.0"
hash = "sha256-ztVhGQXs67MF8UadVvG72G3ly0ypQW0IRDdOOkjYwoE="
[mod."github.com/mschoch/smat"]
version = "v0.2.0"
hash = "sha256-DZvUJXjIcta3U+zxzgU3wpoGn/V4lpBY7Xme8aQUi+E="
[mod."github.com/mtibben/percent"]
version = "v0.2.1"
hash = "sha256-Zj1lpCP6mKQ0UUTMs2By4LC414ou+iJzKkK+eBHfEcc="
Expand Down
51 changes: 31 additions & 20 deletions memiavl/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,10 @@
* 27 Jan 2023:
* Update metadata file format
* Encode key length with 4 bytes instead of 2.
* 21 Feb 2023:
* Reduce node size (without hash) from 32bytes to 16bytes.
* Append offset table to the end of keys/values file, try elias-fano coding for the values one.


## The Journey

Expand Down Expand Up @@ -65,48 +69,53 @@ IAVL snapshot is composed by four files:
```
magic: uint32
format: uint32
version: uint64
version: uint32
root node index: uint32
```

- `nodes`, array of fixed size(64bytes) nodes, the node format is like this:
- `nodes`, array of fixed size(16+32bytes) nodes, the node format is like this:

```
height : uint32 // padded to 4bytes
version : uint32
size : uint32
key_node : uint32
hash : [32]byte
```
The node has fixed length, can be indexed directly. The nodes reference each other with the node index, nodes are written in post-order, so the root node is always placed at the end.

For branch node, the `key_node` field reference the minimal leaf node in the right branch for branch node, for leaf node, it's the leaf index instead, which can be used to find key and value in `keys` and `values` file.

The branch node don't need extra left/right children references, those are derived on the fly based on property of post-order traversal:

```
height : uint8 // padded to 4bytes
version : uint32
size : uint64
key : uint64 // offset in keys file
left : uint32 // inner node only
right : uint32 // inner node only
value : uint64 offset // offset in values file, leaf node only
hash : [32]byte
right child index = self index - 1
left child index = key_node - 1
```
The node has fixed length, can be indexed directly. The nodes reference each other with the index, nodes are written in post-order, so the root node is always placed at the end.

Some integers are using `uint32`, should be enough in forseeable future, but could be changed to `uint64` to be safer.

The implementation will read the mmap-ed content in a zero-copy way, won't use extra node cache, it will only rely on the OS page cache.

- `keys`, sequence of length prefixed leaf node keys, ordered and no duplication.
- `keys`, sequence of leaf node keys, ordered and no duplication, the offsets are appended to the end of the file, user can look up the key offset by leaf node index.

```
size: uint32
payload
*repeat*
key offset: uint32
*repeat*
offset: uint64 // begin offset of the offsets table
```

Key size is encoded in `uint32`, so the maximum key length supported is `1<<32-1`, around 4G.

- `values`, sequence of length prefixed leaf node values.
- `values`, sequence of leaf node values, the offsets are encoded with elias-fano coding and appended to the end of the file, user can look up the key offset by leaf node index.

```
size: uint32
payload
*repeat*
offsets table encoded with elias-fano coding
offset: uint64 // begin offset of the offsets table
```

Value size is encoded in `uint32`, so maximum value length supported is `1<<32-1`, around 4G.

#### Compression

The items in snapshot reference with each other by file offsets, we can apply some block compression techniques to compress keys and values files while maintain random accessbility by uncompressed file offset, for example zstd's experimental seekable format[^1].
Expand All @@ -115,4 +124,6 @@ The items in snapshot reference with each other by file offsets, we can apply so

[VersionDB](../README.md) is to support query and iterating historical versions of key-values pairs, currently implemented with rocksdb's experimental user-defined timestamp feature, support query and iterate key-value pairs by version, it's an alternative way to support grpc query service, and much more compact than IAVL trees, similar in size with the compressed change set files.

[^1]: https://github.com/facebook/zstd/blob/dev/contrib/seekable_format/zstd_seekable_compression_format.md
After versiondb is fully integrated, IAVL tree don't need to serve queries at all, it don't need to store the values at all, just store the value hashes would be enough.

[^1]: https://github.com/facebook/zstd/blob/dev/contrib/seekable_format/zstd_seekable_compression_format.md
66 changes: 66 additions & 0 deletions memiavl/benchmark_offsets_test.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,66 @@
package memiavl

import (
"math/rand"
"testing"

"github.com/RoaringBitmap/roaring/roaring64"
"github.com/ledgerwatch/erigon-lib/recsplit/eliasfano32"
"github.com/stretchr/testify/require"
)

const TestN = 1000000

func genTestData(n int) []uint64 {
rand.Seed(0)
result := make([]uint64, n)
var offset uint64
for i := 0; i < n; i++ {
result[i] = offset
offset += 32 + uint64(rand.Int63n(15))
}
return result
}

func BenchmarkOffsetsRoaring(b *testing.B) {
data := genTestData(1000000)
builder := roaring64.New()
for _, n := range data {
builder.Add(n)
}
builder.RunOptimize()
bz, err := builder.ToBytes()
require.NoError(b, err)
// b.Logf("RoaringBitmap size %d", len(bz))

bm := roaring64.New()
require.NoError(b, bm.UnmarshalBinary(bz))
b.ResetTimer()
for i := 0; i < b.N; i++ {
// need two offsets to construct the slice
idx := uint64(i % (len(data) - 1))
_, err := bm.Select(idx)
require.NoError(b, err)
_, err = bm.Select(idx + 1)
require.NoError(b, err)
}
}

func BenchmarkOffsetsEliasfano32(b *testing.B) {
data := genTestData(1000000)
builder := eliasfano32.NewEliasFano(uint64(len(data)), data[len(data)-1])
for _, n := range data {
builder.AddOffset(n)
}
builder.Build()
bz := builder.AppendBytes(nil)
// b.Logf("EliasFano size %d", len(bz))

ef, _ := eliasfano32.ReadEliasFano(bz)
b.ResetTimer()
for i := 0; i < b.N; i++ {
// need two offsets to construct the slice
offset1, offset2 := ef.Get2(uint64(i % (len(data) - 1)))
_ = offset2 - offset1
}
}
Loading

0 comments on commit 9b03c6c

Please sign in to comment.