
Extend tikv load graph #24

Open
wants to merge 445 commits into base: master
445 commits
263e105
Merge branch 'latest' into property_filter
zhengyi-yang Apr 24, 2019
b082b5a
remove ldbc
zhengyi-yang Apr 24, 2019
c186871
Fixed all the warnings
KongzhangHao Apr 24, 2019
9d10034
Merged conflicts
KongzhangHao Apr 24, 2019
a25cdd9
Added functions to insert Vec<u8> as property
KongzhangHao Apr 24, 2019
07a4cb1
skip unkonwn fields when reading csv
zhengyi-yang Apr 24, 2019
21cb0d5
Merge branch 'latest' into property_filter
zhengyi-yang Apr 24, 2019
f104be0
rustfmt
zhengyi-yang Apr 24, 2019
9f95552
Added tests for insertion raw
KongzhangHao Apr 24, 2019
47cd149
Merge branch 'property_filter' of github.com:UNSW-database/rust_graph…
KongzhangHao Apr 24, 2019
a52e720
Added tests for raw insertion with cached property
KongzhangHao Apr 24, 2019
cd0e783
Simplify the property insertion functions
KongzhangHao Apr 24, 2019
caab670
use AsPef<Path> in SledProp
zhengyi-yang Apr 25, 2019
e78fa0c
Added open function for sled db with readonly option
xiamaomaoyu Apr 27, 2019
0690f95
Added compression option for opening sled
xiamaomaoyu Apr 27, 2019
959ddb5
Added snapshot flag to disable it
xiamaomaoyu Apr 27, 2019
98e4908
add compression and no_logs to sled
zhengyi-yang Apr 27, 2019
9fd8c02
remove no_log
zhengyi-yang Apr 27, 2019
66bf9a4
update sled property
zhengyi-yang Apr 27, 2019
e22b3ea
Added filter error displayment
KongzhangHao May 1, 2019
89d4ddb
Temproraily delete print statements for debugging
KongzhangHao May 2, 2019
f99485e
Optimised cypher parser
KongzhangHao May 2, 2019
f6eaf7e
Added print statements for debugging
KongzhangHao May 2, 2019
a769e49
Added unwrap() to trigger error for debugging
KongzhangHao May 2, 2019
c36698d
Added error handling to expression.get_result
KongzhangHao May 5, 2019
041515c
Added property cache for prefech and get values
KongzhangHao May 5, 2019
ba87c84
Fixed the test cases
KongzhangHao May 5, 2019
36d6ad5
Added dependency temp-dir
KongzhangHao May 5, 2019
e881778
Added disabled flag for property cache
KongzhangHao May 5, 2019
92db8ae
Deleted main function
KongzhangHao May 5, 2019
0593a98
Removed disabled flag from property cache
KongzhangHao May 5, 2019
6e9b468
replace HashMap with BTreeMap
zhengyi-yang May 6, 2019
3bc1870
Merge branch 'property_filter' of github.com:UNSW-database/rust_graph…
zhengyi-yang May 6, 2019
4c8af50
Removed print statements
KongzhangHao May 6, 2019
25ab5a3
Merge branch 'property_filter' of github.com:UNSW-database/rust_graph…
KongzhangHao May 6, 2019
74e5ddc
Added expresssion cache to store parsed expressions
KongzhangHao May 6, 2019
0613156
Changed getter of expression cache to option
KongzhangHao May 6, 2019
54880e6
Added Clone to expression cache
KongzhangHao May 6, 2019
14a7cf5
Changed from hashbrown to std::collection::hashmap
KongzhangHao May 6, 2019
4249323
Added sync and send for expression cache
KongzhangHao May 6, 2019
5f12602
Added rocksdb property graph
KongzhangHao May 6, 2019
4844b6d
Replaced panic with property error
KongzhangHao May 6, 2019
afa74a9
rustfmt
zhengyi-yang May 7, 2019
a7d9894
move tempdir to dev-dependencies
zhengyi-yang May 7, 2019
ddbb26e
Added u64 type operations for expression
KongzhangHao May 8, 2019
63fc6e2
parse string to json in reader
zhengyi-yang May 8, 2019
b7223a7
Empty expression for non-existing edges
KongzhangHao May 9, 2019
67edf24
Merge branch 'property_filter' of github.com:UNSW-database/rust_graph…
KongzhangHao May 9, 2019
6afcb49
Added debug info for get_edge_exp
KongzhangHao May 9, 2019
8039862
Added unfinished scan functions
KongzhangHao May 9, 2019
1d2ba2b
rustfmt
zhengyi-yang May 9, 2019
4bf08a0
Format code
KongzhangHao May 9, 2019
6c0d8c8
Remove useless file
KongzhangHao May 9, 2019
42b8314
Optimise imports
KongzhangHao May 10, 2019
4ad06ef
Changed trace to debug
KongzhangHao May 13, 2019
4aaa79a
remove Sled
zhengyi-yang May 13, 2019
9c6e857
Added stopping sign of prefetching
KongzhangHao May 15, 2019
3909899
Merge branch 'property_filter' of github.com:UNSW-database/rust_graph…
KongzhangHao May 15, 2019
6b5b96f
Added file path checking when openning db
KongzhangHao May 16, 2019
1bc5e13
Removed stopping of prefetching
KongzhangHao May 16, 2019
e63c9df
Return true for filters to test performance
KongzhangHao May 16, 2019
1f18c00
Filter returns true without getting property
KongzhangHao May 16, 2019
298ef48
Changed get by value to get by reference
KongzhangHao May 16, 2019
b3d38b1
Removed unnecessary reference
KongzhangHao May 16, 2019
ae50be4
Changed expression to be passed by reference
KongzhangHao May 16, 2019
6a774c8
Changed get_value to return Cow
KongzhangHao May 16, 2019
e7c2ff4
Changed filter to return true
KongzhangHao May 16, 2019
82a138f
Enabled get property
KongzhangHao May 16, 2019
6af7bc4
Return true when filtering
KongzhangHao May 17, 2019
f803ed1
Return true from filtering
KongzhangHao May 17, 2019
17e7f2e
Removed returns in filtering
KongzhangHao May 17, 2019
57c0067
Added send and sync to property cache
KongzhangHao May 17, 2019
25b9006
update eq
zhengyi-yang May 17, 2019
1da0db2
Property cache with version with index map
KongzhangHao May 17, 2019
f7a4b5e
Property cache with version of continious ids
KongzhangHao May 17, 2019
6f888fb
Property cache with version of purely hashmaps
KongzhangHao May 17, 2019
7955665
Merge branch 'property_filter' of github.com:UNSW-database/rust_graph…
KongzhangHao May 17, 2019
c402b70
Changed two hashmaps to vectors
KongzhangHao May 17, 2019
e813789
Added switch for node and edge disable
KongzhangHao May 17, 2019
33dfd0f
Added or to the rocksdb new
KongzhangHao May 17, 2019
d2d1791
Added conditions to prevent prefetch
KongzhangHao May 17, 2019
9898f5b
Stop initialisation when cahce disabled
KongzhangHao May 18, 2019
f84f907
Added disabling of expressions
KongzhangHao May 18, 2019
727ce9c
Get exp always returns a some value
KongzhangHao May 18, 2019
779d2d4
Moved disable determination to filtering function
KongzhangHao May 18, 2019
bd9b1e0
rustfmt
zhengyi-yang May 20, 2019
ce04c20
use Itertools::dedup in graph_vec
zhengyi-yang May 20, 2019
90cd475
add FakeProperty
zhengyi-yang May 20, 2019
da576a7
Added tests and remove warnings
KongzhangHao May 21, 2019
3ecf727
Add route function for property cache
KongzhangHao May 21, 2019
3930e34
Cargo fmt
KongzhangHao May 21, 2019
5c4c77b
Merged
KongzhangHao May 21, 2019
a79a781
Fix programming glitches.
longbinlai May 21, 2019
a2a13a0
Fixed bug in property cache
KongzhangHao May 21, 2019
1dbb7a3
Fixed life time bug with cache filtering
KongzhangHao May 21, 2019
43243c3
Increased parallelism for rocksdb
KongzhangHao May 21, 2019
7e4c962
Revert "Moved disable determination to filtering function"
KongzhangHao May 21, 2019
f90f698
Backtrace
KongzhangHao May 21, 2019
a8a51bb
Revert backtrace
KongzhangHao May 21, 2019
7a1b04e
Remove unnecessary rocksdb options.
longbinlai May 21, 2019
e66b319
Remove `route_fn` from PropertyCache
longbinlai May 23, 2019
9973d6c
Enabled exp cache get to handle undirected edge
KongzhangHao Jun 7, 2019
e3750f0
Merge from shared
KongzhangHao Jun 7, 2019
c9b9c5a
Merge conflicts
KongzhangHao Jun 7, 2019
efe1af6
Format code
KongzhangHao Jun 7, 2019
8cd98ab
Added instant caching for hash node cache
KongzhangHao Jun 11, 2019
e8eb153
Applied instant caching for edge property
KongzhangHao Jun 12, 2019
aade86d
Cargo fmt
KongzhangHao Jun 12, 2019
0d879b4
Fixed unit tests
KongzhangHao Jun 12, 2019
4e42cd1
Format code
KongzhangHao Jun 14, 2019
9778d90
Added crate lru
KongzhangHao Jun 14, 2019
eea6c59
Removed warnings
KongzhangHao Jun 14, 2019
2ebaac8
Removed cypher_tree.txt
KongzhangHao Jun 14, 2019
5b2e9d5
Merge pull request #14 from UNSW-database/property-filter-instant-cache
longbinlai Jun 14, 2019
ac70974
Lru cache crate moved to local project
KongzhangHao Jun 15, 2019
ed53115
Added LruNodeCache and LruEdgeCache
KongzhangHao Jun 15, 2019
5db534c
Changed default property cache to lru
KongzhangHao Jun 15, 2019
786f253
Removed set function from cache inerface
KongzhangHao Jun 17, 2019
eeae76a
Disabled rocksdb cache
KongzhangHao Jun 17, 2019
2cc9024
Disabled os buffer
KongzhangHao Jun 17, 2019
0eacb95
Minised the memory use of lru
KongzhangHao Jun 18, 2019
dcaa699
Format code
KongzhangHao Jun 18, 2019
e004098
Allow os buffer
KongzhangHao Jun 18, 2019
4abffd0
Enabled block cache
KongzhangHao Jun 18, 2019
7ee02cc
Creating the cache with given capacity directly
KongzhangHao Jun 18, 2019
8fc4e18
Added minimised edge cache and format code
KongzhangHao Jun 18, 2019
52582fd
Added resize for lru node and edge cache
KongzhangHao Jun 18, 2019
509914b
Cargo fmt
KongzhangHao Jun 18, 2019
34406d8
Merged conflicts
KongzhangHao Jun 18, 2019
e1d0ba6
Merge pull request #15 from UNSW-database/property-filter-minimised-l…
KongzhangHao Jun 19, 2019
1850ea0
Added print for debugging node cache
KongzhangHao Jun 26, 2019
bcd5186
Added lru node cache printing for debuging
KongzhangHao Jun 26, 2019
d47b80d
Added printing for lru capacity for debugging
KongzhangHao Jun 26, 2019
196f3a3
Added property error for zero capacity lru cache
KongzhangHao Jun 26, 2019
e6022dc
Added print statement for debugging
KongzhangHao Jun 26, 2019
d45aadf
Remove print statement for debugging
KongzhangHao Jun 26, 2019
8cdbd21
Added result blueprint parser
KongzhangHao Jul 1, 2019
4e94411
Add printing for debugging
KongzhangHao Jul 1, 2019
3791efd
Added print line
KongzhangHao Jul 1, 2019
bbeccf1
Added test for result parser
KongzhangHao Jul 1, 2019
5557480
Print out statement found
KongzhangHao Jul 1, 2019
6a2f269
Print out the result cypher line
KongzhangHao Jul 1, 2019
e3c2de6
Removed warnings
KongzhangHao Jul 1, 2019
3bbdeda
Cargo fmt
KongzhangHao Jul 1, 2019
4d7e1f1
Stop property parsing when meets RETURN
KongzhangHao Jul 2, 2019
9f17e8c
Added termination on return property
KongzhangHao Jul 2, 2019
65d1929
Fix bug in parser property
KongzhangHao Jul 2, 2019
154efe0
Remove useless return checking
KongzhangHao Jul 2, 2019
7758b44
Added error message for node not found
KongzhangHao Jul 4, 2019
e4e579d
Fixed bug with error displaying
KongzhangHao Jul 4, 2019
a0cd761
Added debug info for candidate vars
KongzhangHao Jul 4, 2019
3c4d185
Filtering is true if expression is empty
KongzhangHao Jul 4, 2019
e5f303d
Print out internal json value
KongzhangHao Jul 5, 2019
40f385e
Remove printing node value
KongzhangHao Jul 5, 2019
0953b10
Added case for * and count(*)
KongzhangHao Jul 7, 2019
3ed0675
Fixed bug with displaying star
KongzhangHao Jul 7, 2019
74aa3c1
Added print for debugging
KongzhangHao Jul 7, 2019
eb5f805
Fixed bug with getting largest node id
KongzhangHao Jul 7, 2019
58ce06d
Cargo fmt
KongzhangHao Jul 8, 2019
5e0ea5c
Removed filtering rust graph lib
KongzhangHao Jul 16, 2019
6d9a0d7
format code
zhengyi-yang Jul 17, 2019
5cac82b
feat:csv_reader support on hdfs
yuchen-ecnu Jul 22, 2019
1c6d4e2
fix:HDFSReader&&ignore test&&README.md
yuchen-ecnu Jul 23, 2019
b3bc5ae
fix:Readme && trait name && remove useless code&& fixing readme
yuchen-ecnu Jul 24, 2019
a6e4e85
fix:README.md
yuchen-ecnu Jul 24, 2019
4f53ff2
Update README.md
yuchen-ecnu Jul 24, 2019
9f5fafc
add and use to update to edition 2018 for supporting
zhengminlai Jul 26, 2019
8d8e488
move hdfs dependency to a separate repo
zhengyi-yang Jul 29, 2019
06dcda0
partial commit
zhengminlai Jul 29, 2019
4e68bae
tikv interfaces
zhengminlai Jul 29, 2019
25fcf4c
with no rocksdb dependency version(for testing on ecnu cluster)
zhengminlai Jul 29, 2019
67fe3d1
put back rocksdb
zhengminlai Jul 29, 2019
1740649
Finish tikv property
zhengminlai Jul 29, 2019
ce10e38
fix a bug
zhengminlai Jul 29, 2019
4f6c344
cargo fmt
zhengminlai Jul 29, 2019
7756bd7
add copyright
zhengminlai Jul 29, 2019
6d84fc6
small fix and add readme about how to use tikv
Jul 30, 2019
b21f6a0
small fix
Jul 30, 2019
f534478
fix:extract iterators and implemented for csv+hdfs
yuchen-ecnu Jul 30, 2019
407867b
add bench tikv
zhengminlai Jul 31, 2019
d6b4823
format code
zhengyi-yang Aug 1, 2019
255e152
Merge pull request #16 from UNSW-database/csv_reader_hdfs_support
zhengyi-yang Aug 1, 2019
9bbd229
Merge pull request #17 from UNSW-database/property_filter
zhengyi-yang Aug 1, 2019
83a2bd9
Merge pull request #18 from UNSW-database/hdfs-support
zhengyi-yang Aug 1, 2019
4d6eeb9
set hdfs as a feature
zhengyi-yang Aug 1, 2019
6cca291
remove lifetime
zhengyi-yang Aug 1, 2019
f351902
edit ReadGraph trait
zhengyi-yang Aug 1, 2019
2c04ff9
edit ReadGraph trait
zhengyi-yang Aug 1, 2019
7991c59
format code
zhengyi-yang Aug 1, 2019
ffc25ba
add ReadGraphTo trait
zhengyi-yang Aug 1, 2019
2d684e9
add Default to Reader
zhengyi-yang Aug 1, 2019
db2769e
fix typo
zhengyi-yang Aug 1, 2019
ef4f7c9
fix lifetime
zhengyi-yang Aug 1, 2019
2749b03
Revert "fix lifetime"
zhengyi-yang Aug 1, 2019
dab7ede
Merge branch 'hdfs-support' into dev
zhengyi-yang Aug 2, 2019
40593d8
add benchmark and update how-to-use-tikv.md
zhengminlai Aug 5, 2019
911e978
cargo fmt
zhengminlai Aug 5, 2019
5c07b5b
update readme
shl5133 Aug 5, 2019
235b4ca
update bench-tikv-rocksdb.mc
shl5133 Aug 5, 2019
2a1bbba
small fix
zhengminlai Aug 5, 2019
8efc27d
fix some problems in tikv_rocksdb_test
zhengminlai Aug 5, 2019
9a676a2
update readme
zhengminlai Aug 5, 2019
0ddb92d
batch get
zhengminlai Aug 6, 2019
2db73bc
add benchmark of tikv's batch_get operation
zhengminlai Aug 7, 2019
971ace0
small fix
zhengminlai Aug 7, 2019
5b93b26
add tikv's batch_get benchmark result on a single machine
zhengminlai Aug 7, 2019
b43e264
small fix
zhengminlai Aug 7, 2019
c2762d0
update CsvReader
zhengyi-yang Aug 7, 2019
fad24ad
add EmptyReader
zhengyi-yang Aug 7, 2019
1e8d269
add debug logging
zhengyi-yang Aug 7, 2019
f1e0948
remove Empty Reader
zhengyi-yang Aug 7, 2019
db7a694
fix:Allow using directory as data path in HDFSReader::new()
yuchen-ecnu Aug 8, 2019
9df7ebe
fix:Support reading directory in hdfsreader
yuchen-ecnu Aug 8, 2019
9e19e3b
Merge pull request #20 from UNSW-database/fix_reader_directory
yuchen-ecnu Aug 8, 2019
d17087a
update tikv interfaces
zhengminlai Aug 13, 2019
593dd9e
rm some dependencies and update readme
zhengminlai Aug 14, 2019
c446887
feat:Loading graph to tikv server through
yuchen-ecnu Aug 15, 2019
e31cdf2
feat:Loading graph to tikv server through
yuchen-ecnu Aug 15, 2019
161ba41
Merge branch 'tikv_load_graph' of github.com:UNSW-database/rust_graph…
yuchen-ecnu Aug 15, 2019
9c16b99
fix:add some comments
yuchen-ecnu Aug 15, 2019
4bd5699
fix:rustfmt
yuchen-ecnu Aug 15, 2019
ca96e27
fix:json->cbor & batch_size & merge dev
yuchen-ecnu Aug 16, 2019
a7596fa
fix:using cbor instead of bincode
yuchen-ecnu Aug 19, 2019
8830bac
update read trait for simple multi-threading support
zhengyi-yang Aug 20, 2019
971c001
fix bug of collect
zhengyi-yang Aug 20, 2019
b37293b
fix error of lifetime
zhengyi-yang Aug 20, 2019
35d248c
fix:hdfs file iterator
yuchen-ecnu Aug 20, 2019
d799a38
remove 'a
zhengyi-yang Aug 20, 2019
201a213
Merge pull request #22 from UNSW-database/update_read_trait
zhengyi-yang Aug 20, 2019
190a9d5
fix:graph_loader && initial batch_write test
yuchen-ecnu Aug 22, 2019
f8120c6
update logging for readers
zhengyi-yang Aug 22, 2019
58a84f7
fix variable names
zhengyi-yang Aug 22, 2019
67b0455
fix:parallel loading
yuchen-ecnu Sep 3, 2019
e82c4fd
fix:using tokio instead of blockon
yuchen-ecnu Sep 3, 2019
e9e4ca2
fix:Sharing tokio::Runtime between thread
yuchen-ecnu Sep 5, 2019
0a634af
fix:update graph load time
yuchen-ecnu Sep 6, 2019
5696f47
fix:using runtime in tikv_property && moving benchers to ./bench && t…
yuchen-ecnu Sep 9, 2019
8bc4929
12.07 added
Katherine2013 Dec 18, 2019
4c93662
12.08 added
Katherine2013 Dec 18, 2019
fd2e5ca
tmp version
Katherine2013 Dec 24, 2019
0a25804
for test only
Katherine2013 Dec 30, 2019
658a2c9
label related tests all passed
Katherine2013 Jan 13, 2020
be8b8c8
for debug only
Katherine2013 Mar 5, 2020
8263329
fix:setmap
yuchen-ecnu Mar 6, 2020
a88b96c
prefix_scan before test
Katherine2013 Mar 8, 2020
891fe09
why label can't match?
Katherine2013 Mar 13, 2020
377f2a0
Issue: Scan limit doesn't work
Katherine2013 Mar 17, 2020
3e983d9
UPDATE: update tikv_client to latest version
yuchen-ecnu Mar 28, 2020
5999a18
Add: a prefix scan function to get neighbors
Katherine2013 Mar 29, 2020
c966b15
Add: a data loader from .csv file to tikv
Katherine2013 Apr 14, 2020

36 changes: 29 additions & 7 deletions Cargo.toml
@@ -3,11 +3,11 @@ name = "rust_graph"
version = "0.1.9"
authors = ["Zhengyi Yang <zhengyi.yang@outlook.com>"]
autoexamples = true
edition="2018"

[features]
default = []
usize_id = []
ldbc = ["regex"]

[dependencies]
indexmap = { version = "1.0.2",features = ["serde-1"] }
@@ -20,17 +20,39 @@ bincode = "1.0.1"
log = "0.4"
csv = "1"
counter = "0.4.3"
regex = {version = "1", optional = true }
fnv = "1.0.6"
serde_cbor = "0.9.0"
fixedbitset = "0.1.9"
hashbrown = {version = "0.2.0", features = ["serde"] }
rayon = "1.0.3"
serde_json = "1.0.39"
fxhash = "0.2.1"
rocksdb = "0.12.2"
lru = "0.1.15"
scoped_threadpool = "0.1.9"
# tikv-client = { git = "https://github.com/tikv/client-rust.git" }
tikv-client = { git = "https://github.com/tz70s/client-rust.git", branch = "fix-raw-scan-range" }
futures-preview = { version = "0.3.0-alpha.17", features = ["compat"] }
walkdir = "2.0.0"
tokio = "=0.2.13"
#protobuf = { version = "2.0"}

#
#[dependencies.hdfs]
#git="https://github.com/UNSW-database/hdfs-rs.git"
#default-features = false
#optional = true

[dev-dependencies]
tempfile = "3.0.4"
pbr = "1.0.1"
clap = "2.32.0"
criterion = "0.2"
tempdir = "0.3.7"

[[example]]
name = "ldbc_to_graphmap"
required-features = ["ldbc"]
#[patch.crates-io]
#hdfs = {}

[build-dependencies]
protoc-rust = "2.0"

[patch.crates-io]
raft-proto = { git = "https://github.com/tikv/raft-rs", rev = "e624c1d48460940a40d8aa69b5329460d9af87dd" }
36 changes: 18 additions & 18 deletions LICENSE
@@ -1,18 +1,18 @@
Copyright (c) 2018 UNSW Sydney, Data and Knowledge Group.

Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
117 changes: 116 additions & 1 deletion README.md
@@ -1,3 +1,118 @@
# rust\_graph\_lib

A graph library written in Rust. Note that you need to install cmake, g++, clang, and golang on your machine first, and use Rust nightly to build this project.

## Setup for hdfs reading support

### 0. Explanation of the `build` and `running` stages of `hdfs_lib`
Reading files from `hdfs` is based on the [`hdfs-rs`](https://github.com/hyunsik/hdfs-rs) library. Because that library has not been updated for a few years, we fixed some bugs in its source code and vendored the result in `src/io/hdfs/hdfs_lib`. The project treats it as a local crate (as `Cargo.toml` shows: `hdfs = { path = "src/io/hdfs/hdfs_lib" }`).
* In the library, we call the `libhdfs` C APIs [(docs here)](http://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-hdfs/LibHdfs.html) (provided by Hadoop) to implement the read functions, and the library encapsulates those C APIs.
* The path `hdfs_lib/src/native` contains a static library (`libhdfs.a`) and a shared object (`libhdfs.so`) for calling the C APIs from Rust. This guarantees that the project compiles successfully even without a Hadoop environment.
* The build script `hdfs_lib/build.rs` [(docs here)](https://doc.rust-lang.org/cargo/reference/build-scripts.html#outputs-of-the-build-script) emits the `rustc-link-search` directive so the compiler can find the static library and shared object (see the sketch after this list).
* At runtime, calling the `libhdfs` C APIs uses `libhdfs.so`, `libjvm.so`, the Java environment, and the Hadoop jars. So please make sure you have completed the following steps if you want to use the `hdfs` functions.
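
A minimal `build.rs` sketch of that linking setup (the directory layout here is an assumption; the actual `hdfs_lib/build.rs` may differ):

```rust
// build.rs (sketch): tell rustc where to find the bundled libhdfs artifacts.
fn main() {
    let manifest_dir = std::env::var("CARGO_MANIFEST_DIR").unwrap();
    // Search hdfs_lib/src/native for libhdfs.a / libhdfs.so.
    println!("cargo:rustc-link-search=native={}/src/native", manifest_dir);
    // Link against libhdfs resolved from that search path.
    println!("cargo:rustc-link-lib=hdfs");
}
```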

### 1. Download Hadoop and set environment variables
#### 1.1 Requirements:
* Hadoop version >= [2.6.5](http://mirror.bit.edu.cn/apache/hadoop/common/hadoop-2.6.5/)
* Java >=1.8
* Linux Environment
* At runtime, the `libhdfs` C APIs additionally require:
  * Check that the Hadoop you installed contains `libhdfs.so` under `$HADOOP_HOME/lib/native/` (pre-built Hadoop distributions contain it by default).
  * Check that the Java you installed contains `libjvm.so` under `$JAVA_HOME/jre/lib/amd64/server/`.

#### 1.2 Environment variables:
Edit your shell environment as follows:
```
vim ~/.bashrc

# Change the paths to your own and append the following to the file
export JAVA_HOME=/{YOUR_JAVA_INSTALLED_PATH}
export JRE_HOME=${JAVA_HOME}/jre
export HADOOP_HOME=/{YOUR_HADOOP_INSTALLED_PATH}
export LD_LIBRARY_PATH=${HADOOP_HOME}/lib/native:${JAVA_HOME}/jre/lib/amd64/server # for libhdfs.so linking
CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
CLASSPATH=${CLASSPATH}":"`find ${HADOOP_HOME}/share/hadoop | awk '{path=path":"$0}END{print path}'` # hadoop's jars
export CLASSPATH
export PATH=${JAVA_HOME}/bin:$HOME/.cargo/bin:$HADOOP_HOME/sbin:$HADOOP_HOME/bin:$PATH

# reload the environment variables in the current shell session
source ~/.bashrc
```

### 2. Configuring a pseudo-distributed Hadoop/HDFS environment
Of course, you can also build a real cluster yourself; all the code needs is the `hdfs path` and `port`.

First, enter the configuration directory: `cd $HADOOP_HOME/etc/hadoop`

#### 2.1 Configure `core-site.xml`
```xml
<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/usr/local/hadoop/tmp</value>
        <description>A base for other temporary directories.</description>
    </property>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
    <property>
        <name>hadoop.http.staticuser.user</name>
        <value>cy</value>
    </property>
</configuration>
```

#### 2.2 Configure `hdfs-site.xml`
```xml
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/usr/local/hadoop/tmp/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/usr/local/hadoop/tmp/dfs/data</value>
    </property>
    <property>
        <name>dfs.permissions.enabled</name>
        <value>false</value>
    </property>
</configuration>
```

#### 2.3 Configure `hadoop-env.sh` (optional; only needed if `JAVA_HOME` cannot be found when starting HDFS)
```
export JAVA_HOME=/{YOUR_JAVA_INSTALLED_PATH}
```
### 3. Starting HDFS and checking its status
* Start HDFS: `$HADOOP_HOME/sbin/start-dfs.sh`
You will see output like the following if it succeeds:
```
Starting namenodes on [localhost]
Starting datanodes
Starting secondary namenodes [{MACHINE-NAME}]
```
* You can use the `jps` command to verify the HDFS status.
```
jps
17248 Jps
16609 DataNode
16482 NameNode
4198 Main
16847 SecondaryNameNode
```
* Now you can open a browser and visit the HDFS web page at
`http://localhost:9870/dfshealth.html#tab-overview`
The port may differ between Hadoop versions; please check the Hadoop documentation.

### 4. Testing and using hdfs support
For now, you can use the `hdfs support` in this library to read from a local pseudo HDFS cluster (or a real HDFS cluster).
* To avoid test failures in this library, the tests for `hdfs support` are marked as `ignore`.
So, if you want to run them, use the following command to test the `hdfs support` independently:
`cargo test -- --ignored`
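
For reference, this is how such an ignored test is marked; the module, test name, and body below are placeholders rather than the library's actual tests:

```rust
// Sketch only: hdfs-dependent tests carry #[ignore] so a plain `cargo test`
// skips them, while `cargo test -- --ignored` runs them.
#[cfg(test)]
mod hdfs_tests {
    #[test]
    #[ignore] // requires a running HDFS cluster, e.g. hdfs://localhost:9000
    fn read_csv_from_hdfs() {
        // ... construct the HDFS reader and read a CSV file here ...
    }
}
```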
92 changes: 92 additions & 0 deletions bench-tikv-rocksdb.md
@@ -0,0 +1,92 @@
# Benchmark TiKV and Rocksdb

## 1. Benchmark TiKV and Rocksdb on a single machine
I have deployed two pd-servers and each pd-server manages one tikv-server.

(1) The following tests each run 100 operations, and we record the average time per operation (a timing sketch is shown below).
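
As a rough illustration of the methodology (this is not the project's actual benchmark code), the per-operation average can be computed with a small timing helper; the closure stands in for a TiKV or Rocksdb property operation:

```rust
use std::time::Instant;

// Sketch only: run `op` `n` times and return the average latency in milliseconds.
fn average_latency_ms<F: FnMut()>(n: u32, mut op: F) -> f64 {
    let start = Instant::now();
    for _ in 0..n {
        op();
    }
    start.elapsed().as_secs_f64() * 1000.0 / f64::from(n)
}

// e.g. average_latency_ms(100, || { /* insert one raw node/edge property */ });
```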
### Insert raw node/edge property operation
| TiKV | Rocksdb |
|------------ |---------------|
| 26~34ms | 55~105ms |

### Extend one raw node/edge property operation
| TiKV | Rocksdb |
|--------------|-----------------|
| 0.24~0.34ms | 0.4~1.3ms |

### Get node/edge property(all) operation
Note that this is not a fair comparison: the Rocksdb `get` operation in `rust_graph_lib` simply fetches the k-v pair from memory (it reads all k-v pairs into memory when connecting to the database, which makes it more like TiKV's `batch_get`), while TiKV needs to contact the pd-server and read the data from disk into memory before returning it. With the connection time and the time to read all k-v pairs into memory counted in, a single get operation on Rocksdb actually takes around `105ms`.

| TiKV | Rocksdb |
|--------|--------------|
| 3~4ms | 0.03~0.06ms |

(2) The following `batch_get` test fetches 1000 keys in one batch, and we record the average time per key.
### Batch get node/edge property(all) operation
| TiKV |
|----------|
| 0.008ms |

## 2. Benchmark TiKV on a cluster
I have deployed two pd-servers, and each pd-server manages four tikv-servers (in total there are two pd-servers and eight tikv-servers, all on different machines).

### Insert raw node/edge property operation
| TiKV |
|------------ |
| 90ms |

### Extend one raw node/edge property operation
| TiKV |
|------------|
| 0.6~0.9ms |

### Get node/edge property(all) operation
| TiKV |
|--------|
| 4~5ms |

### Batch get node/edge property(all) operation
| TiKV |
|-------------------|
| 0.008ms ~ 0.01ms |

(Batch-getting 1000 node/edge properties takes 0.008s ~ 0.01s in total, i.e. 0.008ms ~ 0.01ms per key.)

## 3. `Batch` operations performance comparing between TiKV on a cluster and RocksDB
I have deployed three pd-servers, and each pd-server manages three tikv-servers (in total there are three pd-servers and six tikv-servers, all on different machines). One additional server runs the test program.

### Batch put operation on DG10
1. current_thread::Runtime

|Batch Size|TiKV|RocksDB|
|---|---|---|
|100|610.306s|873.912s|
|500|450.900s|774.325s|
|1000|531.744s|755.248s|
|10000|1739.592s|751.675s|

2. tokio::Runtime(ThreadPool)

|Batch Size|TiKV|RocksDB|
|---|---|---|
|100|357.481s|776.099s|
|500|427.291s|873.144s|
|1000|528.148s|786.991s|
|10000|2039.108s|784.073s|

### Batch put operation on DG60
|Batch Size|TiKV|RocksDB|
|---|---|---|
|100|2525.971s|13516.203s|
|500|2666.998s|8638.527s|
|1000|2654.622s|8247.292s|
|10000|4054.167s|8104.632s|

### Batch get node/edge property(all) operation
|Batch Size|TiKV|RocksDB|
|---|---|---|
|100|0.078s ~ 0.079s|0.387s ~ 0.425s|
|500|0.382s ~ 0.387s|1.848s ~ 1.938s|
|1000|0.468s ~ 0.476s|3.869s ~ 3.871s|
(RocksDB uses a `while` loop over single `get` calls to simulate `batch_get`; a minimal sketch follows.)
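
A minimal sketch of that simulation, assuming the `rocksdb` crate's `DB::get` API; key layout and error handling are simplified:

```rust
use rocksdb::DB;

// Sketch only: emulate `batch_get` on RocksDB by looping over single `get` calls,
// which is what the RocksDB column above measures.
fn simulated_batch_get(db: &DB, keys: &[Vec<u8>]) -> Vec<Option<Vec<u8>>> {
    let mut results = Vec::with_capacity(keys.len());
    let mut i = 0;
    while i < keys.len() {
        let value = match db.get(&keys[i]) {
            Ok(Some(v)) => Some(v.to_vec()),
            _ => None, // missing key or read error: treated as absent in this sketch
        };
        results.push(value);
        i += 1;
    }
    results
}
```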