Commit a320bb9

jeffreylovitz authored and kereno committed

Add RedisGraph runner to benchmarks

1 parent 0e4d51e

10 files changed (+547, -0 lines)

benchmark/redisgraph/README

Lines changed: 96 additions & 0 deletions
############################################################
# Copyright (c) 2015-now, TigerGraph Inc.
# All rights reserved.
# It is provided as-is for benchmark reproducibility purposes.
# Anyone may use it for benchmarking purposes with
# acknowledgement to TigerGraph.
# Author: Mingxi Wu mingxi.wu@tigergraph.com
############################################################

This article documents how to reproduce the graph database benchmark results on RedisGraph.

Data Sets
===========
- graph500 edge file: http://service.tigergraph.com/download/benchmark/dataset/graph500-22/graph500-22
- graph500 vertex file: http://service.tigergraph.com/download/benchmark/dataset/graph500-22/graph500-22_unique_node

- twitter edge file: http://service.tigergraph.com/download/benchmark/dataset/twitter/twitter_rv.tar.gz
- twitter vertex file: http://service.tigergraph.com/download/benchmark/dataset/twitter/twitter_rv.net_unique_node

Hardware & Major environment
================================
- Amazon EC2 machine r4.8xlarge
- 32 vCPUs
- 244 GiB memory
- attached a 250G EBS-optimized Provisioned IOPS SSD (IO1), provisioned at 50 IOPS/GiB
- OS Ubuntu 18.04.1 LTS
- Install the required Python modules with the following commands:

  $ sudo apt-get update
  $ sudo apt-get install build-essential cmake python-pip python-dev
  $ sudo pip install --upgrade pip
  $ sudo pip install redis click requests config

RedisGraph Version
==================
Redis version 5.0.3
RedisGraph module v1.0.0

Install Redis and RedisGraph
============================
git clone https://github.com/antirez/redis.git
cd redis
make
sudo apt-get install tcl
make test

git clone https://github.com/RedisLabsModules/RedisGraph.git
cd RedisGraph
git checkout v1.0.0
make

Copy benchmark files
====================
git clone https://github.com/RedisGraph/graph-database-benchmark.git

Launching Redis
===============
# Start the server.
~/redis/src/redis-server --loadmodule ~/RedisGraph/src/redisgraph.so &

# To stop the server:
# redis-cli shutdown

Loading data
============
nohup ./redisgraph_load_graph500.sh path/to/redisgraph path/to/graph500/data
nohup ./redisgraph_load_twitter.sh path/to/redisgraph path/to/twitter/data

Example: ./redisgraph_load_graph500.sh ~/RedisGraph/ .

Run K-hop neighborhood count benchmark
======================================
# Change the graph500-22-seed and twitter_rv.net-seed paths to your seed file paths.
# Results will be stored in the "result_redisgraph" output directory.

Graph500
-----------------
# 300 seeds, depth 1
nohup python kn.py -g graph500 -s graph500-22-seed -c 300 -d 6 -p redisgraph -l graph500-22_unique_node -t 22 -i 1
# 300 seeds, depth 2
nohup python kn.py -g graph500 -s graph500-22-seed -c 300 -d 6 -p redisgraph -l graph500-22_unique_node -t 22 -i 2
# 10 seeds, depth 3
nohup python kn.py -g graph500 -s graph500-22-seed -c 10 -d 6 -p redisgraph -l graph500-22_unique_node -t 22 -i 3
# 10 seeds, depth 6
nohup python kn.py -g graph500 -s graph500-22-seed -c 10 -d 6 -p redisgraph -l graph500-22_unique_node -t 22 -i 6

Twitter
-------------
# 300 seeds, depth 1
nohup python kn.py -g twitter_rv -s twitter_rv_net-seed -c 300 -d 6 -p redisgraph -l twitter_rv_net_unique_node -t 22 -i 1
# 300 seeds, depth 2
nohup python kn.py -g twitter_rv -s twitter_rv_net-seed -c 300 -d 6 -p redisgraph -l twitter_rv_net_unique_node -t 22 -i 2
# 10 seeds, depth 3
nohup python kn.py -g twitter_rv -s twitter_rv_net-seed -c 10 -d 6 -p redisgraph -l twitter_rv_net_unique_node -t 22 -i 3
# 10 seeds, depth 6
nohup python kn.py -g twitter_rv -s twitter_rv_net-seed -c 10 -d 6 -p redisgraph -l twitter_rv_net_unique_node -t 22 -i 6
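
The metric being benchmarked above, a k-hop neighborhood count, can be sketched as a plain breadth-first search over an in-memory adjacency list. This is a hedged illustration of what the queries measure, not the benchmark's actual implementation (kn.py issues the equivalent queries against each database under test):

```python
from collections import deque

def k_hop_count(adj, seed, k):
    """Count distinct nodes reachable from `seed` in at most k hops (BFS)."""
    visited = {seed}
    frontier = deque([(seed, 0)])
    count = 0
    while frontier:
        node, depth = frontier.popleft()
        if depth == k:
            continue  # do not expand past the hop limit
        for nbr in adj.get(node, ()):
            if nbr not in visited:
                visited.add(nbr)
                count += 1
                frontier.append((nbr, depth + 1))
    return count

# Toy directed graph: 0 -> 1 -> 2 -> 3
adj = {0: [1], 1: [2], 2: [3]}
print(k_hop_count(adj, 0, 1))  # 1
print(k_hop_count(adj, 0, 2))  # 2
print(k_hop_count(adj, 0, 6))  # 3
```

The seed count (-c) and depth (-i) flags in the commands above control how many such traversals are run and how deep each one goes; deeper traversals touch exponentially more of the graph, which is why the depth-3 and depth-6 runs use only 10 seeds.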

benchmark/redisgraph/config.py

Lines changed: 13 additions & 0 deletions
############################################################
# Copyright (c) 2015-now, TigerGraph Inc.
# All rights reserved.
# It is provided as-is for benchmark reproducibility purposes.
# Anyone may use it for benchmarking purposes with
# acknowledgement to TigerGraph.
# Author: Mingxi Wu mingxi.wu@tigergraph.com
############################################################

import os

NEO4J_BOLT = os.environ.get("NEO4J_BOLT", "bolt://127.0.0.1:7687")
TIGERGRAPH_HTTP = os.environ.get("TIGERGRAPH_HTTP", "http://127.0.0.1:9000")
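
config.py only carries the Neo4j and TigerGraph endpoints over from the other runners. A RedisGraph entry following the same environment-variable-with-default pattern might look like the sketch below; the REDISGRAPH_HOST and REDISGRAPH_PORT names are illustrative assumptions, not part of this commit:

```python
import os

# Hypothetical RedisGraph connection settings, following the same
# environment-variable-with-default pattern used in config.py.
REDISGRAPH_HOST = os.environ.get("REDISGRAPH_HOST", "127.0.0.1")
REDISGRAPH_PORT = int(os.environ.get("REDISGRAPH_PORT", "6379"))

print(REDISGRAPH_HOST, REDISGRAPH_PORT)
```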
Lines changed: 67 additions & 0 deletions
import os
import sys

# Read the node input file and translate the input IDs into a contiguous range.
# Then, read the relation input file and translate all source and destination node IDs
# to their updated contiguous values.

# User-provided input data directory
if len(sys.argv) < 2 or not os.path.exists(sys.argv[1]):
    print("Usage: generate_inputs.py [path_to_inputs]")
    exit(1)

inputdir = sys.argv[1]

# Input filenames
nodefile = 'graph500-22_unique_node'
relfile = 'graph500-22'

# Output data directory
datadir = 'data'

# Create the output data directory if it doesn't exist
try:
    os.mkdir(datadir)
except OSError:
    pass

# Count the number of unique nodes in the data set
num_nodes = sum(1 for line in open(os.path.join(inputdir, nodefile)))

updated_id = 0

updated_node_file = open(os.path.join(datadir, nodefile), 'w')
updated_node_file.write('id\n')  # Output a header row
updated_relation_file = open(os.path.join(datadir, relfile), 'w')

# Scan the node file to find the highest node ID
max_node = -1
with open(os.path.join(inputdir, nodefile)) as f:
    for line in f:
        max_node = max(max_node, int(line))

# Map every node ID to its line number
# and generate an updated node file.
placement = [0] * (max_node + 1)
with open(os.path.join(inputdir, nodefile)) as f:
    for line in f:
        node = int(line)
        placement[node] = updated_id
        updated_id += 1
        updated_node_file.write('%d\n' % (updated_id))

with open(os.path.join(inputdir, relfile)) as f:
    for line in f:
        # Tokenize every line and convert the data to ints
        src, dst = map(int, line.split())

        # Retrieve the updated ID of each source and destination
        a = placement[src]
        b = placement[dst]

        # Output the updated edge description
        updated_relation_file.write("%d,%d\n" % (a, b))

updated_node_file.close()
updated_relation_file.close()
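
The core transformation this script performs can be exercised in memory on a toy input. The following is a hedged, self-contained sketch of the same logic (build a placement table from node-file order, then rewrite each edge), not a second copy of the script:

```python
def remap(node_ids, edges):
    """Map node IDs to contiguous values in node-file order and rewrite edges."""
    placement = {}
    for new_id, node in enumerate(node_ids):
        placement[node] = new_id
    return [(placement[s], placement[d]) for s, d in edges]

# Toy input: the node file lists IDs 10, 5, 7; the edge file references the
# original IDs, which get rewritten to their contiguous replacements.
print(remap([10, 5, 7], [(10, 5), (5, 7)]))  # [(0, 1), (1, 2)]
```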
Lines changed: 68 additions & 0 deletions
import os
import sys

# Read the node input file and translate the input IDs into a contiguous range.
# Then, read the relation input file and translate all source and destination node IDs
# to their updated contiguous values.

# User-provided input data directory
if len(sys.argv) < 2 or not os.path.exists(sys.argv[1]):
    print("Usage: generate_inputs.py [path_to_inputs]")
    exit(1)

inputdir = sys.argv[1]

# Input filenames
nodefile = 'twitter_rv.net_unique_node'
nodefile_out = 'twitter_rv_net_unique_node'
relfile = 'twitter_rv'

# Output data directory
datadir = 'data'

# Create the output data directory if it doesn't exist
try:
    os.mkdir(datadir)
except OSError:
    pass

# Count the number of unique nodes in the data set
num_nodes = sum(1 for line in open(os.path.join(inputdir, nodefile)))

updated_id = 0

updated_node_file = open(os.path.join(datadir, nodefile_out), 'w')
updated_node_file.write('id\n')  # Output a header row
updated_relation_file = open(os.path.join(datadir, relfile), 'w')

# Scan the node file to find the highest node ID
max_node = -1
with open(os.path.join(inputdir, nodefile)) as f:
    for line in f:
        max_node = max(max_node, int(line))

# Map every node ID to its line number
# and generate an updated node file.
placement = [0] * (max_node + 1)
with open(os.path.join(inputdir, nodefile)) as f:
    for line in f:
        node = int(line)
        placement[node] = updated_id
        updated_id += 1
        updated_node_file.write('%d\n' % (updated_id))

with open(os.path.join(inputdir, relfile)) as f:
    for line in f:
        # Tokenize every line and convert the data to ints
        src, dst = map(int, line.split())

        # Retrieve the updated ID of each source and destination
        a = placement[src]
        b = placement[dst]

        # Output the updated edge description
        updated_relation_file.write("%d,%d\n" % (a, b))

updated_node_file.close()
updated_relation_file.close()

benchmark/redisgraph/graph500-22-seed

Lines changed: 1 addition & 0 deletions
3600312 2677094 2038005 3301167 704219 1779962 2681401 2277366 1649130 806220 3783689 3979771 2878950 1316789 4099483 2654216 3520283 320529 460890 2861567 1676721 3582851 2025534 1897682 3042164 683461 484783 2964318 825304 2303395 3029190 2119218 341236 3921645 3350720 1382338 2497566 2293317 1365818 3108349 1039487 656628 326459 3486463 1513849 3120768 3254104 2859677 4100533 1214662 2844418 3228461 2971789 838862 3242202 231946 103480 745855 2202837 121973 2944986 3916778 1237877 2404335 3903782 3753107 2638320 3532534 3026267 149529 2522099 1565761 1345848 1059426 2994540 1629629 1481421 337894 2706001 342515 2301230 3455722 4103891 2560844 316796 3853684 2803721 2782143 4168065 1297201 2982970 1089600 3589606 1978189 514482 773765 1929789 2499474 1367644 3052548 2020748 1934532 2595851 1265635 2678981 3484689 2778764 323958 1972929 2529296 2638682 2836761 3489646 2304697 3006908 3976118 432800 3408347 3184190 2478197 3990575 3097880 259436 479595 2054949 1014166 2398658 3499821 289302 2689848 603652 2764479 3458769 2372488 3826201 610619 1502380 1417031 1291296 1699680 1816799 2952048 3747093 996609 1906969 712790 1973404 2874441 4072076 534367 2419131 3145715 1172458 2547240 579284 3952328 3217974 928922 2975442 3686619 143324 2262470 2844253 3960743 95176 2661831 289798 498881 459455 3778765 2575099 2321106 898887 1630163 3268706 25081 3747551 2048028 1377545 2178454 3666746 1692598 1809240 1461949 3878592 96570 4095479 2539031 364055 3514283 3843398 3556803 2592596 168 2336570 327991 2445956 1140337 2663510 2514997 1933620 1076164 3734798 99836 2404509 3102298 2158818 3088473 3861233 1453810 1952126 968226 594138 1059034 408333 3246311 587844 1602562 2546319 2861944 1360827 1915610 957424 1427107 433135 3353932 140407 1989222 1392471 1290284 2144691 1299024 764990 302910 4192735 3181076 1535127 263980 1571976 2271738 492328 3976408 1621372 3024237 3229179 2167063 102878 4085765 2370758 2987431 2633916 1177859 1581601 18147 697579 3491436 699069 1608362 
2570730 3929663 1304943 3733946 2216412 3013035 261001 32290 113329 1509856 2190260 3103760 3687843 1245035 3341532 857395 3942814 3982809 2807038 3291942 1840809 760204 3108890 1416278 3725922 2189358 2810970 655805 63077 3708992 2622204 1647516 1274701 2238470 83658 3800740 3659055 740181 318596 1353213 3058396 3497001
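
The seed file above is simply whitespace-separated vertex IDs. A runner selecting the first c seeds from it (an assumption about how kn.py's -c flag is consumed, shown for illustration only) might look like:

```python
def read_seeds(text, count):
    """Parse whitespace-separated seed vertex IDs and keep the first `count`."""
    seeds = [int(tok) for tok in text.split()]
    return seeds[:count]

# A small slice of the seed-file format: IDs separated by spaces and newlines.
sample = "3600312 2677094 2038005\n3301167 704219"
print(read_seeds(sample, 4))  # [3600312, 2677094, 2038005, 3301167]
```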
