Skip to content

A paper describing the optimizer for SQL/PGQ, which utilizes graph optimization for improving sql queries

Notifications You must be signed in to change notification settings

GraphScope/relgo

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

32 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Guidance

Step 1: Environment

Install Java 11, Protobuf, and GOpt.

for MacOS

Install java11

java <= 11.0.19 will cause unexpected errors, java==11.0.27 is recommended
e.g., brew install openjdk@11

nano ~/.zshrc
export JAVA_HOME="java/home/path"
export PATH="/java/home/path/to/bin:$PATH"

source ~/.zshrc

Install Protobuf

Download source code from https://github.com/protocolbuffers/protobuf/releases?page=9.
// e.g., Protocol Buffers v21.3

cd protobuf-21.3
./autogen.sh
./configure --host=arm64-apple-darwin --prefix=$HOME/protobuf
make -j4
make install

nano ~/.zshrc

// add protobuf to $PATH and $CMAKE_PREFIX_PATH
export PATH="$HOME/protobuf/bin:$PATH"
export CMAKE_PREFIX_PATH="$HOME/protobuf/include:$CMAKE_PREFIX_PATH"
export CMAKE_PREFIX_PATH="$HOME/protobuf/bin:$CMAKE_PREFIX_PATH"
export CMAKE_PREFIX_PATH="$HOME/protobuf/lib:$CMAKE_PREFIX_PATH"

source ~/.zshrc

Install GOpt

git clone -b ir_fix_type_infer https://github.com/shirly121/GraphScope.git
cd GraphScope/interactive_engine
mvn clean package -DskipTests -Pgraph-planner-jni
cd assembly/target
tar xvzf graph-planner-jni.tar.gz

replace the folder `resource/graph-planner-jni` with the obtained folder

for Linux

Install java11

java>=11.0.25 is recommended

Install Protobuf

protobuf v3.21.12 is recommended

Install GOpt

Same as for Mac

Step 2: Download the Repo

git clone https://github.com/GraphScope/relgo.git --recurse-submodules
cd relgo-duckdb

Step 3: Apply Patch to DuckDB

cd duckdb
git apply ../patch/0001-graph_shell_linux.patch // for linux
git apply ../patch/0001-graph_shell.patch   // for others

Step 4: Compile the Code

cd duckdb
mkdir build
cd build
cmake ..
make

Step 5: Run Shell

Copy the graph-planner-jni folder generated by GOpt to the resource folder. Then, the executable should be under the path relgo-duckdb/resource/graph-planner-jni/bin/.

cd duckdb/build
./duckdb -unsigned

You can enter DuckDB query statements to perform personalized database construction and querying, or you can directly execute our pre-prepared scripts for initial testing.

Example 1: Entering DuckDB query statements

set threads to 1;

create table person (
    p_personid bigint not null PRIMARY KEY,
    p_firstname varchar not null,
    p_lastname varchar not null,
    p_gender varchar not null,
    p_birthday varchar not null,
    p_creationdate bigint not null,
    p_locationip varchar not null,
    p_browserused varchar not null
 );

 create table knows (
    k_person1id bigint not null,
    k_person2id bigint not null,
    k_creationdate bigint not null,
    FOREIGN KEY (k_person1id) REFERENCES person(p_personid),
    FOREIGN KEY (k_person2id) REFERENCES person(p_personid)
);

COPY person            FROM '../../resource/graph/sf01_merge/dynamic/person_0_0.csv'                      (DELIMITER '|', HEADER);

COPY knows (k_person1id, k_person2id, k_creationdate) FROM '../../resource/graph/sf01_merge/dynamic/person_knows_person_0_0.csv' (DELIMITER '|', HEADER);

COPY knows (k_person2id, k_person1id, k_creationdate) FROM '../../resource/graph/sf01_merge/dynamic/person_knows_person_0_0.csv' (DELIMITER '|', HEADER);

// use the path to the `.duckdb_extension` file
LOAD './extension/relgo/relgo.duckdb_extension';

CALL relgo_create_graph_index('CREATE SELF RAI knows_r ON knows (FROM k_person1id REFERENCES person.p_personid, TO k_person2id REFERENCES person.p_personid);');


CALL init_graph_stats();

CALL execute_graph_query('SELECT g.p2_id, g.p2_lastname, g.p2_birthday, g.p2_creationdate, g.p2_gender, g.p2_browserused, g.p2_locationip FROM GRAPH_TABLE (graph MATCH (p1:person)-[:knows]-(p2:person) WHERE p1.p_personid = 933L AND p2.p_firstname = "Karl" COLUMNS ( p2.p_personid as p2_id, p2.p_lastname as p2_lastname, p2.p_birthday as p2_birthday, p2.p_creationdate as p2_creationdate, p2.p_gender as p2_gender, p2.p_browserused as p2_browserused, p2.p_locationip as p2_locationip ) ) g;');

CALL explain_graph_query('SELECT g.p2_id, g.p2_lastname, g.p2_birthday, g.p2_creationdate, g.p2_gender, g.p2_browserused, g.p2_locationip FROM GRAPH_TABLE (graph MATCH (p1:person)-[:knows]-(p2:person) WHERE p1.p_personid = 933L AND p2.p_firstname = "Karl" COLUMNS ( p2.p_personid as p2_id, p2.p_lastname as p2_lastname, p2.p_birthday as p2_birthday, p2.p_creationdate as p2_creationdate, p2.p_gender as p2_gender, p2.p_browserused as p2_browserused, p2.p_locationip as p2_locationip ) ) g;');

Example 2: Use the pre-prepared scripts

.read ../../sqls/init.sql

CALL execute_graph_query('SELECT g.p_personid, g.p_firstname, g.p_lastname FROM GRAPH_TABLE (graph MATCH (p1:person)-[:knows]-(f:person)<-[:comment_hascreator]-(c:comment)-[:comment_replyof_post]->(:post)-[:post_tag]->(t:tag)-[:tag_hastype]->(tc1:tagclass)-[:issubclassof]->(tc2:tagclass) WHERE p1.p_personid = 28587302323430L AND tc2.tc_name = "Agent" COLUMNS (f.p_personid AS p_personid, f.p_firstname AS p_firstname, f.p_lastname AS p_lastname )) g;');


CALL explain_graph_query('SELECT g.p2_id, g.p2_lastname, g.p2_birthday, g.p2_creationdate, g.p2_gender, g.p2_browserused, g.p2_locationip FROM GRAPH_TABLE (graph MATCH (p1:person)-[:knows]-(p2:person) WHERE p1.p_personid = 933L AND p2.p_firstname = "Karl" COLUMNS ( p2.p_personid as p2_id, p2.p_lastname as p2_lastname, p2.p_birthday as p2_birthday, p2.p_creationdate as p2_creationdate, p2.p_gender as p2_gender, p2.p_browserused as p2_browserused, p2.p_locationip as p2_locationip ) ) g;');

Please note that in the current version, only when a graph index is build on a table (either as a vertex or an edge), the optimizer can recognize the role that the table plays (is a vertex/edge in the graph). Therefore, a table is used in function execute_graph_query only if it has been used in function relgo_create_graph_index.

About

A paper describing the optimizer for SQL/PGQ, which utilizes graph optimization for improving sql queries

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 2

  •  
  •  

Languages