This repo provides PolyBench and the PolyG implementation from the paper "PolyG: Adaptive Graph Traversal for Diverse GraphRAG Questions".
- Create a new env with python 3.12:
conda create -n polyg python=3.12
- Install all dependencies:
pip install -r requirements.txt
- Install Neo4j 2025.03.0:
Refer to Neo4j Installation.
- Install PolyG (from the root directory of this repo):
pip install -e .
- Install the dataset:
For the dataset (knowledge graphs), please refer to RGBench.
Store the knowledge graphs in the datasets directory (at the root directory of this repo).
- Convert the graphs into desired formats:
Go to the preprocess_dataset directory and run:
python preprocess_graph.py --path dataset/physics
python preprocess_graph.py --path dataset/goodreads
python preprocess_graph.py --path dataset/amazon
- Import the data to Neo4j:
In the preprocess_dataset directory, run:
bash neo4j_bulk_insert.sh
- PolyBench:
PolyBench, our proposed benchmark, is available in the benchmarks directory.
- Run a toy example (on the physics graph):
In the examples directory, run:
python example.py --model openai/gpt-4o --data_dir ../datasets/physics
- Run end-to-end evaluation on PolyBench:
In the examples directory, run:
python experiment.py --model openai/gpt-4o --data_dir ../datasets/physics --benchmark_dir ../benchmarks/physics
python experiment.py --model openai/gpt-4o --data_dir ../datasets/goodreads --benchmark_dir ../benchmarks/goodreads
python experiment.py --model openai/gpt-4o --data_dir ../datasets/amazon --benchmark_dir ../benchmarks/amazon
Results will be stored in examples/results/[graph name]/[model name]/results.jsonl.
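The three experiment.py invocations differ only in the graph name, so they can be generated in a loop. The following is a minimal sketch, not part of this repo: the helper names `build_commands` and `run_all` are hypothetical, and it assumes it is launched from the examples directory with the same flags as the commands above. By default it only prints the commands (dry run).

```python
import subprocess
import sys

# Graph names taken from the commands in this README.
GRAPHS = ["physics", "goodreads", "amazon"]


def build_commands(model="openai/gpt-4o", graphs=GRAPHS):
    """Return one experiment.py command (as an argv list) per graph.

    Hypothetical helper: it only mirrors the flags shown in this README.
    """
    return [
        [
            sys.executable, "experiment.py",
            "--model", model,
            "--data_dir", f"../datasets/{graph}",
            "--benchmark_dir", f"../benchmarks/{graph}",
        ]
        for graph in graphs
    ]


def run_all(model="openai/gpt-4o", dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in build_commands(model):
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the loop on the first failing run.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all(dry_run=True)
```

Calling `run_all(dry_run=False)` would execute the runs sequentially and stop at the first failure.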
To evaluate the results, change to the examples/evaluation directory and run:
# for win rates
python judge_by_llm.py --model openai/gpt-4o --dataset physics
python judge_by_llm.py --model openai/gpt-4o --dataset goodreads
python judge_by_llm.py --model openai/gpt-4o --dataset amazon
# for F1-score, Precision, Recall, Accuracy and Hit
python compute_f1_hit.py --model openai/gpt-4o --dataset physics
python compute_f1_hit.py --model openai/gpt-4o --dataset goodreads
python compute_f1_hit.py --model openai/gpt-4o --dataset amazon
Results will be saved in examples/results/[graph name]/[model name]/judegments.jsonl and examples/results/[graph name]/[model name]/detailed_evaluation.jsonl.
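All output files are in JSON Lines format (one JSON value per line), so they can be inspected with a few lines of Python. The sketch below is illustrative only: `load_jsonl` and `results_path` are hypothetical helpers, the record schema is not assumed, and the `model` argument must be the directory name as it actually appears on disk under examples/results (how a model ID such as openai/gpt-4o maps to that directory is not specified here).

```python
import json
from pathlib import Path


def load_jsonl(path):
    """Parse a JSON Lines file into a list of objects, skipping blank lines."""
    records = []
    with Path(path).open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


def results_path(graph, model, filename="results.jsonl", root="examples/results"):
    """Build <root>/<graph>/<model>/<filename>, following the layout described above.

    Hypothetical helper: `model` is the on-disk directory name, which is an
    assumption the caller has to verify.
    """
    return Path(root) / graph / model / filename


if __name__ == "__main__":
    # Example directory names; adjust to what exists on disk.
    path = results_path("physics", "gpt-4o")
    if path.exists():
        print(f"{path}: {len(load_jsonl(path))} records")
    else:
        print(f"{path} not found")
```

Passing `filename="detailed_evaluation.jsonl"` reads the per-question evaluation output in the same way.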