Reinforcement Learning • Task-Aware KG Construction • GraphRAG
Shifting from building "good" graphs to building demonstrably "useful" ones
The effectiveness of Graph Retrieval-Augmented Generation (GraphRAG) is often hindered by a fundamental disconnect: the Knowledge Graph (KG) construction process is decoupled from the downstream task it's meant to serve. AutoGraph-R1 is the first framework to bridge this gap by framing KG construction as a Reinforcement Learning (RL) problem. An LLM "constructor" agent is trained with rewards based on the generated graph's functional utility in a live RAG pipeline, directly optimizing for task performance.
- RL-Optimized KG Construction: Trains an LLM to build graphs that are verifiably useful for a downstream RAG task.
- Task-Aware Reward Functions: Includes two novel reward functions to optimize graphs as either direct knowledge carriers or as powerful knowledge indices.
- Two-Stage Pipeline: A clear separation between the graph constructor training stage and the inference/benchmarking stage.
- Reproducible Benchmarking: Provides scripts to reproduce our results and evaluate custom-built knowledge graphs on multiple QA benchmarks.
This guide covers the environment setup for both the training and inference stages. All packages should be installed in the same environment.
The training and inference stages require a system with an NVIDIA GPU and a compatible CUDA toolkit.
- Install CUDA: Install the appropriate CUDA and cuDNN version for your GPU.
- Refer to the NVIDIA CUDA Toolkit documentation for official installation instructions (we used CUDA 12.6 with VeRL).
- Verify Installation: Check your CUDA version by running:
nvcc --version
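If you prefer to run this check from a setup script, the following is a minimal sketch that runs the same command and extracts the release number. The helper names `parse_nvcc_release` and `installed_cuda_release` are illustrative and not part of the repository; the sketch assumes the usual `release X.Y` wording in the nvcc output.

```python
import re
import shutil
import subprocess
from typing import Optional


def parse_nvcc_release(output: str) -> Optional[str]:
    """Extract the 'release X.Y' version from `nvcc --version` output."""
    match = re.search(r"release\s+(\d+\.\d+)", output)
    return match.group(1) if match else None


def installed_cuda_release() -> Optional[str]:
    """Return the CUDA toolkit release reported by nvcc, or None if nvcc is absent."""
    if shutil.which("nvcc") is None:
        return None
    result = subprocess.run(
        ["nvcc", "--version"], capture_output=True, text=True, check=False
    )
    return parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    release = installed_cuda_release()
    print(f"CUDA toolkit release: {release or 'nvcc not found'}")
```

A result other than the release you intended to install is a hint to pick a different PyTorch wheel index in the next step.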
Install the core libraries for deep learning and the RL agent loop.
- PyTorch and Transformers: Ensure compatibility with your CUDA version. Our code was tested with:
  - PyTorch: v2.7.1+cu126 (refer to previous versions for your specific CUDA build)
  - Transformers: v4.53.3
# Example for CUDA 12.6 - adjust for your system
pip install torch==2.7.1 torchvision==0.22.1 torchaudio==2.7.1 --index-url https://download.pytorch.org/whl/cu126
pip install transformers==4.53.3
- VeRL (for the RL agent loop): Our modifications are based on v0.5.0.dev0.
  - Install VeRL by following the official VeRL installation guide.
  - Note: A detailed agent loop setup tutorial using VeRL is available here (in Chinese).
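To confirm that the environment matches the tested PyTorch and Transformers versions listed above, a small check along these lines may help. It is a sketch, not part of the repository: `TESTED_VERSIONS`, `base_version`, and `find_mismatches` are hypothetical names, and the comparison ignores local build suffixes such as `+cu126`.

```python
from importlib import metadata
from typing import Callable, Dict

# Versions the README reports testing against.
TESTED_VERSIONS = {"torch": "2.7.1", "transformers": "4.53.3"}


def base_version(version: str) -> str:
    """Drop a local build suffix, so '2.7.1+cu126' compares as '2.7.1'."""
    return version.split("+", 1)[0]


def find_mismatches(
    expected: Dict[str, str],
    get_version: Callable[[str], str] = metadata.version,
) -> Dict[str, str]:
    """Map each package that is missing or differs to a short description."""
    problems = {}
    for package, wanted in expected.items():
        try:
            found = base_version(get_version(package))
        except metadata.PackageNotFoundError:
            problems[package] = "not installed"
            continue
        if found != wanted:
            problems[package] = f"found {found}, tested with {wanted}"
    return problems


if __name__ == "__main__":
    for package, problem in find_mismatches(TESTED_VERSIONS).items():
        print(f"{package}: {problem}")
```

A mismatch is not necessarily an error; it only indicates that you are outside the combination the code was tested with.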
For the inference stage, an additional package is required for the KG creation pipeline.
- Atlas-RAG: We use the v0.0.5 branch of atlas-rag. Install it in the same environment:
git clone -b release/v0.0.5 https://github.com/HKUST-KnowComp/AutoSchemaKG.git
cd AutoSchemaKG
pip install -e .
The training scripts require the musique_hotpotqa_graph_retriever and musique_hotpotqa_graph_text_retriever datasets. We provide a script to download them from the Hugging Face Hub.
- Run the download script:
DATASET="gzone0111/musique_hotpotqa_graph_retriever"
python scripts/download_dataset.py --repo_id $DATASET --output_path ./data
This will download the train and validation splits and save them as train.parquet and validation.parquet in the ./data directory. Ensure the paths in the training scripts point to these files.
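Before starting a long training run, it can be worth confirming that both files landed where the training scripts expect them. The following is a minimal sketch under the file names stated above; `missing_dataset_files` is a hypothetical helper, not something shipped with the repository.

```python
from pathlib import Path
from typing import List

# File names the download step is described as producing.
EXPECTED_FILES = ("train.parquet", "validation.parquet")


def missing_dataset_files(data_dir: str = "./data") -> List[str]:
    """Return the expected split files that are not present in data_dir."""
    root = Path(data_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_dataset_files()
    if missing:
        print("Missing dataset files:", ", ".join(missing))
    else:
        print("Both splits found in ./data")
```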
The AutoGraph-R1 pipeline consists of a training stage and an inference stage.
Before running any script, you must configure the API endpoints for your language models. These models will be served using vLLM.
Edit the config.ini file in autograph/rag_server/ to match the ports you will use to serve your models. The defaults align with our provided scripts.
[vllm]
URL = http://0.0.0.0:8129/v1
KEY = EMPTY
[vllm_emb]
URL = http://0.0.0.0:8128/v1
KEY = EMPTY
This stage uses RL to fine-tune an LLM to build effective knowledge graphs.
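To sanity-check the config.ini endpoints before launching any server, the file can be parsed with Python's standard `configparser`. This is a sketch only: it embeds the default values shown above as `SAMPLE_CONFIG`, and `load_endpoints` is a hypothetical helper; how the repository itself reads the file is not shown here.

```python
import configparser
from urllib.parse import urlsplit

# Default values from config.ini, embedded so the sketch runs on its own.
SAMPLE_CONFIG = """
[vllm]
URL = http://0.0.0.0:8129/v1
KEY = EMPTY
[vllm_emb]
URL = http://0.0.0.0:8128/v1
KEY = EMPTY
"""


def load_endpoints(config_text: str) -> dict:
    """Parse each section into its URL, the port taken from that URL, and the key."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    endpoints = {}
    for section in parser.sections():
        url = parser[section]["URL"]
        endpoints[section] = {
            "url": url,
            "port": urlsplit(url).port,
            "key": parser[section]["KEY"],
        }
    return endpoints


if __name__ == "__main__":
    for name, info in load_endpoints(SAMPLE_CONFIG).items():
        print(f"{name}: port {info['port']} ({info['url']})")
```

The ports printed here should match the ports used by the serving scripts you launch in the next step.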
Hardware Note: The following scripts are configured for 2xH100 GPUs. You may need to adjust gpu_memory_utilization, trainer.n_gpus_per_node, etc. in the scripts and the CUDA_VISIBLE_DEVICES environment variable for your specific hardware.
1. Launch the LLM API Servers
First, launch the language models that will act as the environment (generator) and the embedding model for the RL loop. Open two separate terminal sessions for these.
- Terminal 1: Launch Embedding Model Server:
bash scripts/vllm_serve/qwen3-0.6b-emb.sh
- Terminal 2: Launch Generator Model Server (needed when training the 3B agent):
bash scripts/vllm_serve/qwen2.5-7b-vllm.sh
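Model servers can take a while to load weights, so it may help to wait until both endpoints respond before starting training. The sketch below polls the `/models` route of an OpenAI-compatible server such as vLLM. The function names are hypothetical, and it assumes the servers were started without an API key requirement.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def models_endpoint_ok(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if GET {base_url}/models answers with HTTP 200."""
    url = f"{base_url.rstrip('/')}/models"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_until_ready(
    probe: Callable[[], bool], timeout: float = 300.0, interval: float = 5.0
) -> bool:
    """Call probe repeatedly until it returns True or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

For example, `wait_until_ready(lambda: models_endpoint_ok("http://0.0.0.0:8129/v1"))` would block until the generator endpoint from the default config.ini answers, or return False after five minutes.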
2. Run the Training Script
In a third terminal, run the RL training loop. Choose one of the following scripts based on the desired reward function.
- To train with the Graph Retriever reward (graph as a knowledge carrier):
# For a 3B parameter agent
bash scripts/autograph-r1/run_qwen2.5-3b_instruct_graph.sh
# For a 7B parameter agent (ensure generator server is not running)
bash scripts/autograph-r1/run_qwen2.5-7b_instruct_graph.sh
- To train with the Graph-Based Text Retriever reward (graph as a knowledge index):
# For a 3B parameter agent
bash scripts/autograph-r1/run_qwen2.5-3b_instruct_with_distract-iterative-hipporag-2.sh
# For a 7B parameter agent
bash scripts/autograph-r1/run_qwen2.5-7b-instruct_with_distract-iterative-hipporag-2.sh
Once trained, convert the checkpoint and use it to build and evaluate a knowledge graph.
1. Convert FSDP Checkpoint to Hugging Face Format
VeRL saves checkpoints in FSDP format. Convert them for easy hosting. You can follow the official VeRL tutorial or run the command below.
# Replace CHECKPOINT_PATH with the trainer.default_local_dir from your training script
# and STEP_NUM with the checkpoint step you want to convert (e.g., 50).
CHECKPOINT_PATH="path/to/your/checkpoints"
STEP_NUM="50"
python3 -m verl.model_merger merge \
--backend fsdp \
--local_dir $CHECKPOINT_PATH/global_step_$STEP_NUM/actor \
--target_dir $CHECKPOINT_PATH/global_step_$STEP_NUM/actor/huggingface
2. Host the Fine-Tuned Model with vLLM
Serve your converted Hugging Face model as an API endpoint (you can also use it with sglang).
# Adjust CHECKPOINT_PATH and STEP_NUM as needed
CHECKPOINT_PATH="path/to/your/checkpoints"
STEP_NUM="50"
CUDA_VISIBLE_DEVICES=0,1 vllm serve $CHECKPOINT_PATH/global_step_$STEP_NUM/actor/huggingface \
--host 0.0.0.0 \
--port 8111 \
--gpu-memory-utilization 0.9 \
--tensor-parallel-size 2 \
--max-model-len 16384
3. Knowledge Graph Construction
Use your fine-tuned model to extract a KG from a text corpus. Edit the script to point to your model and data.
- Arguments: Pass the model_name (the path to your fine-tuned model checkpoint) and other parameters inside the script or via the command line.
- Run the script (Example):
# Adjust the API url in the python script as needed
python benchmark/autograph/custom_kg_extraction.py --model_name $CHECKPOINT_PATH/global_step_$STEP_NUM/actor/huggingface
- Output: The constructed knowledge graph will be saved to the specified output directory.
- For argument details, please refer to the script.
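The conversion, serving, and extraction commands all refer to the same converted checkpoint directory, so a typo in CHECKPOINT_PATH or STEP_NUM tends to surface late. A small helper like the one below can build that path once and check that it looks like a Hugging Face model directory. Both function names are hypothetical, and the check assumes a converted model directory contains a config.json file.

```python
from pathlib import Path


def huggingface_dir(checkpoint_path: str, step: int) -> Path:
    """Build <checkpoint_path>/global_step_<step>/actor/huggingface."""
    return Path(checkpoint_path) / f"global_step_{step}" / "actor" / "huggingface"


def looks_converted(model_dir: Path) -> bool:
    """Assumption: a converted Hugging Face model directory holds config.json."""
    return (model_dir / "config.json").is_file()


if __name__ == "__main__":
    model_dir = huggingface_dir("path/to/your/checkpoints", 50)
    status = "ready" if looks_converted(model_dir) else "not converted yet"
    print(f"{model_dir}: {status}")
```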
4. RAG Benchmarking
Evaluate the performance of the generated KG using our benchmarking scripts. Ensure the model endpoints and KG paths in the scripts are set correctly; the KG paths are set through model_name.
(To serve the embedding and reader models, you can run the scripts in benchmark/vllm_serve.)
- Method 1: Graph Retriever Benchmark:
python benchmark/autograph/benchmarking_graph.py --model_name $CHECKPOINT_PATH/global_step_$STEP_NUM/actor/huggingface
- Method 2: Graph-Based Text Retriever Benchmark:
python benchmark/autograph/benchmarking_text.py --model_name $CHECKPOINT_PATH/global_step_$STEP_NUM/actor/huggingface
If you use AutoGraph-R1 in your research, please cite our paper:
@misc{tsang2025autographr1endtoendreinforcementlearning,
title={AutoGraph-R1: End-to-End Reinforcement Learning for Knowledge Graph Construction},
author={Hong Ting Tsang and Jiaxin Bai and Haoyu Huang and Qiao Xiao and Tianshi Zheng and Baixuan Xu and Shujie Liu and Yangqiu Song},
year={2025},
eprint={2510.15339},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2510.15339},
}
Hong Ting TSANG (Dennis) (httsangaj@connect.ust.hk)
