Codes and Dataset for the Paper "Large Language Models Can Solve Real-World Planning Rigorously with Formal Verification Tools".
-
Create a conda environment and install dependency:
conda create -n fmtravelplanner python=3.9 conda activate fmtravelplanner pip install -r requirements.txt
-
The UnsatChristmas dataset is provided in
database_small
folder. You can run interactive plan repair experiment with this database. -
To run satisfiable plan generation experiment, refer to paper "TravelPlanner: A Benchmark for Real-World Planning with Language Agents" and their github repo to download their database and train/validation/test set.
The file for satisfiable plan generation experiment is test_travelplanner.py
. An example command is python test_travelplanner.py --set_type train --model_name gpt
Note: You might want to use the training set to adjust the prompts for different LLMs. You can add customized checker for steps and codes to further improve the performance.
- Run test_travelplanner_interactive.py for unsatisfiable interactive plan repair experiment for TravelPlanner. Follow the instructions in file to first collect initial codes and then do the plan repair.
- Run test_unsat.py for unsatisfiable interactive plan repair experiment for UnsatChristmas. Follow the instructions in file to first collect initial codes and then do the plan repair.
The prompts we used are included in the prompts
folder