This project implements a multi-agent reinforcement learning (MARL) system for path planning in dynamic environments. The system is written in C++, with Python used to visualize the results. The code is organized into classes representing the different components of the system, such as the A* algorithm, the environment, the learning algorithms, and the visualization tool. The implementation supports parallel processing of different sub-environments, as well as parallel processing of the agents within a single sub-environment, as used by the federated Q-learning algorithm.
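To give a flavor of the federated step (a minimal sketch, not the project's actual code: the function name, the flattened Q-table layout, and the synchronization details below are assumptions), equal-weight averaging in the spirit of `fedAsynQ_EqAvg` could combine the agents' local Q-tables like this:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical flattened Q-table: one row per state, one column per action.
// The project itself stores Q-values in a three-dimensional Table class.
using QTable = std::vector<std::vector<double>>;

// Equal-weight averaging in the spirit of fedAsynQ_EqAvg: every agent's
// local Q-table contributes equally to the shared global table.
QTable averageEqual(const std::vector<QTable>& agentTables) {
    QTable global = agentTables.front();  // copy the shape from the first agent
    for (std::size_t s = 0; s < global.size(); ++s) {
        for (std::size_t a = 0; a < global[s].size(); ++a) {
            double sum = 0.0;
            for (const QTable& table : agentTables) sum += table[s][a];
            global[s][a] = sum / static_cast<double>(agentTables.size());
        }
    }
    return global;  // agents continue learning from the averaged table
}
```

`fedAsynQ_ImAvg` replaces the uniform weights with importance-based weights; the exact scheme is implemented in `multiagent.(h|cpp)` and not reproduced here.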
- To install the project, either clone it directly from GitHub or download it as a zip file via the green `Code` button on the top right of the repository page. If you choose to download it as a zip file, extract the contents to your desired location.
- Navigate to the project root directory in your terminal.
```
project-root/
|
|-- plots/                       # results.csv + results_detailed.csv + plots generated by the regular experiments.
|-- plots-edge-case/             # results.csv + results_detailed.csv + plots generated by the edge case experiment.
|-- src/                         # Source code of this project.
|   |-- astar.(h|cpp)            # A* algorithm for pathfinding in the complete environment.
|   |-- constants.h              # Constant values used throughout the implementation.
|   |-- experiments.(h|cpp)      # Simulation of environment changes and the experiment setup.
|   |-- hashpair.(h|cpp)         # HashPair class, used in the A* algorithm to efficiently store and retrieve found paths.
|   |-- main.cpp                 # Calls the function to run the experiments.
|   |-- maze.(h|cpp)             # MDP (Markov Decision Process) implementation of the maze environment.
|   |-- multiagent.(h|cpp)       # Federated Q-learning implementation (fedAsynQ_EqAvg and fedAsynQ_ImAvg).
|   |-- pathstate.h              # PathState class, used when constructing paths to a charging station.
|   |-- policyvisualizer.(h|cpp) # PolicyVisualizer class, used to visualize the policies of the agents in the environment.
|   |-- singleagent.(h|cpp)      # Single-agent Q-learning implementation.
|   |-- startstats.(h|cpp)       # StartStats class, used when selecting the starting positions of the agents (prioritized replay).
|   |-- table.(h|cpp)            # Table class, used as the Q-table for the agents (three-dimensional vector).
|   |-- testpolicy.(h|cpp)       # Tests the learned policy of the agents in the environment.
|   |-- threadresult.h           # ThreadResult class, used to store the results of threads created for parallel learning of agents.
|   |-- treenode.(h|cpp)         # TreeNode class, representing a node in the hierarchical tree.
|   |-- treestrategy.(h|cpp)     # TreeStrategy class, implementing the hierarchical tree strategy and the parallel processing of tree nodes.
|   |-- visualizations.py        # Python script to visualize the results of the experiments.
|-- CMakeLists.txt               # CMake build configuration file.
|-- README.md                    # Project overview and instructions to run experiments.
|-- arial.ttf                    # Font file used in the PolicyVisualizer class to visualize policies.
```
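As a side note on `hashpair.(h|cpp)` above: `std::unordered_map` provides no default hash for `std::pair`, so a pair-hash functor is the usual way to key cached paths by grid coordinates. The sketch below illustrates the general idea only; the names and the combine function are assumptions, not the project's actual class:

```cpp
#include <cstddef>
#include <functional>
#include <unordered_map>
#include <utility>
#include <vector>

// Illustrative pair-hash in the spirit of hashpair.(h|cpp): combine the two
// element hashes so std::pair<int, int> can key an unordered_map.
struct HashPair {
    std::size_t operator()(const std::pair<int, int>& p) const {
        std::size_t h1 = std::hash<int>{}(p.first);
        std::size_t h2 = std::hash<int>{}(p.second);
        return h1 ^ (h2 + 0x9e3779b9 + (h1 << 6) + (h1 >> 2));  // boost-style combine
    }
};

// Example use: cache found paths keyed by (row, col) start cells.
using Cell = std::pair<int, int>;
using PathCache = std::unordered_map<Cell, std::vector<Cell>, HashPair>;
```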
- Ensure you have a C++ compiler installed on your system. This project has been tested with the g++ compiler, which is part of the GNU Compiler Collection (GCC). If you don't have it installed, you can install it via your package manager; on Ubuntu: `sudo apt-get install g++`.
- Ensure you have CMake installed on your system. If you are using the CLion IDE, it comes with CMake pre-installed. If you are using a different IDE or the command line, you can install CMake via your package manager; on Ubuntu: `sudo apt-get install cmake`.
- Install the SFML library, which is used to visualize the learned policies of the agents in the environment. SFML is a cross-platform multimedia library that provides a simple interface to components such as graphics, audio, and networking. You can install it via your package manager or download it from the SFML website; on Ubuntu: `sudo apt-get install libsfml-dev`. Ensure that version 2.6.1 (preferred) or higher is installed, as the project requires features from this version.
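If you are unsure which SFML version your installed headers provide, the version macros from `<SFML/Config.hpp>` can be printed with a small program (compiling needs only the headers, no linking):

```cpp
#include <SFML/Config.hpp>
#include <iostream>

// Prints the SFML version the installed headers belong to, e.g. "2.6.1".
int main() {
    std::cout << SFML_VERSION_MAJOR << '.' << SFML_VERSION_MINOR << '.'
              << SFML_VERSION_PATCH << '\n';
    return 0;
}
```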
- Install the Python dependencies for visualizing the results. The visualization script requires the following Python packages: `matplotlib` and `pandas`. You can install them by creating a virtual environment and using pip:
  - `python3 -m venv venv` # Create a virtual environment named 'venv'
  - `source venv/bin/activate` # Activate the virtual environment (Linux/Mac)
  - `pip3 install matplotlib pandas` # Install the required Python packages using pip
- In the root directory of the project, open the CLion IDE via the following command in your terminal: `clion .`
- When prompted, click "Trust Project" to allow the IDE to access the project files.
- When prompted with the Project Wizard, tick the checkbox to reload the CMake project on editing `CMakeLists.txt` and click OK.
- Change the `CMAKE_RUNTIME_OUTPUT_DIRECTORY` variable in the `CMakeLists.txt` file to the absolute path of the project's root directory.
- Click the hammer icon to build the project (the initial build may take some time).
- Click the green play icon to start running the experiments.
running-the-experiments.mp4
- Inspect both the generated `results.csv` and `results_detailed.csv` files in the project's root directory.
  - The `results.csv` file contains averages of the results of the environment simulations.
  - The `results_detailed.csv` file contains detailed results of the environment simulations.
- Set the `plots_dir` variable in the `visualizations.py` script to the desired output directory for the plots (set to `plots` in the demo).
- Run the `visualizations.py` script from the project's root directory to generate the plots from the results: `python3 src/visualizations.py`
- The generated plots are saved in the `plots` directory.
visualizations-experiments.mp4
- In the `experiments.cpp` file, locate the `sizes`, `difficulties`, and `approaches` lists between lines 65 and 80 (the sketch below this list illustrates their shape).
  - The `sizes` list contains the different environment sizes to be used in the experiments. You can modify this list to include other sizes.
  - The `difficulties` list contains the different difficulty levels of the environments. You can modify this list to include other configurations.
  - The `approaches` list contains the different approaches to be used in the experiments. You can remove any approach from this list, but only these six approaches are supported: `A* Static`, `A* Oracle`, `onlyTrainLeafNodes`, `singleAgent`, `fedAsynQ_EqAvg`, `fedAsynQ_ImAvg`.
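For orientation, the three lists might look roughly like this (a sketch: the variable names and the six approach names come from this README, while the element types, the default sizes, and any difficulty label other than `hard` are assumptions):

```cpp
#include <string>
#include <vector>

// Hypothetical shape of the configuration lists in experiments.cpp.
std::vector<int> sizes = {20, 50};  // defaults include five sizes; these two are used in the edge case
std::vector<std::string> difficulties = {"easy", "medium", "hard"};  // only "hard" is confirmed
std::vector<std::string> approaches = {
    "A* Static", "A* Oracle", "onlyTrainLeafNodes",
    "singleAgent", "fedAsynQ_EqAvg", "fedAsynQ_ImAvg",  // the six supported approaches
};
```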
- In the `experiments.cpp` file, locate the line that sets the seed (line 101) and change it to `srand(d + 100)`, as indicated by the comment.
- Modify the `sizes` list to only include sizes 20 and 50. Leave the `difficulties` and `approaches` lists unchanged.
- Click the hammer icon to build the project again.
- Click the green play icon to start running the edge case experiment.
- The edge case example is the last environment run in the experiments, i.e., the hard 50x50 environment.
running-edge-case.mp4
- Inspect both the generated `results.csv` and `results_detailed.csv` files in the project's root directory.
  - The `results.csv` file contains averages of the results of the edge case simulations.
  - The `results_detailed.csv` file contains detailed results of the edge case simulations.
- Manually remove all entries from both the `results.csv` and `results_detailed.csv` files that are not related to the edge case environment (i.e., all entries except those with size 50x50 and hard difficulty).
- Set the `plots_dir` variable in the `visualizations.py` script to the desired output directory for the plots (`plots-edge-case`).
- Run the `visualizations.py` script from the project's root directory to generate the plots from the edge case results: `python3 src/visualizations.py`
- The generated plots are saved in the `plots-edge-case` directory. Since the visualization script by default generates plots for the five different environment sizes and three difficulty levels, many of the generated plots will be empty. The only plots that contain data are the following:
  - `adapt_time_box_50x50_hard.png`
  - `adapt_time_line_hard.png`
  - `avg_path_length_line_hard.png`
  - `cumulative_adapt_time_50x50_hard.png`
  - `initial_training_time_hard.png`
  - `success_rate_line_hard.png`
  - `success_rate_vs_timestep_50x50_hard.png`
- The other plots can be ignored or simply deleted, as they do not contain any data to visualize.
- To enable the policy tracker/visualization tool, set the argument of the `runFullExperiment` function in the `main.cpp` file to `true` (see the sketch after this list).
- Choose appropriate environment sizes (e.g., 20x20 or 50x50) and one or more approaches for which the visualization is implemented (`singleAgent`, `fedAsynQ_EqAvg`, and `fedAsynQ_ImAvg`).
- Click the hammer icon to build the project again.
- Click the green play icon to start running the experiments with the policy tracker enabled.
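The call in `main.cpp` then looks something like the following (a sketch: only the function name and the boolean toggle come from this README; the signature and the stub body are assumptions):

```cpp
#include <iostream>

// Stub standing in for the real experiment runner (assumption: the real
// function is declared elsewhere in the project and takes a visualization toggle).
void runFullExperiment(bool enablePolicyTracker) {
    std::cout << "policy tracker " << (enablePolicyTracker ? "on" : "off") << '\n';
}

int main() {
    runFullExperiment(true);  // true enables the policy tracker, as in the step above
    return 0;
}
```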
The first example shows the visualization of the `singleAgent` approach in a 20x20 environment. The visualization clearly shows the optimal action from each position/state in the environment.
policy-visualization-20.mp4
The second example shows the visualization of the `fedAsynQ_EqAvg` approach in a 50x50 environment. The movement of obstacles is still clearly visible, but the optimal actions are not shown as clearly as in the first example.
policy-visualization-50.mp4
This project is released under the MIT License. Please review the License file for more details.