The aim of this project is to develop a multi-agent reinforcement learning algorithm that uses quality diversity to create sets of agents that solve a given multi-agent task. The project is part of the master's thesis of Nielsen Erik at the University of Trento. This repo contains the code and relevant sources used to develop the thesis project. The project is supervised by Giovanni Iacca and Andrea Ferigo of the University of Trento and follows their current research.
All the scripts developed during the project are stored in src. The scripts and code are based on the following papers:
- A Population-Based Approach for Multi-Agent Interpretable Reinforcement Learning
- Quality Diversity Evolutionary Learning of Decision Trees
The project is developed in Python 3.11. To install it, run the following commands:
git clone https://github.com/NielsenErik/MultiAgent_and_QualityDiversity_ReinforcementLearning
cd MultiAgent_and_QualityDiversity_ReinforcementLearning
pip install -r requirements.txt
- It is recommended to install the project and its dependencies in a virtual environment:
python3 -m venv pyenv-ma-qd
source pyenv-ma-qd/bin/activate
or
python3 -m venv pyenv-ma-qd
chmod +x script.sh
./script.sh
- MAgent2 is the test environment of the project. To install it, clone its repository and then install it from the downloaded copy:
git clone https://github.com/Farama-Foundation/MAgent2
cd MAgent2
pip install -e .
This solution was proposed in issue #19 of the MAgent2 repository.
If the project was installed in a virtual environment, activate that environment before running the project:
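After the installation steps above, a quick sanity check can confirm that the interpreter and the environment look as expected. The following is a minimal sketch and not part of the repository: the function name `check_environment` and its parameters are hypothetical, treating 3.11 as a minimum version is an assumption (the README only states that the project was developed with 3.11), and `magent2` is assumed to be the import name of the MAgent2 package.

```python
import importlib.util
import sys


def check_environment(min_version=(3, 11), package="magent2"):
    """Return a dict of basic installation checks (hypothetical helper)."""
    return {
        # Interpreter version is at least the assumed minimum.
        "python_ok": sys.version_info[:2] >= min_version,
        # Inside a venv, sys.prefix differs from sys.base_prefix.
        "in_virtualenv": sys.prefix != sys.base_prefix,
        # The package can be located without actually importing it.
        "package_found": importlib.util.find_spec(package) is not None,
    }


report = check_environment()
for name, ok in report.items():
    print(f"{name}: {'yes' if ok else 'no'}")
```

If `package_found` is reported as `no`, the MAgent2 installation step most likely ran outside the active virtual environment.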
source pyenv-ma-qd/bin/activate
Then run the following commands:
chmod +x script.sh
./script.sh
The script presents several run options. Choose the desired option and the project will start running.
The project is structured as follows:
- src: Contains the source code of the project
  - agents: Contains the agent classes and algorithms used in the project
  - algorithm: Contains the MAP-Elites and Quality Diversity algorithms, developed using pyribs, and the classes for Genetic Algorithm and Genetic Programming
  - config: Contains the configuration files used in the project: the configuration of the environment, the algorithm, and the agents and, most importantly, the configuration of the MAP-Elites archive
  - decisiontrees: Contains the classes for creating and managing the Decision Trees, RL-Decision Trees, Leaves, and the Conditions on the tree nodes
  - utils: Contains the utility functions used in the project
  - marl_qd_launcher.py: The main script to launch the project
  - experiment_launcher.py: The script that contains the experiment classes and types
  - test_team.py: The script that tests the teams generated during the training runs
  - eval_runs.py: The script that generates all the plots and data files for the training and test runs
- logs: Contains the log files generated during the training and test runs of the project, as well as the results of the experiments and the final teams and agents generated
- hpc_scripts: Contains the scripts used to run the project on the High-Performance Computing (HPC) cluster
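For readers unfamiliar with MAP-Elites, the core idea behind the archive mentioned above can be illustrated in a few lines. This is a standard-library sketch under stated assumptions, not the repository's implementation (which is built on pyribs): the names `map_elites` and `toy_evaluate`, the two-float genome, and all hyperparameters are hypothetical and stand in for whatever representation and settings the project actually uses.

```python
import random


def map_elites(evaluate, n_iterations=2000, bins=10, sigma=0.1, seed=0):
    """Minimal MAP-Elites loop over genomes in [0, 1]^2 (illustrative only).

    `evaluate` maps a genome to (fitness, descriptor), where the descriptor
    is a tuple of values in [0, 1]. The archive keeps one elite per grid cell.
    """
    rng = random.Random(seed)
    archive = {}  # cell index tuple -> (fitness, genome)

    for _ in range(n_iterations):
        if archive and rng.random() < 0.9:
            # Mutate a randomly chosen elite with Gaussian noise, clipped to [0, 1].
            _, parent = rng.choice(list(archive.values()))
            genome = [min(1.0, max(0.0, g + rng.gauss(0.0, sigma))) for g in parent]
        else:
            # Otherwise sample a fresh random genome.
            genome = [rng.random(), rng.random()]

        fitness, descriptor = evaluate(genome)
        # Discretize the descriptor into a grid cell.
        cell = tuple(min(bins - 1, int(d * bins)) for d in descriptor)
        # Keep the candidate only if the cell is empty or it beats the current elite.
        if cell not in archive or fitness > archive[cell][0]:
            archive[cell] = (fitness, genome)
    return archive


def toy_evaluate(genome):
    """Toy problem: fitness peaks at the centre; the genome is its own descriptor."""
    x, y = genome
    fitness = -((x - 0.5) ** 2 + (y - 0.5) ** 2)
    return fitness, (x, y)


archive = map_elites(toy_evaluate)
best_fitness = max(f for f, _ in archive.values())
print(f"filled cells: {len(archive)}, best fitness: {best_fitness:.4f}")
```

The result is a grid of diverse solutions, each the best found for its region of the descriptor space, rather than a single optimum.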
All the results and executions of the project are stored in the logs folder. The results of our proposal are summarized in the following plots:
Fitness trends during training | MAP-Elites heatmaps of our approach | Final teams reward comparison between approaches |
---|---|---|
In references there is a comprehensive list of the papers studied to complete the project.
The scripts in src are based on the following papers by Giovanni Iacca, Marco Crespi, Andrea Ferigo, and Leonardo Lucio Custode: