Multi-Objective Multi Agent Reinforcement Learning (MOMARL) experiments.

doesburg11/MOMARL

Multi-Objective Multi-Agent Reinforcement Learning Experiments

Resource Gathering (MORL)

The starting point for this project is resource gathering, a modification of a morl-baselines example. It is a multi-objective, single-agent solution.


Item Gathering (MOMARL)

A MOMARL environment with random-policy agents.

Multi-Objective Beach Problem Domain (MO-BPD)

In the Multi-Objective Beach Problem Domain (MO-BPD), each agent represents a tourist who starts at a specific beach section and then decides at which section of the beach to spend the day. Agents can move to an adjacent section (move_left or move_right) or stay_still. Each beach section has a fixed capacity, and each agent has one of two static types, A or B. These properties, together with the agents' locations on the beach sections, determine the vectorial reward the agents receive, which encodes two conflicting objectives: “capacity” and “mixture”.

The environment can be configured with two reward modes. In the individual reward setting, each agent receives its own local reward, based on the beach section it occupies. In the team reward setting, all agents receive the global reward: the sum of rewards over all available beach sections.
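The two per-section objectives can be sketched as follows. This is a minimal illustration using one plausible Beach Problem Domain formulation (capacity reward peaking at the section's capacity, mixture reward favouring a balanced A/B split); the exact definitions used in this repository may differ:

```python
import math
from collections import Counter

def capacity_reward(occupancy: int, capacity: int) -> float:
    # BPD-style capacity objective: highest when the number of tourists
    # in a section matches that section's capacity (assumed formulation).
    return occupancy * math.exp(-occupancy / capacity)

def mixture_reward(types_in_section: list[str]) -> float:
    # Mixture objective: rewards a balanced mix of type-A and type-B
    # tourists in a section (assumed formulation).
    if not types_in_section:
        return 0.0
    counts = Counter(types_in_section)
    minority = min(counts.get("A", 0), counts.get("B", 0))
    return minority / len(types_in_section)

# Example: a section with capacity 3 holding two type-A and one type-B tourist.
section_types = ["A", "A", "B"]
print(capacity_reward(len(section_types), capacity=3))  # 3 * e^(-1)
print(mixture_reward(section_types))                    # 1/3
```

The conflict between the objectives is visible here: packing agents into one section can raise its capacity reward while starving other sections of the mixture agents would otherwise provide.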

The Pareto front is approximated with the Independent Q-Learners algorithm:


Furthermore, the MO-BPD is also solved with the Cooperative Discrete MOMAPPO algorithm.
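Both approaches produce a set of vector returns, from which the empirical Pareto front is the non-dominated subset. A minimal sketch of that filter (assuming both objectives are maximised; the function name is illustrative):

```python
def pareto_front(points: list[tuple[float, ...]]) -> list[tuple[float, ...]]:
    """Keep only the points not dominated by any other point."""
    def dominates(p, q):
        # p dominates q: at least as good everywhere, strictly better somewhere.
        return (all(a >= b for a, b in zip(p, q))
                and any(a > b for a, b in zip(p, q)))
    return [p for p in points if not any(dominates(q, p) for q in points)]

returns = [(0.2, 0.9), (0.5, 0.5), (0.4, 0.4), (0.9, 0.1)]
print(pareto_front(returns))  # (0.4, 0.4) is dominated by (0.5, 0.5)
```

This quadratic-time filter is fine for the handful of weight-sweep results such experiments produce; larger archives would call for a sorted or incremental variant.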

References
