Dynamic resource allocation in networked systems is necessary to achieve end-to-end management objectives. Previous research has demonstrated that reinforcement learning (RL) is a promising approach to this problem, making it possible to obtain near-optimal resource allocation policies for non-trivial system configurations. Despite these advances, a significant drawback of current approaches is that they require expensive and slow retraining whenever the target system changes. We address this drawback and introduce an efficient method for adapting a given base policy to dynamic system changes. In our approach, the base policy is adapted through rollout and online play, which transforms it into a rollout policy.
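The core idea of transforming a base policy into a rollout policy can be sketched as a one-step lookahead: for each candidate action, simulate that action and then follow the base policy for a fixed horizon, and select the action with the best estimated return. The toy MDP, reward function, and policies below are illustrative assumptions, not the networked-system model or trained policies used in this repository:

```python
# Minimal sketch of one-step rollout over a base policy.
# The toy MDP (5 states on a line, reward equal to the next state's index)
# is an illustrative assumption, not this repository's system model.

N_STATES = 5
ACTIONS = (-1, +1)  # move left / move right

def step(state, action):
    """Deterministic toy transition: reward grows with the state index."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    return next_state, float(next_state)

def base_policy(state):
    """A deliberately poor base policy: always move left."""
    return -1

def rollout_value(state, horizon):
    """Return collected by following the base policy for `horizon` steps."""
    total = 0.0
    for _ in range(horizon):
        state, reward = step(state, base_policy(state))
        total += reward
    return total

def rollout_policy(state, horizon=10):
    """One-step lookahead: try each action, then follow the base policy."""
    def q(action):
        next_state, reward = step(state, action)
        return reward + rollout_value(next_state, horizon)
    return max(ACTIONS, key=q)

print(base_policy(2), rollout_policy(2))  # base moves left (-1); rollout moves right (+1)
```

Even with a poor base policy, the one-step lookahead improves the decision at the current state; this is the policy-improvement property that rollout relies on.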
The following figure shows our approach for policy adaptation in networked systems. During each control cycle, the system model
- `gym` and `gymnasium`: for creating the RL environments
- `joblib`: for loading/exporting random forest regressor models
- `sb3-contrib`: for reinforcement learning agents (Maskable PPO)
- `scikit-learn`: for random forest regression
- `scipy`: for random forest regression
- `stable-baselines3`: for reinforcement learning agents (PPO)
- `torch` and `torchvision`: for neural network training
- `matplotlib`: for plotting
- `pandas`: for data wrangling
- `requests`: for making HTTP requests
- Python 3.8+
- `flake8` (for linting)
- `tox` (for automated testing)
# install from pip
pip install online_policy_adaptation_using_rollout==<version>

# local install from source
pip install -e online_policy_adaptation_using_rollout

# or (equivalently):
make install

# force upgrade dependencies
pip install -e online_policy_adaptation_using_rollout --upgrade

# clone and install from source
git clone https://github.com/foroughsh/online_policy_adaptation_using_rollout
cd online_policy_adaptation_using_rollout
pip3 install -e .

# install development dependencies
pip install -r requirements_dev.txt
cd examples; python run_scenario_1.py
cd examples; python run_scenario_2.py
cd examples; python run_scenario_3.py
Creative Commons (C) 2023-2024, Forough Shahabsamani and Kim Hammar
- Forough Shahabsamani foro@kth.se
- Kim Hammar kimham@kth.se