SUMO-RL-MobiCharger provides an OpenAI-gym-like environment for implementing RL-based mobile charger dispatching methods on the SUMO simulator. The features of this environment are four-fold:
- A simple and customizable interface to work with reinforcement learning for dispatching of mobile chargers on city-scale transportation networks with SUMO
- Compatibility with OpenAI-gym and popular RL libraries such as stable-baselines3 and RL Baselines3 Zoo
- Easy modification of state and reward functions for research focusing on vehicle routing or scheduling problems
- Support for parallel training of multiple environments via `SubprocVecEnv` in stable-baselines3
Install SUMO as in their doc.
Note that this environment uses Libsumo by default for simulation speedup, but sumo-gui does not work with Libsumo on Windows (more details). If you need to go back to TraCI, uncomment `import traci` and modify the code in `reset()` of `SumoEnv`.
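A minimal sketch of that switch (the `libsumo`/`traci` module names are real; the surrounding `SumoEnv` code is an assumption, not copied from this repository):

```python
# Default: Libsumo for simulation speedup.
# import libsumo as traci

# Fallback: TraCI, required e.g. for sumo-gui on Windows.
import traci

# Inside SumoEnv.reset(), the simulation would then be started through TraCI
# instead of Libsumo (exact arguments depend on the repo's code), e.g.:
# traci.start([sumo_binary, "-c", sumo_config_file])
```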
Install the necessary packages listed in `requirements.txt`.
Clone the latest version and install it into gym:
git clone https://github.com/liyan2015/SUMO-RL-MobiCharger.git
cd SUMO-RL-MobiCharger/source
pip install -e .
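After installation, a quick smoke test can confirm the environment is usable. This is a sketch: it assumes SUMO is installed, `SUMO_HOME` is set, and `SumoEnv-v0` has been registered with gym as described in the next section:

```python
import gym

# Assumes 'SumoEnv-v0' has already been registered (see the registration step below).
env = gym.make('SumoEnv-v0')
obs = env.reset()
action = env.action_space.sample()          # one of the discrete actions
obs, reward, done, info = env.step(action)
print(env.action_space, reward, done)
env.close()
```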
If the environment is not compatible with the latest rl-baselines3-zoo, use this old cloned copy for tuning.
The main class is `SumoEnv`. To train with RL Baselines3 Zoo, you need to register the environment as in their doc and add the following code to `exp_manager.py`:
# On most env, SubprocVecEnv does not help and is quite memory hungry
# therefore we use DummyVecEnv by default
if "SumoEnv" not in self.env_name.gym_id:
    env = make_vec_env(
        make_env,
        n_envs=n_envs,
        seed=self.seed,
        env_kwargs=self.env_kwargs,
        monitor_dir=log_dir,
        wrapper_class=self.env_wrapper,
        vec_env_cls=self.vec_env_class,
        vec_env_kwargs=self.vec_env_kwargs,
        monitor_kwargs=self.monitor_kwargs,
    )
else:
    def make_env(
        env_config={
            'gui_f': False,
            'label': 'evaluate'
        }, rank: int = 0, seed: int = 0
    ):
        def _init():
            env = gym.make('SumoEnv-v0', **env_config)
            env = Monitor(env, log_dir)
            env.seed(seed + rank)
            env.action_space.seed(seed + rank)
            return env
        set_random_seed(seed)
        return _init

    if eval_env:
        if self.verbose > 0:
            print("Creating evaluate environment.")
        env = SubprocVecEnv([make_env() for i in range(n_envs)])
    else:
        env = SubprocVecEnv([make_env(
            {
                'gui_f': False,
                'label': 'train' + str(i + 1)
            }, rank=i * 2) for i in range(n_envs)])
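For the registration step mentioned above, a minimal sketch (the entry-point module path is a placeholder, not necessarily this package's actual layout; follow the RL Baselines3 Zoo doc for where to hook it in):

```python
from gym.envs.registration import register

# Register the custom environment under the ID used throughout this README.
# The entry_point below is a placeholder; point it at the actual SumoEnv class.
register(
    id='SumoEnv-v0',
    entry_point='sumo_rl_mobicharger.envs:SumoEnv',
)
```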
For training, use the following command line:
python train.py --algo ppo --env SumoEnv-v0 --num-threads 1 --progress --conf-file hyperparams/python/sumoenv_config.py --save-freq 500000 --log-folder /usr/data2/canaltrain_log/ --tensorboard-log /usr/data2/canaltrain_tensorboard/ --verbose 2 --eval-freq 2000000 --eval-episodes 10 --n-eval-envs 10 --vec-env subproc
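The `--conf-file hyperparams/python/sumoenv_config.py` argument points to a Python hyperparameter file. A sketch of what such a file can look like, assuming the zoo's Python config format with a `hyperparams` dict keyed by env ID (the values below are placeholders, not this repository's tuned settings):

```python
# hyperparams/python/sumoenv_config.py (sketch; values are placeholders)
hyperparams = {
    "SumoEnv-v0": dict(
        policy="MlpPolicy",
        n_envs=10,
        n_timesteps=2e7,
        n_steps=1024,
        batch_size=256,
        learning_rate=3e-4,
    ),
}
```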
To resume training with different EV route files, use the following command line or check the doc of RL Baselines3 Zoo:
python train.py --algo ppo --env SumoEnv-v0 --num-threads 1 --progress --conf-file hyperparams/python/sumoenv_config.py --save-freq 500000 --log-folder /usr/data2/canaltrain_log/ --tensorboard-log /usr/data2/canaltrain_tensorboard/ --verbose 2 --eval-freq 2000000 --eval-episodes 10 --n-eval-envs 10 --vec-env subproc -i /usr/data2/canaltrain_log/ppo/SumoEnv-v0_16/rl_model_12999532_steps.zip
Change the `model_path` and `stats_path` in `canal_test.py` and run:
python canal_test.py
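For example (a sketch: the exact paths depend on your own log folder, and the normalization-stats filename is an assumption based on RL Baselines3 Zoo's usual layout):

```python
# In canal_test.py: point these at your trained model and saved statistics.
model_path = "/usr/data2/canaltrain_log/ppo/SumoEnv-v0_16/rl_model_12999532_steps.zip"
stats_path = "/usr/data2/canaltrain_log/ppo/SumoEnv-v0_16/SumoEnv-v0/vecnormalize.pkl"
```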
The default observation for the agent is a vector:
obs = [SOC_state, charger_state, elig_act_state, dir_state, charge_station_state]
- `SOC_state` indicates the amount of SOC on the road network pending to be refilled by mobile chargers
- `charger_state` indicates the current road segment, staying time, charging_others bit, charge_self bit, SOC, distance to target vehicle, and neighbor_vehicle bit of each mobile charger
- `elig_act_state` indicates the eligible actions that each mobile charger can take at its current road segment
- `dir_state` indicates the best action of each mobile charger given its current road segment
- `charge_station_state` indicates the remaining SOCs that the mobile chargers will have if they go to the charging stations for a recharge
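As a rough sketch, the five components can be flattened into the single observation vector like this (array shapes and the helper name are assumptions, not the repository's actual code):

```python
import numpy as np

def build_observation(SOC_state, charger_state, elig_act_state,
                      dir_state, charge_station_state):
    """Concatenate the five per-step components into one flat vector (sketch)."""
    return np.concatenate([
        np.asarray(SOC_state, dtype=np.float32).ravel(),
        np.asarray(charger_state, dtype=np.float32).ravel(),
        np.asarray(elig_act_state, dtype=np.float32).ravel(),
        np.asarray(dir_state, dtype=np.float32).ravel(),
        np.asarray(charge_station_state, dtype=np.float32).ravel(),
    ])
```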
The action space is discrete. Each edge in the SUMO network is partitioned into several road segments:
Thus, the possible actions of the agent at each road segment can be illustrated as:
Throughout the road network, a mobile charger can take at most 6 actions: stay (0), charge vehicles (1), or move to one of the downstream road segments (2-5).
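A small sketch of this encoding (the enum and its names are illustrative; the repository may simply use raw integer actions):

```python
from enum import IntEnum

class ChargerAction(IntEnum):
    """Illustrative names for the 6 discrete actions (assumed, not the repo's)."""
    STAY = 0              # remain on the current road segment
    CHARGE_VEHICLES = 1   # charge vehicles on the current segment
    DOWNSTREAM_1 = 2      # move to the 1st downstream road segment
    DOWNSTREAM_2 = 3      # move to the 2nd downstream road segment
    DOWNSTREAM_3 = 4      # move to the 3rd downstream road segment
    DOWNSTREAM_4 = 5      # move to the 4th downstream road segment
```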
The default reward function is defined as:
- `+2` if a mobile charger charges an EV with `step_charged_SOC`
- `+3 * charger.step_charged_SOC + 0.5 * (1 - before_SOC)` if a mobile charger charges itself with `step_charged_SOC`
- `+8e-2` if a mobile charger takes the best action
- `-8e-2` if a mobile charger takes an action different from the best one
- `-8e-1` if a mobile charger takes an ineligible action given its current road segment
- `-300` if a mobile charger exhausts its SOC
- `+250` if the agent succeeds in charging all the EVs and supports the completion of their trips
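A sketch of how these terms could be combined into a per-step reward (flag and attribute names are assumptions; the repository's actual implementation may differ):

```python
def compute_reward(charger, charged_ev, charged_self, before_SOC,
                   took_best_action, action_eligible,
                   soc_exhausted, all_trips_completed):
    """Per-step reward assembled from the terms listed above (names assumed)."""
    reward = 0.0
    if charged_ev:                 # charged an EV by step_charged_SOC
        reward += 2
    if charged_self:               # recharged itself by step_charged_SOC
        reward += 3 * charger.step_charged_SOC + 0.5 * (1 - before_SOC)
    if not action_eligible:        # ineligible action on this road segment
        reward -= 8e-1
    elif took_best_action:         # matched the suggested best action
        reward += 8e-2
    else:                          # eligible action, but not the best one
        reward -= 8e-2
    if soc_exhausted:              # the charger ran out of SOC
        reward -= 300
    if all_trips_completed:        # all EVs charged and their trips completed
        reward += 250
    return reward
```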
If you use this repository, please cite:
@article{yan2022mobicharger,
  title={MobiCharger: Optimal Scheduling for Cooperative EV-to-EV Dynamic Wireless Charging},
  author={Yan, Li and Shen, Haiying and Kang, Liuwang and Zhao, Juanjuan and Zhang, Zhe and Xu, Chengzhong},
  journal={IEEE Transactions on Mobile Computing},
  volume={Early Access},
  year={2022},
  publisher={IEEE}
}