This is the official repository for the following two papers:
- Hiding Leader's Identity in Leader-Follower Navigation through Multi-Agent Reinforcement Learning
Ankur Deka, Wenhao Luo, Huao Li, Michael Lewis, Katia Sycara
Accepted to IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2021
Paper Link: Arxiv, IROS
- Human vs. Deep Neural Network Performance at a Leader Identification Task
Ankur Deka, Michael Lewis, Huao Li, Phillip Walker, Katia Sycara
Accepted to Human Factors and Ergonomics Society (HFES) Annual Meeting 2021
Paper Link: PITT, HFES
I have tested this repository with Python 3.6 on Ubuntu 18.04. First install Anaconda and then run:
git clone git@github.com:Ankur-Deka/Hiding-Leader-Identity.git
cd Hiding-Leader-Identity
conda create python=3.6 pip --name HidingIdentity
conda activate HidingIdentity
pip install -r requirements.txt
Note: gym_vecenv MUST be installed from the link in requirements.txt. pip install gym_vecenv will NOT give the same results!
Download the folder marlsave from this Drive link and store it in the root directory.
- Naive MARL
python joint_main.py --mode test --load-mode individual --swarm-load-run 0 --swarm-load-ckpt latest --adversary-load-run 0 --adversary-load-ckpt latest --out-dir naive_marl --plot-trajectories --record --goal-at-top --seed 0
- Our proposed leader identity hiding policy in Paper 1
python joint_main.py --mode test --load-mode joint --load-run 1 --load-ckpt latest --out-dir leader_hiding --plot-trajectories --record --goal-at-top --seed 0
- Co-training (stage 4 in Paper 2). I am providing 3 training runs (2, 3 and 4) due to the stochasticity of results (refer to Paper 2).
python joint_main.py --mode test --load-mode joint --load-run 2 --load-ckpt latest --out-dir co_training --plot-trajectories --record --goal-at-top --seed 0
- Scripted PD
python joint_main.py --mode test --algo scripted --load-mode individual --adversary-load-run 4 --adversary-load-ckpt latest --out-dir scripted_pd --plot-trajectories --record --goal-at-top --seed 0
- Zheng et al.
python -W ignore joint_main_genetic.py --algo genetic --adversary-hidden-dim 512 --num-processes 1 --mode test --load-mode joint --load-run 5 --load-ckpt latest --adversary-version V2 --out-dir zheng --plot-trajectories --record --goal-at-top --seed 0
There are multiple stages of training as described in the papers above: Stages 1 to 3 in Paper 1, Stages 1 to 4 in Paper 2. For each stage, in both train and test mode, we need to run joint_main.py or adversary_training/main.py with the right arguments, as explained below.
python joint_main.py --mode train --use-adversary 0
Saves training files in a folder marlsave/run_n1. n1 is generated automatically, starting from 0 and increasing every time we run training. Trained model checkpoints and tensorboard logs are saved here. It is important to note down the run number n1 for use below.
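The automatic numbering described above can be pictured with the sketch below. It is not the repository's code: next_run_index is a hypothetical helper, and the only thing taken from the text is that runs live in folders named run_<n>, counting up from 0.

```python
import os
import re
import tempfile


def next_run_index(save_dir):
    """Return the next free run number, assuming folders are named run_<n>."""
    if not os.path.isdir(save_dir):
        return 0
    indices = []
    for name in os.listdir(save_dir):
        match = re.fullmatch(r"run_(\d+)", name)
        if match and os.path.isdir(os.path.join(save_dir, name)):
            indices.append(int(match.group(1)))
    # Start at 0 when no run exists yet, otherwise continue after the largest.
    return max(indices) + 1 if indices else 0


# Small demonstration on a throwaway directory standing in for marlsave.
with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "run_0"))
    os.mkdir(os.path.join(root, "run_1"))
    print("next run folder: run_%d" % next_run_index(root))
```

Listing marlsave after a training run and taking the largest run_<n> is a quick way to recover n1 if it was not written down.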
Replace 0 in --load-run 0 with the same n1 as above. --load-ckpt can be latest or a valid number. Results are saved in output/stage_1.
python joint_main.py --mode test --use-adversary 0 --load-mode joint --load-run 0 --load-ckpt latest --out-dir stage_1 --record --goal-at-top
- Generate trajectory data
Replace 0 in --load-run 0 with the right value of n1 corresponding to training in stage 1.
python joint_main.py --mode test --use-adversary 0 --load-mode joint --load-run 0 --load-ckpt latest --out-dir stage_1_train --num-eval-episodes 1000
python joint_main.py --mode test --use-adversary 0 --load-mode joint --load-run 0 --load-ckpt latest --out-dir stage_1_test --num-eval-episodes 100 --goal-at-top
Saves trajectories in output/stage_1_train/trajs and output/stage_1_test/trajs.
- Create a dataset folder
mkdir -p trajectory_datasets/dataset_1
mv output/stage_1_train/trajs trajectory_datasets/dataset_1/train_dataset
mv output/stage_1_test/trajs trajectory_datasets/dataset_1/test_dataset
- Train adversary
cd adversary_training
python main.py --mode train --dataDir ../trajectory_datasets/dataset_1
This will save training files in runs/run_n2, where n2 is generated automatically. It is important to note down n2 for use below.
- Test adversary
Replace 0 in --swarm-load-run 0 with n1. Replace 0 in --adversary-load-run 0 with n2.
cd ..
python joint_main.py --mode test --load-mode individual --swarm-load-run 0 --swarm-load-ckpt latest --adversary-load-run 0 --adversary-load-ckpt latest --out-dir stage_2_results --plot-trajectories --record --goal-at-top
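To avoid editing run numbers by hand, the test command above can be assembled from n1 and n2. This is only a convenience sketch: stage2_test_command is a hypothetical helper, and it uses nothing beyond the flags already shown in the command above.

```python
def stage2_test_command(n1, n2):
    """Build the stage 2 test command for swarm run n1 and adversary run n2."""
    return [
        "python", "joint_main.py",
        "--mode", "test",
        "--load-mode", "individual",
        "--swarm-load-run", str(n1),
        "--swarm-load-ckpt", "latest",
        "--adversary-load-run", str(n2),
        "--adversary-load-ckpt", "latest",
        "--out-dir", "stage_2_results",
        "--plot-trajectories", "--record", "--goal-at-top",
    ]


# Print the command for example run numbers (3 and 5 are placeholders).
print(" ".join(stage2_test_command(3, 5)))
```

The returned list can be passed directly to subprocess.run(args, check=True) from the repository root.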
Replace 0 in --adversary-load-run 0 with n2.
python joint_main.py --mode train --load-mode individual --adversary-load-run 0 --adversary-load-ckpt latest --train-adversary 0
Saves training files in a folder marlsave/run_n2.
Replace 1 in --load-run 1 with the run number of the training run above.
python joint_main.py --mode test --load-mode joint --load-run 1 --load-ckpt latest --out-dir stage_3_results --plot-trajectories --record --goal-at-top
This is our proposed policy in Paper 1.
python joint_main.py --mode train
Replace 1 in --load-run 1 with the run number of the training run above.
python joint_main.py --mode test --load-mode joint --load-run 1 --load-ckpt latest --out-dir stage_4_results --plot-trajectories --record --goal-at-top
Set --algo scripted:
python joint_main.py --mode test --use-adversary 0 --algo scripted --out-dir scripted_pd_train --num-eval-episodes 1000
python joint_main.py --mode test --use-adversary 0 --algo scripted --out-dir scripted_pd_test --num-eval-episodes 100 --goal-at-top
mkdir -p trajectory_datasets/dataset_scripted_pd
mv output/scripted_pd_train/trajs trajectory_datasets/dataset_scripted_pd/train_dataset
mv output/scripted_pd_test/trajs trajectory_datasets/dataset_scripted_pd/test_dataset
cd adversary_training
python main.py --mode train --dataDir ../trajectory_datasets/dataset_scripted_pd
cd ..
python joint_main.py --mode test --algo scripted --load-mode individual --adversary-load-run 4 --adversary-load-ckpt latest --out-dir scripted_pd_results --plot-trajectories --record --goal-at-top --seed 0
cd adversary_training
python main.py --mode train --dataDir ../trajectory_datasets/dataset_genetic_pretraining --lr 0.025 --version V2 --hiddenDim 512 --optimizer SGD
cd ..
python -W ignore joint_main_genetic.py --adversary-num-trajs 100 --algo genetic --adversary-load-ckpt latest --num-frames 1000000 --adversary-hidden-dim 512 --num-processes 1 --mode train --env-name simple_flocking --adversary-load-run 8 --load-mode individual --adversary-num-epochs 1 --adversary-version V2
Test
python -W ignore joint_main_genetic.py --algo genetic --adversary-hidden-dim 512 --num-processes 1 --mode test --env-name simple_flocking --load-mode joint --load-run 228 --load-ckpt latest --adversary-version V2
python gen_plot_data.py --load-run 32
Use the run_grid_search.py file to generate multiple videos together. Open swarm_training/output/video_previewing_tool/video_preview.html in a browser (tested on Firefox 75.0 beta and Chrome Version 83.0.4103.61 (Official Build) (64-bit)). Browse and select the videos you wish to play.
- Pass --store-video-together to store videos in a common folder
- For checking results of different ckpts of the same run: (1) pass an array for 'load-ckpt', e.g. [10,20,30,40,50,60,70,80,90,100], (2) pass 'store-video-together': [''], (3) DON'T pass 'out_dir'; out-dir names are generated automatically
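How a dictionary of argument arrays turns into individual runs can be sketched as follows. This is an illustration of the idea only, not the code in run_grid_search.py: expand_grid is a hypothetical helper, and treating an empty string as a bare flag is an assumption based on the 'store-video-together': [''] convention above.

```python
import itertools


def expand_grid(grid):
    """Yield one argument list per combination of the values in grid."""
    keys = list(grid)
    for values in itertools.product(*(grid[key] for key in keys)):
        args = []
        for key, value in zip(keys, values):
            args.append("--" + key)
            # Assumption: an empty string means a flag that takes no value.
            if value != "":
                args.append(str(value))
        yield args


grid = {
    "mode": ["test"],
    "load-mode": ["joint"],
    "load-run": [1],
    "load-ckpt": [10, 20, 30],
    "store-video-together": [""],
}
for args in expand_grid(grid):
    print("python joint_main.py " + " ".join(args))
```

With three checkpoints and a single value for every other key, the loop prints three commands, one per checkpoint.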
Arguments are defined in arguments.py
- num_frames: No. of environment frames to train on
- num_iters: num_frames // num_processes
- update_every: Updated after this many frames
- num_updates: No. of updates for each update_every
- batch_size
- buffer_size: Should be larger than max possible episode length
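The relationship between these arguments can be worked through with a small calculation. Only num_iters = num_frames // num_processes comes from the list above; the total update count assumes one round of num_updates updates per update_every collected frames, which is a guess at the schedule, and training_schedule is a hypothetical helper with placeholder values.

```python
def training_schedule(num_frames, num_processes, update_every, num_updates):
    """Derive iteration and update counts from the training arguments."""
    # From the argument list: each iteration steps all processes once.
    num_iters = num_frames // num_processes
    # Assumption: one round of updates per update_every collected frames.
    update_rounds = num_frames // update_every
    return {
        "num_iters": num_iters,
        "update_rounds": update_rounds,
        "total_updates": update_rounds * num_updates,
    }


# Placeholder values chosen for illustration, not the repository defaults.
print(training_schedule(num_frames=1000000, num_processes=4,
                        update_every=1000, num_updates=4))
```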
load-mode = {individual, joint}
- individual: loads from swarm-load-path and adversary-load-path
- joint: loads from load-path
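The two load modes can be summarised in a short argparse sketch. The flag names come from the list above, but the parser, the defaults and the checkpoints_to_load helper are hypothetical and are not copied from arguments.py.

```python
import argparse


def build_parser():
    """Hypothetical parser exposing only the load-related flags."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--load-mode", choices=["individual", "joint"],
                        default="joint")  # default is an assumption
    parser.add_argument("--load-path", default=None)
    parser.add_argument("--swarm-load-path", default=None)
    parser.add_argument("--adversary-load-path", default=None)
    return parser


def checkpoints_to_load(args):
    """Return which paths are read for the selected load mode."""
    if args.load_mode == "individual":
        # Swarm and adversary come from two separate checkpoints.
        return {"swarm": args.swarm_load_path,
                "adversary": args.adversary_load_path}
    # A joint checkpoint holds both in one place.
    return {"joint": args.load_path}


args = build_parser().parse_args(
    ["--load-mode", "individual",
     "--swarm-load-path", "swarm.pt", "--adversary-load-path", "adv.pt"])
print(checkpoints_to_load(args))
```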
joint_run.py provides a convenient way to run multiple experiments. Earlier I didn't have any problems, but lately I'm having issues training in parallel: the run_ID of one experiment clashes with that of another.
web_form contains the complete web-based UI. In our experiments we hosted it on an Apache web server running on a Google Cloud instance.
- Data is saved in user_data.txt.
- Run cd web_form && python conv2csv.py user_data.txt to convert it to csv format. If there is any space/new line at the end of user_data.txt, remove it before converting.
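Removing the trailing space or new line can be done by hand or with a few lines of Python. The sketch below is not part of the repository: strip_trailing_whitespace is a hypothetical helper, and the file contents in the demonstration are placeholders rather than the real user_data.txt format.

```python
import os
import tempfile


def strip_trailing_whitespace(path):
    """Remove spaces and new lines at the end of the file; report if it changed."""
    with open(path, "r", encoding="utf-8") as handle:
        text = handle.read()
    cleaned = text.rstrip()
    if cleaned != text:
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(cleaned)
    return cleaned != text


# Demonstration on a temporary file with placeholder content.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "user_data.txt")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("row 1\nrow 2\n \n")
    print(strip_trailing_whitespace(path))
```

Running it on web_form/user_data.txt before conv2csv.py leaves the rest of the file untouched.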
- out_files: trajectories of swarm robots, one episode per file
- adversary_training: code for training adversary
  - Prototyping_notebook.ipynb
  - runs: checkpoints and tensorboard
  - main.py: training/validating
  - dataset.py: dataset class
  - models.py: model classes
- swarm_training: code for training swarm
  - main.py: main file for running
  - arguments.py: arguments
  - learner.py: Learner - master object for the swarm
  - runs: checkpoints and tensorboard
  - mape: environment
simple_flocking - goal reaching, leader observes goal location
simple_trajectory - trajectory following, leader observes next loc on trajectory (crude implementation). 'is_success' is always False
- Successful if all agents are within a threshold distance to the goal
- Reward: difference in distance + additional goal on task completion (disabled right now)
- Goal at the top half of the window during training and at y = 0.9 during testing (to get roughly uniform goal reaching time for human trials)
- Adversary cannot see the goal
- Implementation details:
  - done is a list with True/False repeated num_agents times. All should have the same value. Env is reset (by gym_vecenv or PseudoVecEnv) even if one of them is True
  - info['env_done'] contains done for overall team
  - info['is_success'] contains is_success for overall team
  - env is reset when it's done by gym_vecenv or PseudoVecEnv, last obs stored in info['terminal_observation']
- mape/multiagent/environment: generates multiagent environment
- mape/environment/scenarios/simple_flocking: goal reaching environment
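The done and info conventions above can be handled with two small helpers. This is a sketch under assumptions: team_done and last_observation are hypothetical names, info is treated as a plain dict for a single environment, and nothing here is taken from the environment code itself.

```python
def team_done(done):
    """Collapse the per-agent done list into one team-level flag."""
    if not done:
        raise ValueError("done must contain one entry per agent")
    # All entries should agree; flag a mismatch instead of guessing.
    if len(set(done)) != 1:
        raise ValueError("per-agent done flags disagree: %r" % (done,))
    return bool(done[0])


def last_observation(obs, info):
    """Return the final observation of an episode.

    After an automatic reset, obs already belongs to the next episode, so the
    last observation is read from info['terminal_observation'] instead.
    """
    if info.get("env_done") and "terminal_observation" in info:
        return info["terminal_observation"]
    return obs


print(team_done([True, True, True]))
print(last_observation("first obs of new episode",
                       {"env_done": True, "terminal_observation": "final obs"}))
```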
- stored in output/args.out_file
- adversary_preds has csv files for different episodes. First column is the true leader ID in the corresponding video, second column is the adversary's prediction.
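Given that layout, the adversary's accuracy can be computed by comparing the two columns. The sketch below is not part of the repository: adversary_accuracy is a hypothetical helper, and it assumes integer leader IDs, one prediction per row and files ending in .csv; rows that do not parse as numbers (such as a header) are skipped.

```python
import csv
import glob
import os
import tempfile


def adversary_accuracy(pred_dir):
    """Fraction of rows where column 2 (prediction) equals column 1 (true ID)."""
    correct = 0
    total = 0
    for path in sorted(glob.glob(os.path.join(pred_dir, "*.csv"))):
        with open(path, newline="") as handle:
            for row in csv.reader(handle):
                if len(row) < 2:
                    continue
                try:
                    true_id = int(float(row[0]))
                    pred_id = int(float(row[1]))
                except ValueError:
                    continue  # skip a header or malformed row
                total += 1
                correct += int(true_id == pred_id)
    return correct / total if total else float("nan")


# Demonstration with made-up predictions in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "episode_0.csv"), "w", newline="") as handle:
        csv.writer(handle).writerows([[2, 2], [1, 3]])
    print(adversary_accuracy(tmp))
```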
- Pyglet:
File "./mape/multiagent/rendering.py", line 120, in render
arr = np.fromstring(image_data.data, dtype=np.uint8, sep='')
AttributeError: 'ImageData' object has no attribute 'data'
Solution: pip install pyglet==1.3.2
- swarm_training directory is adapted from: marl_transfer.
- Parts of the code for Zheng et al. are adapted from the corresponding private_flocking repository.