Commit: first commit
Yang Li authored and Yang Li committed May 20, 2023
0 parents commit db9e767
Showing 845 changed files with 1,163,785 additions and 0 deletions.
Binary file added .DS_Store
Binary file not shown.
1 change: 1 addition & 0 deletions .gitattributes
@@ -0,0 +1 @@
*.pb filter=lfs diff=lfs merge=lfs -text
26 changes: 26 additions & 0 deletions .gitignore
@@ -0,0 +1,26 @@
trajs/*
res/
sub_data/
traj_ring/
questionnaire/*
questionnaire_data/
trajs_data_backup/
trajs_ring/
trajs_data/
backup/
logs/
models/*
*.ipynb
*.zip

*.pyc
__pycache__
node_modules/
not_used/
overcookedgym/overcooked-flask/node_module
yarn.lock
*.npy
data.zip
models.zip
# large files
corridor_am.pkl
2 changes: 2 additions & 0 deletions .vscode/settings.json
@@ -0,0 +1,2 @@
{
}
21 changes: 21 additions & 0 deletions LICENSE.md
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2022 Stanford Intelligent and Interactive Autonomous Systems Group

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
4 changes: 4 additions & 0 deletions MANIFEST.in
@@ -0,0 +1,4 @@
include website/schema.sql
graft website/static
graft website/templates
global-exclude *.pyc
140 changes: 140 additions & 0 deletions README.md
@@ -0,0 +1,140 @@
# Introduction
This repo contains the code for the human-AI experiments on Overcooked for [COLE-JAIR](https://sites.google.com/view/cole-jair).

This repo is based on the code of [PECAN](https://github.com/LxzGordon/PECAN).

We integrate [Human-Aware-RL](https://github.com/HumanCompatibleAI/human_aware_rl/tree/neurips2019) agent models with the [PantheonRL](https://github.com/Stanford-ILIAD/PantheonRL) framework for convenient human-AI coordination studies on Overcooked. Changes are made under the [overcookedgym/overcooked-flask](https://github.com/LxzGordon/pecan_human_AI_coordination/tree/master/overcookedgym/overcooked-flask) directory.
<p align="center">
<img src="./images/pecan_uni.gif" width="40%">
<img src="./images/pecan_simple.gif" width="40%">
<br>
</p>

# Instructions for usage

## 1. Create a conda environment & install libraries
Install the [PantheonRL](https://github.com/Stanford-ILIAD/PantheonRL) package bundled in this repo:
```shell
conda create -n overcooked-vis python=3.7
conda activate overcooked-vis
pip install -r requirements.txt
pip install -e .
```

Install mpi4py

```shell
conda install mpi4py
```

Install PyTorch (based on your CUDA version): https://pytorch.org/
(You don't actually need the GPU version to run the game)


Install human_aware_rl and its dependencies (overcooked_ai, baselines, and stable-baselines):
```shell
cd overcookedgym/human_aware_rl
pip install -e .
cd overcooked_ai
pip install -e .
cd ..
cd stable-baselines
pip install -e .
cd ..
cd baselines
pip install -e .
```


## 2. How to load models

You need to put your model files in `./models`. You can get our trained models [here](https://drive.google.com/drive/folders/1s88a_muyG6pVlfcKDKop6R1Fhxr8dcGH?usp=share_link).

In addition, you can load your own models if they were trained with the [Human-Aware-RL](https://github.com/HumanCompatibleAI/human_aware_rl/tree/neurips2019) framework.
Agents are loaded using the `get_agent_from_saved_model()` method, which loads TensorFlow predictor models (`.pb` files), so you should save your agents in this format if you wish to load them into our framework. You can refer to the `save` method in `human_aware_rl/pbt/pbt.py` for an example of saving agents that can be loaded.
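
As a rough illustration, loading one of these saved agents might look like the sketch below. The import path and the second argument are assumptions based on the Human-Aware-RL codebase, and the model directory is only a placeholder; check `human_aware_rl` for the exact signature.

```python
# Hedged sketch (not part of this repo's code): load a TensorFlow predictor
# agent saved as .pb files, using the loader named in the text above.
# The import path, the thread-count argument, and the path are assumptions.
from human_aware_rl.baselines_utils import get_agent_from_saved_model

agent = get_agent_from_saved_model(
    "./models/simple/SP",  # placeholder: models/<layout>/<algorithm> folder
    30,                    # assumed sim_threads value; adjust as needed
)
```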

To load your own models, place them under `./models` in a folder named after your algorithm (the folder name must be the same across all layout directories); the models will be loaded when the server starts. For example, if your algorithm is named `ABC`, the folder structure should look like this (a copy example follows the tree):
```
-- models
| -- simple
| -- SP <---- Baseline 1
| -- PBT <---- Baseline 2
...
| -- ABC <---- Your Algorithm
| -- unident_s
| -- SP <---- Baseline 1
| -- PBT <---- Baseline 2
...
| -- ABC <---- Your Algorithm
| -- random1
| -- SP <---- Baseline 1
| -- PBT <---- Baseline 2
...
| -- ABC <---- Your Algorithm
...
```
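
For instance, copying a trained `ABC` model for the `simple` layout into place might look like this; the source path is purely illustrative.

```shell
# Illustrative only: the source directory is a placeholder for wherever your
# trained model's .pb files live.
mkdir -p models/simple/ABC
cp -r /path/to/trained/ABC/simple/* models/simple/ABC/
```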

## 3. Start the server process

```shell
python overcookedgym/overcooked-flask/app.py --trajs_savepath ./trajs --ckpts ./models
```

- `--ckpts`: Folder containing all the AI models to be loaded. Default is `./models`.
- `--port`: The port where you run the server process.
- `--trajs_savepath`: Optional trajectory save path, default is `./trajs`.
- `--questionnaire_savepath`: Optional questionnaire save path, default is `./questionnaire`.
- `--ip`: Default is LOCALHOST. We **recommend replacing it with your public network IP**, because a known Flask bug can cause extreme lag when playing the game. The same applies when debugging: visit your machine's IP in your browser instead of LOCALHOST. A fuller invocation is sketched below.
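
For example, a complete launch on a public IP might look like the following; the IP and port are placeholders.

```shell
# Placeholder IP and port; substitute your machine's public IP.
python overcookedgym/overcooked-flask/app.py \
    --ip 203.0.113.5 \
    --port 8080 \
    --ckpts ./models \
    --trajs_savepath ./trajs \
    --questionnaire_savepath ./questionnaire
```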

## 4. Customize your experiment settings

### Customize the experiment statement
You can replace `configs/statement.md` with your own experiment statement markdown file, then restart the web process.

### Customize the pre-game questionnaire
You can modify `configs/before_game.yaml` to customize the pre-game questionnaire.

## 5. Collecting data
Questionnaire data are saved in `./questionnaire`; the corresponding co-play trajectories are saved in `./trajs`.

We also provide a simple data-processing script, `questionnaire_analyze.ipynb`.
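
As a rough starting point, the sketch below loads collected questionnaire files into a DataFrame. It assumes each response is stored as a standalone JSON file under `./questionnaire`; the actual on-disk format is defined by `app.py`, so treat this as a template and see `questionnaire_analyze.ipynb` for the real pipeline.

```python
# Hedged sketch only: assumes questionnaire responses are stored as JSON files
# under ./questionnaire. Adjust to the real on-disk format if it differs.
import glob
import json

import pandas as pd

records = []
for path in glob.glob("./questionnaire/*.json"):
    with open(path) as f:
        records.append(json.load(f))

df = pd.DataFrame(records)
print(df.describe(include="all"))
```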

<!-- COLE with bc on random1
```shell
python overcookedgym/overcooked-flask/app.py --layout_name simple --ego models/COLE/simple/seed_1234/best --alt bc
``` -->


<!-- # Citation
Please cite
```
@article{lou2023pecan,
title={PECAN: Leveraging Policy Ensemble for Context-Aware Zero-Shot Human-AI Coordination},
author={Lou, Xingzhou and Guo, Jiaxian and Zhang, Junge and Wang, Jun and Huang, Kaiqi and Du, Yali},
journal={arXiv preprint arXiv:2301.06387},
year={2023}
}
```
```
@inproceedings{sarkar2022pantheonrl,
title={PantheonRL: A MARL Library for Dynamic Training Interactions},
author={Sarkar, Bidipta and Talati, Aditi and Shih, Andy and Sadigh, Dorsa},
booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
volume={36},
number={11},
pages={13221--13223},
year={2022}
}
```
```
@article{carroll2019utility,
title={On the utility of learning about humans for human-ai coordination},
author={Carroll, Micah and Shah, Rohin and Ho, Mark K and Griffiths, Tom and Seshia, Sanjit and Abbeel, Pieter and Dragan, Anca},
journal={Advances in neural information processing systems},
volume={32},
year={2019}
}
``` -->
104 changes: 104 additions & 0 deletions bctrainer.py
@@ -0,0 +1,104 @@
import argparse
import json

from pantheonrl.algos.bc import BC
from pantheonrl.common import trajsaver
from pantheonrl.common.multiagentenv import SimultaneousEnv

from trainer import (generate_env, ENV_LIST, LAYOUT_LIST)


class EnvException(Exception):
    """ Raise when parameters do not align with environment """


def input_check(args):
    # Env checking
    if args.env == 'OvercookedMultiEnv-v0':
        if 'layout_name' not in args.env_config:
            raise EnvException(f"layout_name needed for {args.env}")
        elif args.env_config['layout_name'] not in LAYOUT_LIST:
            raise EnvException(
                f"{args.env_config['layout_name']} is not a valid layout")


if __name__ == '__main__':
    parser = argparse.ArgumentParser(
        formatter_class=argparse.RawDescriptionHelpFormatter,
        description='''\
BC algorithm given a trajectory
''')

    parser.add_argument('env',
                        choices=ENV_LIST,
                        help='The environment to train in')

    parser.add_argument('trajectory',
                        type=str,
                        help='Location of trajectory')

    parser.add_argument('--choose-alt',
                        action='store_true',
                        help='Train from the alt trajectory (default is ego)')

    parser.add_argument('--total-epochs', '-t',
                        type=int,
                        default=10,
                        help='Number of episodes to run')

    parser.add_argument('--l2',
                        type=float,
                        default=0,
                        help='Value of l2 weight of BC algorithm')

    parser.add_argument('--device', '-d',
                        default='auto',
                        help='Device to run pytorch on')

    parser.add_argument('--env-config',
                        type=json.loads,
                        default={},
                        help='Config for the environment')

    parser.add_argument('--framestack', '-f',
                        type=int,
                        default=1,
                        help='Number of observations to stack')

    parser.add_argument('--save',
                        help='File to save the agent into')

    args = parser.parse_args()
    args.record = None

    input_check(args)

    print(f"Arguments: {args}")
    env, altenv = generate_env(args)
    print(f"Environment: {env}; Partner env: {altenv}")

    if isinstance(env, SimultaneousEnv):
        TransitionsClass = trajsaver.SimultaneousTransitions
    else:
        TransitionsClass = trajsaver.TurnBasedTransitions

    if args.choose_alt:
        env = altenv

    transition = TransitionsClass.read_transition(
        args.trajectory, env.observation_space, env.action_space)

    if args.choose_alt:
        data = transition.get_alt_transitions()
    else:
        data = transition.get_ego_transitions()

    clone = BC(observation_space=env.observation_space,
               action_space=env.action_space,
               expert_data=data,
               l2_weight=args.l2,
               device=args.device)

    clone.train(n_epochs=args.total_epochs)
    if args.save is not None:
        clone.save_policy(args.save)
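
A hypothetical invocation of this script, assuming a trajectory recorded on the `simple` layout has been saved to `trajs/simple_demo` (the trajectory path and output name are placeholders):

```shell
python bctrainer.py OvercookedMultiEnv-v0 trajs/simple_demo \
    --env-config '{"layout_name":"simple"}' \
    --total-epochs 10 \
    --save models/bc_simple
```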
8 changes: 8 additions & 0 deletions configs/before_game.yaml
@@ -0,0 +1,8 @@
name: False
email: False
phone: False
age: True
gender: True
gameskill: False
is_played: True
level: True
31 changes: 31 additions & 0 deletions configs/statement.md
@@ -0,0 +1,31 @@
# Experimental Statement
## 1. Purpose
You have been asked to participate in a research study on human-AI coordination. We would like your permission to enroll you as a participant in this research study.

The instruments involved in the experiment are a computer screen and a keyboard. The experimental task consists of playing the computer game Overcooked, using the keyboard to coordinate with the AI agent to cook and serve dishes.

## 2. Procedure
In this study, you should read the experimental instructions and ensure that you understand the experimental content. The whole experiment lasts about ** minutes and is divided into the following steps:

(1) Read and sign the experimental statement, then fill in a questionnaire;

(2) Test the experimental instrument, and adjust the seat height, sitting posture, and the distance between your eyes and the screen. Please ensure that you are in a comfortable sitting position during the experiment;

(3) You will be paired with an agent trained by a behavior cloning algorithm in a demo layout. Use the demo layout to learn the controls and familiarize yourself with the experimental process;

(4) Start the formal experiment. Please cooperate with the AI agent to earn as high a score as possible. You need to fill in a questionnaire after each game;

(5) After the experiment, you need to fill in a questionnaire.

## 3. Risks and Discomforts
The only potential risk factor in this experiment is trace electromagnetic radiation from the computer. Relevant studies have shown that radiation from computers and related peripherals does not harm the human body.

## 4. Costs
Each participant who completes the experiment and provides correct personal information will be paid 50~75 RMB, depending on performance.

## 5. Confidentiality
The results of this study may be published in an academic journal/book or used for teaching purposes. However, your name or other identifiers will not be used in any publication or teaching materials without your specific permission. In addition, if photographs, audio tapes or videotapes were taken during the study that would identify you, then you must give special permission for their use.

I confirm that the purpose of the research, the study procedures and the possible risks and discomforts as well as potential benefits that I may experience have been explained to me. All my questions have been satisfactorily answered. I have read this consent form. Clicking the button below indicates my willingness to participate in this study.


Binary file added images/agent_selection_screen.png
Binary file added images/pecan_simple.gif
Binary file added images/pecan_uni.gif
Binary file added images/training_screen.png
37 changes: 37 additions & 0 deletions overcookedgym/OvercookedAdaptPartnerInstructions.md
@@ -0,0 +1,37 @@
# Adaptive Partner Experiments (**make sure to be in the base directory of PantheonRL**)

#### Train a bunch of partners
```
python3 trainer.py OvercookedMultiEnv-v0 PPO PPO --env-config '{"layout_name":"simple"}' --seed 10 --preset 1
python3 trainer.py OvercookedMultiEnv-v0 PPO PPO --env-config '{"layout_name":"simple"}' --seed 11 --preset 1
python3 trainer.py OvercookedMultiEnv-v0 PPO PPO --env-config '{"layout_name":"simple"}' --seed 12 --preset 1
python3 trainer.py OvercookedMultiEnv-v0 PPO PPO --env-config '{"layout_name":"simple"}' --seed 13 --preset 1
python3 trainer.py OvercookedMultiEnv-v0 PPO PPO --env-config '{"layout_name":"simple"}' --seed 14 --preset 1
```

#### Train to play against a group of partners
```
python3 trainer.py OvercookedMultiEnv-v0 PPO FIXED FIXED FIXED FIXED --alt-config \
'{"type":"PPO", "location":"models/OvercookedMultiEnv-v0-simple-PPO-alt-10"}' \
'{"type":"PPO", "location":"models/OvercookedMultiEnv-v0-simple-PPO-alt-11"}' \
'{"type":"PPO", "location":"models/OvercookedMultiEnv-v0-simple-PPO-alt-12"}' \
'{"type":"PPO", "location":"models/OvercookedMultiEnv-v0-simple-PPO-alt-13"}' \
--env-config '{"layout_name":"simple"}' --seed 20 -t 1000000 --preset 1
python3 trainer.py OvercookedMultiEnv-v0 ModularAlgorithm FIXED FIXED FIXED FIXED --ego-config '{"marginal_reg_coef": 0.5}' --alt-config \
'{"type":"PPO", "location":"models/OvercookedMultiEnv-v0-simple-PPO-alt-10"}' \
'{"type":"PPO", "location":"models/OvercookedMultiEnv-v0-simple-PPO-alt-11"}' \
'{"type":"PPO", "location":"models/OvercookedMultiEnv-v0-simple-PPO-alt-12"}' \
'{"type":"PPO", "location":"models/OvercookedMultiEnv-v0-simple-PPO-alt-13"}' \
--env-config '{"layout_name":"simple"}' --seed 21 -t 1000000 --preset 1
```

#### Adapt to new partner
```
python3 trainer.py OvercookedMultiEnv-v0 PPO FIXED --alt-config '{"type":"PPO", "location":"models/OvercookedMultiEnv-v0-simple-PPO-alt-14"}' --env-config '{"layout_name":"simple"}' --seed 30 --preset 1
python3 trainer.py OvercookedMultiEnv-v0 LOAD FIXED --ego-config '{"type":"PPO", "location":"models/OvercookedMultiEnv-v0-simple-PPO-ego-20"}' --alt-config '{"type":"PPO", "location":"models/OvercookedMultiEnv-v0-simple-PPO-alt-14"}' --env-config '{"layout_name":"simple"}' --seed 31 --preset 1
python3 trainer.py OvercookedMultiEnv-v0 LOAD FIXED --ego-config '{"type":"ModularAlgorithm", "location":"models/OvercookedMultiEnv-v0-simple-ModularAlgorithm-ego-21"}' --alt-config '{"type":"PPO", "location":"models/OvercookedMultiEnv-v0-simple-PPO-alt-14"}' --env-config '{"layout_name":"simple"}' --seed 32 --preset 1
```
