Commit: first commit
Yang Li authored and Yang Li committed May 20, 2023
0 parents commit db9e767
Showing 845 changed files with 1,163,785 additions and 0 deletions.
Binary file added .DS_Store
Binary file not shown.
1 change: 1 addition & 0 deletions .gitattributes
@@ -0,0 +1 @@
*.pb filter=lfs diff=lfs merge=lfs -text
26 changes: 26 additions & 0 deletions .gitignore
@@ -0,0 +1,26 @@
trajs/*
res/
sub_data/
traj_ring/
questionnaire/*
questionnaire_data/
trajs_data_backup/
trajs_ring/
trajs_data/
backup/
logs/
models/*
*.ipynb
*.zip

*.pyc
__pycache__
node_modules/
not_used/
overcookedgym/overcooked-flask/node_module
yarn.lock
*.npy
data.zip
models.zip
# large files
corridor_am.pkl
2 changes: 2 additions & 0 deletions .vscode/settings.json
@@ -0,0 +1,2 @@
{
}
21 changes: 21 additions & 0 deletions LICENSE.md
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2022 Stanford Intelligent and Interactive Autonomous Systems Group

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
4 changes: 4 additions & 0 deletions MANIFEST.in
@@ -0,0 +1,4 @@
include website/schema.sql
graft website/static
graft website/templates
global-exclude *.pyc
140 changes: 140 additions & 0 deletions README.md
@@ -0,0 +1,140 @@
# Introduction
This repo contains the code for the human-AI experiments on Overcooked for [COLE-JAIR](https://sites.google.com/view/cole-jair).

This repo is based on the code of [PECAN](https://github.com/LxzGordon/PECAN).

We integrate [Human-Aware-RL](https://github.com/HumanCompatibleAI/human_aware_rl/tree/neurips2019) agent models with the [PantheonRL](https://github.com/Stanford-ILIAD/PantheonRL) framework for convenient human-AI coordination studies on Overcooked. Changes are made under the [overcookedgym/overcooked-flask](https://github.com/LxzGordon/pecan_human_AI_coordination/tree/master/overcookedgym/overcooked-flask) directory.
<p align="center">
<img src="./images/pecan_uni.gif" width="40%">
<img src="./images/pecan_simple.gif" width="40%">
<br>
</p>

# Instructions for usage

## 1. Create a conda environment & install libraries
Install the [PantheonRL](https://github.com/Stanford-ILIAD/PantheonRL) package bundled in this repo:
```shell
conda create -n overcooked-vis python=3.7
conda activate overcooked-vis
pip install -r requirements.txt
pip install -e .
```

Install mpi4py

```shell
conda install mpi4py
```

Install PyTorch (based on your CUDA version): https://pytorch.org/
(You don't actually need the GPU version to run the game)


Install human_aware_rl and its dependencies (overcooked_ai, baselines, and stable-baselines):
```shell
cd overcookedgym/human_aware_rl
pip install -e .
cd overcooked_ai
pip install -e .
cd ..
cd stable-baselines
pip install -e .
cd ..
cd baselines
pip install -e .
```


## 2. How to load models

You need to put your model files in `./models`. You can get our trained models [here](https://drive.google.com/drive/folders/1s88a_muyG6pVlfcKDKop6R1Fhxr8dcGH?usp=share_link).

In addition, you can load your own models if they were trained with the [Human-Aware-RL](https://github.com/HumanCompatibleAI/human_aware_rl/tree/neurips2019) framework.
Agents are loaded using the `get_agent_from_saved_model()` method, which loads TensorFlow predictor models (`.pb` files), so you should save your agents in this format if you wish to load them into our framework. You can refer to the `save` method in `human_aware_rl/pbt/pbt.py` for an example of saving agents that can be loaded.
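
As a rough illustration, loading one of these saved agents might look like the sketch below. The import path and the second argument are assumptions based on the Human-Aware-RL codebase, and the model directory is only a placeholder; check `human_aware_rl` for the exact signature.

```python
# Hedged sketch (not part of this repo's code): load a TensorFlow predictor
# agent saved as .pb files, using the loader named in the text above.
# The import path, the thread-count argument, and the path are assumptions.
from human_aware_rl.baselines_utils import get_agent_from_saved_model

agent = get_agent_from_saved_model(
    "./models/simple/SP",  # placeholder: models/<layout>/<algorithm> folder
    30,                    # assumed sim_threads value; adjust as needed
)
```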

To load your own models, place them under `./models` in a folder named after your algorithm (the folder name must be the same across all layout directories); the models will be loaded when the server starts. For example, if your algorithm is named `ABC`, the folder structure should look like this (a copy example follows the tree):
```
-- models
| -- simple
| -- SP <---- Baseline 1
| -- PBT <---- Baseline 2
...
| -- ABC <---- Your Algorithm
| -- unident_s
| -- SP <---- Baseline 1
| -- PBT <---- Baseline 2
...
| -- ABC <---- Your Algorithm
| -- random1
| -- SP <---- Baseline 1
| -- PBT <---- Baseline 2
...
| -- ABC <---- Your Algorithm
...
```
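
For instance, copying a trained `ABC` model for the `simple` layout into place might look like this; the source path is purely illustrative.

```shell
# Illustrative only: the source directory is a placeholder for wherever your
# trained model's .pb files live.
mkdir -p models/simple/ABC
cp -r /path/to/trained/ABC/simple/* models/simple/ABC/
```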

## 3. Start the server process

```shell
python overcookedgym/overcooked-flask/app.py --trajs_savepath ./trajs --ckpts ./models
```

- `--ckpts`: Folder containing all the AI models to be loaded. Default is `./models`.
- `--port`: The port where you run the server process.
- `--trajs_savepath`: Optional trajectory save path, default is `./trajs`.
- `--questionnaire_savepath`: Optional questionnaire save path, default is `./questionnaire`.
- `--ip`: Default is LOCALHOST. We **recommend replacing it with your public network IP**, because a known Flask bug can cause extreme lag when playing the game. The same applies when debugging: visit your machine's IP in your browser instead of LOCALHOST. A fuller invocation is sketched below.
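
For example, a complete launch on a public IP might look like the following; the IP and port are placeholders.

```shell
# Placeholder IP and port; substitute your machine's public IP.
python overcookedgym/overcooked-flask/app.py \
    --ip 203.0.113.5 \
    --port 8080 \
    --ckpts ./models \
    --trajs_savepath ./trajs \
    --questionnaire_savepath ./questionnaire
```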

## 4. Customize your experiment settings

### Customize the experiment statement
You can replace `configs/statement.md` with your own experiment statement markdown file, then restart the web process.

### Customize the pre-game questionnaire
You can modify `configs/before_game.yaml` to customize the pre-game questionnaire.

## 5. Collecting data
Questionnaire data are saved in `./questionnaire`; the corresponding co-play trajectories are saved in `./trajs`.

We also provide a simple data-processing script, `questionnaire_analyze.ipynb`.
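
As a rough starting point, the sketch below loads collected questionnaire files into a DataFrame. It assumes each response is stored as a standalone JSON file under `./questionnaire`; the actual on-disk format is defined by `app.py`, so treat this as a template and see `questionnaire_analyze.ipynb` for the real pipeline.

```python
# Hedged sketch only: assumes questionnaire responses are stored as JSON files
# under ./questionnaire. Adjust to the real on-disk format if it differs.
import glob
import json

import pandas as pd

records = []
for path in glob.glob("./questionnaire/*.json"):
    with open(path) as f:
        records.append(json.load(f))

df = pd.DataFrame(records)
print(df.describe(include="all"))
```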

<!-- COLE with bc on random1
```shell
python overcookedgym/overcooked-flask/app.py --layout_name simple --ego models/COLE/simple/seed_1234/best --alt bc
``` -->


<!-- # Citation
Please cite
```
@article{lou2023pecan,
title={PECAN: Leveraging Policy Ensemble for Context-Aware Zero-Shot Human-AI Coordination},
author={Lou, Xingzhou and Guo, Jiaxian and Zhang, Junge and Wang, Jun and Huang, Kaiqi and Du, Yali},
journal={arXiv preprint arXiv:2301.06387},
year={2023}
}
```
```
@inproceedings{sarkar2022pantheonrl,
title={PantheonRL: A MARL Library for Dynamic Training Interactions},
author={Sarkar, Bidipta and Talati, Aditi and Shih, Andy and Sadigh, Dorsa},
booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
volume={36},
number={11},
pages={13221--13223},
year={2022}
}
```
```
@article{carroll2019utility,
title={On the utility of learning about humans for human-ai coordination},
author={Carroll, Micah and Shah, Rohin and Ho, Mark K and Griffiths, Tom and Seshia, Sanjit and Abbeel, Pieter and Dragan, Anca},
journal={Advances in neural information processing systems},
volume={32},
year={2019}
}
``` -->
104 changes: 104 additions & 0 deletions bctrainer.py
@@ -0,0 +1,104 @@
import argparse
import json

from pantheonrl.algos.bc import BC
from pantheonrl.common import trajsaver
from pantheonrl.common.multiagentenv import SimultaneousEnv

from trainer import (generate_env, ENV_LIST, LAYOUT_LIST)


class EnvException(Exception):
    """ Raise when parameters do not align with environment """


def input_check(args):
    # Env checking
    if args.env == 'OvercookedMultiEnv-v0':
        if 'layout_name' not in args.env_config:
            raise EnvException(f"layout_name needed for {args.env}")
        elif args.env_config['layout_name'] not in LAYOUT_LIST:
            raise EnvException(
                f"{args.env_config['layout_name']} is not a valid layout")


if __name__ == '__main__':
    parser = argparse.ArgumentParser(
        formatter_class=argparse.RawDescriptionHelpFormatter,
        description='''\
BC algorithm given a trajectory
''')

    parser.add_argument('env',
                        choices=ENV_LIST,
                        help='The environment to train in')

    parser.add_argument('trajectory',
                        type=str,
                        help='Location of trajectory')

    parser.add_argument('--choose-alt',
                        action='store_true',
                        help='Train from the alt trajectory (default is ego)')

    parser.add_argument('--total-epochs', '-t',
                        type=int,
                        default=10,
                        help='Number of episodes to run')

    parser.add_argument('--l2',
                        type=float,
                        default=0,
                        help='Value of l2 weight of BC algorithm')

    parser.add_argument('--device', '-d',
                        default='auto',
                        help='Device to run pytorch on')

    parser.add_argument('--env-config',
                        type=json.loads,
                        default={},
                        help='Config for the environment')

    parser.add_argument('--framestack', '-f',
                        type=int,
                        default=1,
                        help='Number of observations to stack')

    parser.add_argument('--save',
                        help='File to save the agent into')

    args = parser.parse_args()
    args.record = None

    input_check(args)

    print(f"Arguments: {args}")
    env, altenv = generate_env(args)
    print(f"Environment: {env}; Partner env: {altenv}")

    if isinstance(env, SimultaneousEnv):
        TransitionsClass = trajsaver.SimultaneousTransitions
    else:
        TransitionsClass = trajsaver.TurnBasedTransitions

    if args.choose_alt:
        env = altenv

    transition = TransitionsClass.read_transition(
        args.trajectory, env.observation_space, env.action_space)

    if args.choose_alt:
        data = transition.get_alt_transitions()
    else:
        data = transition.get_ego_transitions()

    clone = BC(observation_space=env.observation_space,
               action_space=env.action_space,
               expert_data=data,
               l2_weight=args.l2,
               device=args.device)

    clone.train(n_epochs=args.total_epochs)
    if args.save is not None:
        clone.save_policy(args.save)
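
A hypothetical invocation of this script, assuming a trajectory recorded on the `simple` layout has been saved to `trajs/simple_demo` (the trajectory path and output name are placeholders):

```shell
python bctrainer.py OvercookedMultiEnv-v0 trajs/simple_demo \
    --env-config '{"layout_name":"simple"}' \
    --total-epochs 10 \
    --save models/bc_simple
```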
8 changes: 8 additions & 0 deletions configs/before_game.yaml
@@ -0,0 +1,8 @@
name: False
email: False
phone: False
age: True
gender: True
gameskill: False
is_played: True
level: True
31 changes: 31 additions & 0 deletions configs/statement.md
@@ -0,0 +1,31 @@
# Experimental Statement
## 1. Purpose
You have been asked to participate in a research study on human-AI coordination. We would like your permission to enroll you as a participant in this research study.

The instruments involved in the experiment are a computer screen and a keyboard. The experimental task consists of playing the computer game Overcooked, using the keyboard to coordinate with the AI agent to cook and serve dishes.

## 2. Procedure
In this study, you should read the experimental instructions and ensure that you understand the experimental content. The whole experiment lasts about ** minutes and is divided into the following steps:

(1) Read and sign the experimental statement, then fill in a questionnaire;

(2) Test the experimental instrument, and adjust the seat height, sitting posture, and the distance between your eyes and the screen. Please ensure that you are in a comfortable sitting position during the experiment;

(3) You will be paired with an agent trained by a behavior cloning algorithm in a demo layout. Use the demo layout to learn the controls and familiarize yourself with the experimental process;

(4) Start the formal experiment. Please cooperate with the AI agent to earn as high a score as possible. You need to fill in a questionnaire after each game;

(5) After the experiment, you need to fill in a questionnaire.

## 3. Risks and Discomforts
The only potential risk factor in this experiment is trace electromagnetic radiation from the computer. Relevant studies have shown that radiation from computers and related peripherals does not harm the human body.

## 4. Costs
Each participant who completes the experiment and provides correct personal information will be paid 50~75 RMB, depending on performance.

## 5. Confidentiality
The results of this study may be published in an academic journal/book or used for teaching purposes. However, your name or other identifiers will not be used in any publication or teaching materials without your specific permission. In addition, if photographs, audio tapes or videotapes were taken during the study that would identify you, then you must give special permission for their use.

I confirm that the purpose of the research, the study procedures and the possible risks and discomforts as well as potential benefits that I may experience have been explained to me. All my questions have been satisfactorily answered. I have read this consent form. Clicking the button below indicates my willingness to participate in this study.


Binary file added images/agent_selection_screen.png
Binary file added images/pecan_simple.gif
Binary file added images/pecan_uni.gif
Binary file added images/training_screen.png
37 changes: 37 additions & 0 deletions overcookedgym/OvercookedAdaptPartnerInstructions.md
@@ -0,0 +1,37 @@
# Adaptive Partner Experiments (**make sure to be in the base directory of PantheonRL**)

#### Train a bunch of partners
```
python3 trainer.py OvercookedMultiEnv-v0 PPO PPO --env-config '{"layout_name":"simple"}' --seed 10 --preset 1
python3 trainer.py OvercookedMultiEnv-v0 PPO PPO --env-config '{"layout_name":"simple"}' --seed 11 --preset 1
python3 trainer.py OvercookedMultiEnv-v0 PPO PPO --env-config '{"layout_name":"simple"}' --seed 12 --preset 1
python3 trainer.py OvercookedMultiEnv-v0 PPO PPO --env-config '{"layout_name":"simple"}' --seed 13 --preset 1
python3 trainer.py OvercookedMultiEnv-v0 PPO PPO --env-config '{"layout_name":"simple"}' --seed 14 --preset 1
```

#### Train to play against a group of partners
```
python3 trainer.py OvercookedMultiEnv-v0 PPO FIXED FIXED FIXED FIXED --alt-config \
'{"type":"PPO", "location":"models/OvercookedMultiEnv-v0-simple-PPO-alt-10"}' \
'{"type":"PPO", "location":"models/OvercookedMultiEnv-v0-simple-PPO-alt-11"}' \
'{"type":"PPO", "location":"models/OvercookedMultiEnv-v0-simple-PPO-alt-12"}' \
'{"type":"PPO", "location":"models/OvercookedMultiEnv-v0-simple-PPO-alt-13"}' \
--env-config '{"layout_name":"simple"}' --seed 20 -t 1000000 --preset 1
python3 trainer.py OvercookedMultiEnv-v0 ModularAlgorithm FIXED FIXED FIXED FIXED --ego-config '{"marginal_reg_coef": 0.5}' --alt-config \
'{"type":"PPO", "location":"models/OvercookedMultiEnv-v0-simple-PPO-alt-10"}' \
'{"type":"PPO", "location":"models/OvercookedMultiEnv-v0-simple-PPO-alt-11"}' \
'{"type":"PPO", "location":"models/OvercookedMultiEnv-v0-simple-PPO-alt-12"}' \
'{"type":"PPO", "location":"models/OvercookedMultiEnv-v0-simple-PPO-alt-13"}' \
--env-config '{"layout_name":"simple"}' --seed 21 -t 1000000 --preset 1
```

#### Adapt to new partner
```
python3 trainer.py OvercookedMultiEnv-v0 PPO FIXED --alt-config '{"type":"PPO", "location":"models/OvercookedMultiEnv-v0-simple-PPO-alt-14"}' --env-config '{"layout_name":"simple"}' --seed 30 --preset 1
python3 trainer.py OvercookedMultiEnv-v0 LOAD FIXED --ego-config '{"type":"PPO", "location":"models/OvercookedMultiEnv-v0-simple-PPO-ego-20"}' --alt-config '{"type":"PPO", "location":"models/OvercookedMultiEnv-v0-simple-PPO-alt-14"}' --env-config '{"layout_name":"simple"}' --seed 31 --preset 1
python3 trainer.py OvercookedMultiEnv-v0 LOAD FIXED --ego-config '{"type":"ModularAlgorithm", "location":"models/OvercookedMultiEnv-v0-simple-ModularAlgorithm-ego-21"}' --alt-config '{"type":"PPO", "location":"models/OvercookedMultiEnv-v0-simple-PPO-alt-14"}' --env-config '{"layout_name":"simple"}' --seed 32 --preset 1
```
