
Commit

Fix docs and typos
kennyderek committed Jul 16, 2021
1 parent a355fbe commit 5edf23c
Showing 4 changed files with 22 additions and 14 deletions.
16 changes: 11 additions & 5 deletions README.md
@@ -4,12 +4,18 @@
We want this project to be accessible to everyone. We train our models on CPUs, using the RLLib framework. In the paper, we used 3 cores, and experiments ran in under 24 hours.

## Set-up (<5 min)
In a virtual environment (we recommend using conda or miniconda e.g. by ```conda create -n adapvenv python=3.8```), install the python module containing Farmworld, Markov Soccer, and the Multi-Goal experiment. This module is called ```adapenvs``` and can be installed by:
In your favorite project directory
```
git clone git@github.com:kennyderek/adap.git
cd adap
```

Then, in a virtual environment (we recommend using conda or miniconda e.g. by ```conda create -n adapvenv python=3.8```), install the python module containing environment code (for Farmworld, Gym wrappers, etc.). This module is called ```adapenvs``` and can be installed by:
```
cd adaptation_envs
pip install -e .
```
Now, we can install the python module containing the ADAP policy code (written for RLLib) and contained in the module ```adap```. This will also install dependencies such as pytorch, and ray[rllib].
Now, we can install the python module containing the ADAP policy code (written for RLLib) and contained in the module ```adap```. This will also install dependencies such as pytorch, tensorflow, and ray[rllib].
```
cd ..
cd adap_policies
@@ -25,7 +31,7 @@ python run.py --conf ../configs/cartpole/train/adap.yaml --exp-name cartpole
```
```cartpole/adap.yaml``` is just one possible configuration file, containing information about 1) the training environment and 2) algorithm hyperparameters. Feel free to make new configuration files by modifying hyperparameters as you wish! RLLib will automatically start training and checkpointing the experiment in the directory ```~/ray_results/cartpole/[CONFIG_FILE + TIME]```. By default, it will save a checkpoint every 100 epochs and at the end of training.
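For reference, here is a minimal sketch (not part of the repo) of inspecting one of these YAML files from Python before editing it. Since ```run.py``` already imports ```yaml```, PyYAML is available; the exact keys printed depend on whatever the file actually contains.
```python
# Illustrative only: load and inspect a training config before launching a run.
import yaml

# Path taken from the README commands above.
with open("../configs/cartpole/train/adap.yaml") as f:
    conf = yaml.safe_load(f)

# Top-level sections describe the training environment and algorithm hyperparameters.
print("top-level sections:", list(conf.keys()))
```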

### Visualizing Traing Results
### Visualizing Training Results

Make sure you are using your virtual environment and that the ADAP Python modules are installed in it. Visualization should open a PyGlet window and render CartPole.

@@ -44,11 +50,11 @@ What if we want to search for ADAP policies (via latent distribution optimization)
```
python run.py --conf ../configs/cartpole/train/adap.yaml --restore ~/ray_results/cartpole/[CONFIG_FILE + TIME]/checkpoint_000025/checkpoint-25 --evaluate ../configs/cartpole/ablations/move_right.yaml --evolve
```
The ```--evaluate``` argument specifies a new environment configuration to use, which replaces the training environment configuration. Here, we have provided ```move_right.yaml```, which modifies the reward function to be r(t) = -x-axis position of the cartpole. The ```--evolve``` flag tells ```run.py``` to
The ```--evaluate``` argument specifies a new environment configuration to use, which replaces the training environment configuration. Here, we have provided ```move_right.yaml```, which modifies the reward function to be r(t) = -x-axis position of the cartpole. The ```--evolve``` flag tells ```run.py``` to run latent optimization on the new environment dynamics.

For CartPole, we optimize the latent space for 30 steps, which is enough to recover policies from our policy space that consistently move left or right.
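As a rough illustration of the idea (this is not the ```adap``` implementation), latent optimization can be pictured as a simple search loop: sample candidate latent vectors, score each one by the episode return the conditioned policy obtains under the new reward, and refine around the best candidates. Every name and the toy scoring function in the sketch below are hypothetical stand-ins.
```python
# Illustrative sketch of latent-space search; NOT the adap implementation.
import numpy as np

LATENT_DIM = 3    # assumed latent size, purely for illustration
POPULATION = 32   # candidates sampled per step
STEPS = 30        # the README optimizes the latent space for 30 steps


def evaluate_latent(z: np.ndarray) -> float:
    """Hypothetical stand-in: in practice this would roll out the trained policy
    conditioned on z in the evaluation environment (e.g. move_right.yaml) and
    return the mean episode reward. A toy quadratic keeps the sketch runnable."""
    target = np.array([1.0, -0.5, 0.25])
    return -float(np.sum((z - target) ** 2))


def latent_search(steps: int = STEPS) -> np.ndarray:
    rng = np.random.default_rng(0)
    mean, std = np.zeros(LATENT_DIM), 1.0
    best_z, best_score = None, -np.inf
    for _ in range(steps):
        candidates = rng.normal(mean, std, size=(POPULATION, LATENT_DIM))
        scores = np.array([evaluate_latent(z) for z in candidates])
        if scores.max() > best_score:
            best_score, best_z = scores.max(), candidates[scores.argmax()]
        elite = candidates[np.argsort(scores)[-POPULATION // 4:]]  # keep the top quarter
        mean, std = elite.mean(axis=0), float(elite.std(axis=0).mean()) + 1e-3
    return best_z


if __name__ == "__main__":
    print("best latent found:", latent_search())
```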

Awesome work! You've completed training and latent optimization of a policy space for CartPole!
Awesome work! You've completed training and latent optimization of a policy space for CartPole! If you'd like, try out getting the CartPole to move on the left side of the screen, with ```move_left.yaml```.

## FAQs

1 change: 1 addition & 0 deletions adap_policies/adap.egg-info/requires.txt
@@ -4,3 +4,4 @@ ray[rllib]
adapenvs
torch
tensorflow
pyglet
2 changes: 1 addition & 1 deletion adap_policies/setup.py
@@ -2,5 +2,5 @@

setup(name='adap',
version='0.0.2',
install_requires=['gym', 'ray', 'ray[rllib]', 'adapenvs', 'torch', 'tensorflow'] #And any other dependencies required
install_requires=['gym', 'ray', 'ray[rllib]', 'adapenvs', 'torch', 'tensorflow', 'pyglet'] #And any other dependencies required
)
17 changes: 9 additions & 8 deletions scripts/run.py
@@ -2,8 +2,6 @@
import argparse
import yaml

from ray import tune

from common import get_env_and_callbacks, get_name_creator, get_trainer, build_trainer_config, get_name_creator

import copy
@@ -13,12 +11,12 @@
parser.add_argument('--exp-name', type=str, default="context_exp")
parser.add_argument('--local-dir', type=str, default="~/ray_results")

parser.add_argument('--restore', type=str, default="") # path to restore the game
parser.add_argument('--evaluate', type=str, default="") # path to restore the game
parser.add_argument('--evolve', action="store_true") # path to restore the game
parser.add_argument("--conf", type=str, help="path to the config file containing ADAP hyperparameters and environment settings")

parser.add_argument("--train", action="store_true")
parser.add_argument("--conf", type=str)
parser.add_argument('--restore', type=str, default="", help="")
parser.add_argument('--evaluate', type=str, default="", help="path of the config file on which to evaluate a model")
parser.add_argument('--evolve', action="store_true", help="whether to perform latent optimization")
parser.add_argument("--train", action="store_true", help="used to continue training a restored model")


if __name__ == "__main__":
@@ -45,10 +43,11 @@
stop = {
"timesteps_total": training_conf['timesteps_total'],
"training_iteration": training_conf['training_iteration'],
# "episode_reward_mean": 34 # this would mean 35/40 agents have survived on average, and is probably a good stop condition
}

if args.restore == "":
from ray import tune

tune.run(trainer_cls,
config=trainer_conf,
stop=stop,
@@ -59,6 +58,8 @@
trial_dirname_creator=get_name_creator(path), # the name after ~/ray_results/context_exp
)
elif args.train:
from ray import tune

# pick up where we left off training, using a checkpoint
tune.run(trainer_cls,
config=trainer_conf,
