[rllib] Fix docs to reference new code locations (ray-project#1092)
* fix rllib docs

* Update example-a3c.rst
ericl authored and pcmoritz committed Oct 10, 2017
1 parent a52a1e8 commit 90013ed
Showing 3 changed files with 8 additions and 15 deletions.
6 changes: 3 additions & 3 deletions doc/source/example-a3c.rst
@@ -25,7 +25,7 @@ You can run the code with

.. code-block:: bash
- python/ray/rllib/a3c/example.py --num-workers=N
+ python/ray/rllib/train.py --env=Pong-ram-v4 --alg=A3C --config='{"num_workers": N}'
Reinforcement Learning
----------------------
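
The new ``train.py`` entry point takes its algorithm settings as a JSON string passed to ``--config``. As a rough, hypothetical illustration of what that flag implies (a sketch only, not rllib's actual implementation), the string can be parsed and merged over a set of defaults:

.. code-block:: python

   import json

   # Hypothetical default A3C settings, for illustration only.
   DEFAULT_CONFIG = {"num_workers": 4, "batch_size": 10}

   def parse_config(config_json):
       """Merge a user-supplied JSON config over the defaults."""
       config = DEFAULT_CONFIG.copy()
       config.update(json.loads(config_json))
       return config

   print(parse_config('{"num_workers": 16}'))
   # -> {'num_workers': 16, 'batch_size': 10}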
@@ -115,7 +115,7 @@ global model parameters. The main training script looks like the following.
import numpy as np
import ray
- def train(num_workers, env_name="PongDeterministic-v3"):
+ def train(num_workers, env_name="PongDeterministic-v4"):
# Setup a copy of the environment
# Instantiate a copy of the policy - mainly used as a placeholder
env = create_env(env_name, None, None)
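
The hunks here show only fragments of ``train``. As a self-contained sketch of the pattern the A3C example is built around (workers pull the latest parameters, compute a gradient, and the driver applies gradients as they arrive), here is a toy version that uses a quadratic objective in place of the real Atari policy; names like ``Worker`` and ``compute_gradient`` are illustrative, not the actual rllib API:

.. code-block:: python

   import numpy as np
   import ray

   ray.init()

   @ray.remote
   class Worker(object):
       """Toy stand-in for an A3C rollout worker."""
       def __init__(self, dim):
           self.dim = dim

       def compute_gradient(self, params):
           # Gradient of 0.5 * ||params||^2 plus noise, standing in for a
           # policy-gradient estimate computed from a rollout.
           return params + 0.01 * np.random.randn(self.dim)

   def train(num_workers, dim=10, num_updates=200, lr=0.1):
       params = np.random.randn(dim)
       workers = [Worker.remote(dim) for _ in range(num_workers)]
       # Launch one gradient task per worker.
       pending = {w.compute_gradient.remote(params): w for w in workers}
       for _ in range(num_updates):
           # Apply each gradient as soon as it arrives, then relaunch that
           # worker against the updated parameters (the asynchronous part).
           [ready_id], _ = ray.wait(list(pending))
           worker = pending.pop(ready_id)
           params -= lr * ray.get(ready_id)
           pending[worker.compute_gradient.remote(params)] = worker
       return params

   if __name__ == "__main__":
       print(np.linalg.norm(train(num_workers=4)))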
@@ -147,7 +147,7 @@ global model parameters. The main training script looks like the following.
Benchmarks and Visualization
----------------------------

- For the :code:`PongDeterministic-v3` and an Amazon EC2 m4.16xlarge instance, we
+ For the :code:`PongDeterministic-v4` and an Amazon EC2 m4.16xlarge instance, we
are able to train the agent with 16 workers in around 15 minutes. With 8
workers, we can train the agent in around 25 minutes.

4 changes: 2 additions & 2 deletions doc/source/example-evolution-strategies.rst
@@ -11,14 +11,14 @@ To run the application, first install some dependencies.
You can view the `code for this example`_.

- .. _`code for this example`: https://github.com/ray-project/ray/tree/master/python/ray/rllib/evolution_strategies
+ .. _`code for this example`: https://github.com/ray-project/ray/tree/master/python/ray/rllib/es

The script can be run as follows. Note that the configuration is tuned to work
on the ``Humanoid-v1`` gym environment.

.. code-block:: bash
- python/ray/rllib/evolution_strategies/example.py
+ python/ray/rllib/train.py --env=Humanoid-v1 --alg=ES
At the heart of this example, we define a ``Worker`` class. These workers have
a method ``do_rollouts``, which will be used to perform simulate randomly
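
The ``Worker``/``do_rollouts`` structure mentioned above can be sketched independently of the rllib code. The following is a minimal toy version in which a dummy reward (the negative squared norm of the perturbed parameters) stands in for an actual gym rollout; the class and method names mirror the docs, but the body is illustrative only:

.. code-block:: python

   import numpy as np
   import ray

   ray.init()

   @ray.remote
   class Worker(object):
       """Toy ES worker: scores randomly perturbed copies of the parameters."""
       def __init__(self, noise_std=0.1):
           self.noise_std = noise_std

       def do_rollouts(self, params, num_rollouts):
           # The real example runs the perturbed policy in a gym environment;
           # here a dummy reward stands in for the episode return.
           noise = np.random.randn(num_rollouts, params.size) * self.noise_std
           rewards = -np.sum((params + noise) ** 2, axis=1)
           return noise, rewards

   params = np.ones(5)
   workers = [Worker.remote() for _ in range(4)]
   results = ray.get([w.do_rollouts.remote(params, 10) for w in workers])
   noise = np.concatenate([n for n, _ in results])
   rewards = np.concatenate([r for _, r in results])
   # Move the parameters toward perturbations that scored above average.
   params += 0.01 * np.dot(rewards - rewards.mean(), noise) / len(rewards)
   print(params)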
13 changes: 3 additions & 10 deletions doc/source/example-policy-gradient.rst
@@ -12,20 +12,13 @@ least version ``1.0.0``) and a few other dependencies.
pip install gym[atari]
pip install tensorflow
- Then install the package as follows.

- .. code-block:: bash
- cd ray/examples/policy_gradient/
- python setup.py install
- Then you can run the example as follows.

.. code-block:: bash
- python/ray/rllib/policy_gradient/example.py --environment=Pong-ram-v3
+ python/ray/rllib/train.py --env=Pong-ram-v4 --alg=PPO
- This will train an agent on the ``Pong-ram-v3`` Atari environment. You can also
+ This will train an agent on the ``Pong-ram-v4`` Atari environment. You can also
try passing in the ``Pong-v0`` environment or the ``CartPole-v0`` environment.
If you wish to use a different environment, you will need to change a few lines
in ``example.py``.
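
If you want to try one of the other environments mentioned above, a quick generic check (plain gym, nothing rllib-specific) that an environment id is valid on your machine is:

.. code-block:: python

   import gym

   # Any of the ids mentioned above should work, e.g. "Pong-ram-v4" or "CartPole-v0".
   env = gym.make("CartPole-v0")
   observation = env.reset()
   observation, reward, done, info = env.step(env.action_space.sample())
   print(observation, reward, done)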
@@ -41,4 +41,4 @@ Many of the TensorBoard metrics are also printed to the console, but you might
find it easier to visualize and compare between runs using the TensorBoard UI.

.. _`TensorFlow with GPU support`: https://www.tensorflow.org/install/
- .. _`code for this example`: https://github.com/ray-project/ray/tree/master/python/ray/rllib/policy_gradient
+ .. _`code for this example`: https://github.com/ray-project/ray/tree/master/python/ray/rllib/ppo
