
[rllib] Split docs into user and development guide #1377

Merged: ericl merged 12 commits into ray-project:master from alg-recipe on Jan 1, 2018

Conversation

@ericl (Contributor) commented Dec 30, 2017

What do these changes do?

Splits out the internal dev docs for RLlib, and adds a high-level overview of how to define new algorithms and models.

@richardliaw @pcmoritz any suggestions? I didn't add too many specifics on sampling / other utils classes since they still seem to be in flux.

@@ -1,7 +1,7 @@
-Ray.tune: Efficient distributed hyperparameter search
-=====================================================
+Ray.tune: Hyperparameter Optimization Framework
+===============================================
@ericl (Contributor, Author) commented:
I felt like there was an overload of adjectives on the docs page, so I cleaned this up a bit.

@ericl (Contributor, Author) commented:

Suggestions on description welcome...

Reviewer (Contributor) commented:

I think it's good; it definitely declutters the sidebar, and we can work on the teasers and stuff once we have closer integration with the autoscaler and such.

@AmplabJenkins: Merged build finished. Test PASSed. Build results (CI access required):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/3013/

@richardliaw (Contributor) commented:

This looks decent at first glance. Perhaps it makes sense to ask around for feedback from people who aren't too familiar with the codebase 🙂

@robertnishihara (Collaborator) commented:

Also looks reasonable to me.

@pcmoritz (Contributor) commented Dec 30, 2017

This looks good! On top of this, the thing that might make it much easier for people to get started implementing algorithms is a tutorial-style, very simple variant of one of the algorithms, like vanilla policy gradients, implemented in the framework.

@robertnishihara (Collaborator) commented:

@pcmoritz that would also be useful to have for tutorials or bootcamps.

@ericl (Contributor, Author) commented Dec 30, 2017

@pcmoritz I agree! I filed #1378, but we could also do it for whatever comes up next.
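
For context on the tutorial idea above: a tutorial-style vanilla policy gradient really can be very small. The sketch below is purely illustrative (plain numpy, a linear softmax policy, made-up shapes); it is not taken from RLlib or from #1378:

```python
# Hedged sketch of a REINFORCE / vanilla policy gradient update.
# Everything here (names, shapes, the linear policy) is illustrative,
# not RLlib's implementation.
import numpy as np

def softmax(logits):
    z = np.exp(logits - logits.max())
    return z / z.sum()

def vpg_update(theta, obs, actions, rewards, lr=0.01, gamma=0.99):
    """One policy-gradient step: ascend grad log pi(a|s), weighting
    each timestep by its discounted return-to-go."""
    returns = np.zeros(len(rewards))
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    for o, a, ret in zip(obs, actions, returns):
        probs = softmax(theta.T @ o)   # pi(.|s) under a linear policy
        grad_logits = -probs
        grad_logits[a] += 1.0          # d log softmax / d logits at action a
        theta = theta + lr * ret * np.outer(o, grad_logits)
    return theta

# Toy usage: one fake trajectory with 4-dim observations, 2 actions.
theta = np.zeros((4, 2))
obs = [np.random.randn(4) for _ in range(20)]
actions = [np.random.randint(2) for _ in range(20)]
rewards = [1.0] * 20
theta = vpg_update(theta, obs, actions, rewards)
```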

@AmplabJenkins: Merged build finished. Test PASSed. Build results (CI access required):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/3019/

@AmplabJenkins: Merged build finished. Test PASSed. Build results (CI access required):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/3030/


.. note::

    If you want to apply existing RLlib algorithms, first check out the `user docs <http://ray.readthedocs.io/en/latest/rllib.html>`__.
Reviewer (Contributor) commented:

This guide will take you through steps for implementing a new algorithm in RLlib. To apply existing algorithms already implemented in RLlib, please see the user docs.

@ericl (Contributor, Author) replied:

Done

1. Define an algorithm-specific `Evaluator class <#evaluators-and-optimizers>`__ (the core of the algorithm). Evaluators encapsulate framework-specific components such as the policy and loss functions. For an example, see the `A3C Evaluator implementation <https://github.com/ray-project/ray/blob/master/python/ray/rllib/a3c/a3c_evaluator.py>`__.


2. Pick an appropriate `RLlib optimizer class <#evaluators-and-optimizers>`__. Optimizers manage the parallel execution of the algorithm. RLlib provides several built-in optimizers for gradient-based algorithms.
Reviewer (Contributor) commented:

Advanced algorithms may find it beneficial to implement their own optimizers.

@ericl (Contributor, Author) replied:

Done
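
To make steps 1 and 2 above concrete, here is a minimal, self-contained sketch of the evaluator/optimizer split. The method names (`sample`, `compute_gradients`, `apply_gradients`, `get_weights`, `set_weights`) follow the pattern the guide describes, but the code is a toy stand-in, not RLlib's actual interface:

```python
# Illustrative sketch of the evaluator/optimizer pattern; not RLlib code.
import numpy as np

class MyEvaluator:
    """Step 1: encapsulate the framework-specific pieces (policy, loss,
    sampling). A trivial linear 'policy' stands in for a real model."""

    def __init__(self, obs_dim=4, num_actions=2):
        self.weights = np.zeros((obs_dim, num_actions))

    def sample(self):
        # A real evaluator would roll out episodes in an environment here.
        return np.random.randn(32, self.weights.shape[0])

    def compute_gradients(self, batch):
        # A real evaluator would differentiate the algorithm's loss.
        return np.random.randn(*self.weights.shape)

    def apply_gradients(self, grads, lr=0.01):
        self.weights -= lr * grads

    def get_weights(self):
        return self.weights

    def set_weights(self, weights):
        self.weights = weights

def sync_optimizer_step(local, remotes):
    """Step 2: the optimizer manages parallel execution. This toy version
    is synchronous and single-process; RLlib's optimizers drive evaluators
    running as remote Ray actors."""
    grads = [ev.compute_gradients(ev.sample()) for ev in remotes]
    local.apply_gradients(np.mean(grads, axis=0))
    for ev in remotes:
        ev.set_weights(local.get_weights())

local = MyEvaluator()
remotes = [MyEvaluator() for _ in range(4)]
for _ in range(10):
    sync_optimizer_step(local, remotes)
```

The point of the split is that the evaluator owns everything framework-specific, so the same optimizer can be reused across algorithms.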

@AmplabJenkins: Merged build finished. Test FAILed. Build results (CI access required):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/3052/

@AmplabJenkins: Merged build finished. Test FAILed. Build results (CI access required):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/3053/

@AmplabJenkins: Merged build finished. Test FAILed. Build results (CI access required):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/3054/

@AmplabJenkins: Merged build finished. Test FAILed. Build results (CI access required):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/3055/

- Support for multiple frameworks (TensorFlow, PyTorch)
- Scalable primitives for developing new algorithms
- Shared models between algorithms

-You can find the code for RLlib `here on GitHub <https://github.com/ray-project/ray/tree/master/python/ray/rllib>`__, and the NIPS symposium paper `here <https://drive.google.com/open?id=1lDMOFLMUQXn8qGtuahOBUwjmFb2iASxu>`__.
+You can find the code for RLlib `here on GitHub <https://github.com/ray-project/ray/tree/master/python/ray/rllib>`__, and the NIPS symposium paper `here <https://arxiv.org/abs/1712.09381>`__.
Reviewer (Contributor) commented:

NIPS 2017 Deep Reinforcement Learning Symposium would be a better reference
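
As an aside on the "Shared models between algorithms" bullet above, one way to read it is a registry that any algorithm can pull networks from. This is a conceptual sketch with invented names; it does not reflect RLlib's actual model catalog:

```python
# Conceptual sketch of a shared model registry; all names are invented
# for illustration and do not reflect RLlib's API.
MODEL_REGISTRY = {}

def register_model(name):
    # Decorator that records a model constructor under a lookup name.
    def deco(ctor):
        MODEL_REGISTRY[name] = ctor
        return ctor
    return deco

@register_model("linear")
class LinearModel:
    def __init__(self, obs_dim, num_outputs):
        self.obs_dim = obs_dim
        self.num_outputs = num_outputs

def get_model(name, obs_dim, num_outputs):
    # A DQN-style and a policy-gradient-style algorithm could both call
    # this with the same name and share one architecture definition.
    return MODEL_REGISTRY[name](obs_dim, num_outputs)

model = get_model("linear", obs_dim=4, num_outputs=2)
```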

@ericl merged commit 6e6674a into ray-project:master on Jan 1, 2018
@robertnishihara deleted the alg-recipe branch on January 1, 2018 at 19:38