[rllib] Split docs into user and development guide #1377
Conversation
@@ -1,7 +1,7 @@
-Ray.tune: Efficient distributed hyperparameter search
-=====================================================
+Ray.tune: Hyperparameter Optimization Framework
I felt like there was an overload of adjectives on the docs page, so cleaned this up a bit.
Suggestions on description welcome...
I think it's good; it definitely declutters the sidebar, and we can work on the teasers and stuff once we have closer integration with the autoscaler and such.
Merged build finished. Test PASSed.
This looks decent from a first glance. Perhaps it makes sense to ask around for feedback from people who aren't too familiar with the codebase 🙂
Also looks reasonable to me.
This looks good! One thing we could provide on top of this, which might make it much easier for people to get started implementing algorithms, is a tutorial-style, very simple variant of one of the algorithms, like vanilla policy gradients, implemented in the framework.
@pcmoritz that would also be useful to have for tutorials or bootcamps.
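To make the "tutorial-style vanilla policy gradients" idea concrete, here is a minimal, framework-free sketch of REINFORCE on a two-armed bandit. This is illustrative pseudocode for the algorithm only; it is not RLlib code, and none of the names below come from the RLlib API.

```python
import math
import random

# Toy REINFORCE (vanilla policy gradient) on a two-armed bandit.
# The policy is a softmax over two logits; arm 1 pays more on average,
# so training should shift probability mass toward it.
random.seed(0)
theta = [0.0, 0.0]          # one logit per arm
arm_rewards = [0.2, 0.8]    # expected reward of each arm
lr = 0.1

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

for _ in range(2000):
    probs = softmax(theta)
    a = 0 if random.random() < probs[0] else 1   # sample action from the policy
    r = arm_rewards[a]                           # observe reward
    for i in range(2):
        # REINFORCE update: lr * reward * grad of log pi(a | theta)
        grad_log = (1.0 if i == a else 0.0) - probs[i]
        theta[i] += lr * r * grad_log

probs = softmax(theta)
print([round(p, 3) for p in probs])  # the policy should favor arm 1
```

A tutorial version inside RLlib would wrap the sampling step in an Evaluator and the update step in an optimizer, but the gradient math is the same.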
doc/source/rllib-dev.rst (outdated)

.. note::

    If you want to apply existing RLlib algorithms, first check out the `user docs <http://ray.readthedocs.io/en/latest/rllib.html>`__.
This guide will take you through steps for implementing a new algorithm in RLlib. To apply existing algorithms already implemented in RLlib, please see the user docs.
Done
doc/source/rllib-dev.rst (outdated)

1. Define an algorithm-specific `Evaluator class <#evaluators-and-optimizers>`__ (the core of the algorithm). Evaluators encapsulate framework-specific components such as the policy and loss functions. For an example, see the `A3C Evaluator implementation <https://github.com/ray-project/ray/blob/master/python/ray/rllib/a3c/a3c_evaluator.py>`__.

2. Pick an appropriate `RLlib optimizer class <#evaluators-and-optimizers>`__. Optimizers manage the parallel execution of the algorithm. RLlib provides several built-in optimizers for gradient-based algorithms.
Advanced algorithms may find it beneficial to implement their own optimizers.
Done
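The two-step pattern from the diff (an algorithm-specific Evaluator driven by a generic optimizer) can be sketched as below. This is a serial toy, not RLlib's actual API: the class names, method signatures, and the single-parameter "model" are all illustrative assumptions; a real RLlib optimizer would run these steps in parallel across remote evaluators.

```python
class Evaluator:
    """Encapsulates the framework-specific pieces of an algorithm:
    the policy, the loss, and experience collection (illustrative)."""

    def __init__(self):
        self.weights = [0.0]  # toy "model": a single parameter

    def sample(self):
        # Collect a batch of experience under the current policy.
        return [1.0, 2.0, 3.0]

    def compute_gradients(self, batch):
        # Gradient of the toy loss 0.5 * (w - mean(batch))**2 w.r.t. w.
        target = sum(batch) / len(batch)
        return [self.weights[0] - target]

    def apply_gradients(self, grads, lr=0.5):
        self.weights[0] -= lr * grads[0]


class SyncOptimizer:
    """Generic optimizer: orchestrates the sample / compute-gradients /
    apply-gradients cycle, independent of any one algorithm."""

    def __init__(self, evaluators):
        self.evaluators = evaluators

    def step(self):
        for ev in self.evaluators:
            batch = ev.sample()
            grads = ev.compute_gradients(batch)
            ev.apply_gradients(grads)


ev = Evaluator()
opt = SyncOptimizer([ev])
for _ in range(20):
    opt.step()
print(round(ev.weights[0], 3))  # converges toward the batch mean, 2.0
```

The point of the split is that only the Evaluator changes per algorithm; the same optimizer can drive A3C, PPO, or anything else that exposes this interface.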
Merged build finished. Test FAILed.
- Support for multiple frameworks (TensorFlow, PyTorch)
- Scalable primitives for developing new algorithms
- Shared models between algorithms

-You can find the code for RLlib `here on GitHub <https://github.com/ray-project/ray/tree/master/python/ray/rllib>`__, and the NIPS symposium paper `here <https://drive.google.com/open?id=1lDMOFLMUQXn8qGtuahOBUwjmFb2iASxu>`__.
+You can find the code for RLlib `here on GitHub <https://github.com/ray-project/ray/tree/master/python/ray/rllib>`__, and the NIPS symposium paper `here <https://arxiv.org/abs/1712.09381>`__.
NIPS 2017 Deep Reinforcement Learning Symposium would be a better reference.
What do these changes do?
Splits out the internal dev docs for RLlib, and adds a high-level overview of how to define new algorithms and models.
@richardliaw @pcmoritz any suggestions? I didn't add too much of the specifics on sampling / other utils classes since they still seem to be in flux.