Example: Simple RL example using DQN/Lightning #1232
* DQN RL Agent using Lightning
* Uses Iterable Dataset for Replay Buffer
* Buffer is populated by agent as training is carried out, updating the dataset
Hello @djbyrne! Thanks for updating this PR.
Comment last updated at 2020-03-28 09:34:00 UTC
LGTM 🚀
- add note to changelog
- add docstring with some description
pl_examples/domain_templates/dqn.py
class DQN(nn.Module):
    """ Simple MLP network"""

    def __init__(self, obs_size, n_actions, hidden_size=128):
pls add types
Done
simplify get_device method Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>
Re-ordered imports Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>
* CI: split tests-examples * tests without template * comment depends * CircleCI typo * add doctest * update test req. * CI tests * setup macOS * longer train * lover pred acc * fix model * rename default model * lower tests acc * typo * imports * fix test optimizer * update calls * fix Win * lower Drone image * fix call * pytorch image * fix test * add dev image * add dev image * update image * drone volume * lint * update test notes * rename tests/models >> tests/base * group models * conftest * optim imports * typos * fix import * fix tests * install AMP * tests * fix import
merged #990
@williamFalcon how do I go about adding the example to the Colab notebook?
We had a discussion with @ethanwharris and @MattPainter01 some time ago and agreed to keep it as a notebook in this repo, which is connected to Colab on request and also used as an example in the docs, right?
The CircleCI tests seem to be failing because the example uses typing.OrderedDict. Is there any reason why this should fail?
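One possible cause, offered as an assumption rather than a confirmed diagnosis of this CI run: `typing.OrderedDict` was only added in Python 3.7.2, so importing it on an older interpreter (for example a Python 3.6 job) raises an `ImportError`. A minimal workaround sketch is to annotate with `typing.Dict` and build the value with `collections.OrderedDict`. The helper name `build_layer_sizes` is hypothetical and not taken from the PR.

```python
from collections import OrderedDict
from typing import Dict


def build_layer_sizes(obs_size: int, hidden_size: int, n_actions: int) -> Dict[str, int]:
    """Hypothetical helper: return layer sizes in insertion order.

    Annotating with typing.Dict avoids importing typing.OrderedDict,
    which does not exist before Python 3.7.2. The returned object is
    still a collections.OrderedDict, which is a dict subclass.
    """
    sizes = OrderedDict()
    sizes["input"] = obs_size
    sizes["hidden"] = hidden_size
    sizes["output"] = n_actions
    return sizes


layer_sizes = build_layer_sizes(obs_size=4, hidden_size=128, n_actions=2)
```

The trade-off is a slightly less precise annotation in exchange for compatibility with older Python versions.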
Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>
@djbyrne could you rebase on master? It seems that you are missing the recent test/example split.
…ng-AI#1229) * Fix requirement-extra use released Trains package * Update README.md add Trains and links to the external Visualization section Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
…code (Lightning-AI#1240) Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
* system info * update big info * test script * update config * rename script * import path
…etween training / eval (Lightning-AI#1194)
Codecov Report
@@          Coverage Diff          @@
##           master   #1232   +/-  ##
=====================================
+ Coverage      91%     92%    +1%
=====================================
  Files          61      61
  Lines        3121    3153    +32
=====================================
+ Hits         2833    2886    +53
+ Misses        288     267    -21
I rebased on master, but the Ubuntu and macOS tests seem to fail when uploading the pytest results. Any ideas why?
@djbyrne just restarted the jobs. Working to merge this ASAP :)
@djbyrne next time please rebase; it now looks like you merged instead, since the PR shows 67 changed files...
* Example: Simple RL example using DQN/Lightning * DQN RL Agent using Lightning * Uses Iterable Dataset for Replay Buffer * Buffer is populated by agent as training is carried out, updating the dataset * Applied autopep8 fixes * * Updated line length from 120 to 110 * Update pl_examples/domain_templates/dqn.py simplify get_device method Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> * Update pl_examples/domain_templates/dqn.py Re-ordered imports Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> * CI: split tests-examples (Lightning-AI#990) * CI: split tests-examples * tests without template * comment depends * CircleCI typo * add doctest * update test req. * CI tests * setup macOS * longer train * lover pred acc * fix model * rename default model * lower tests acc * typo * imports * fix test optimizer * update calls * fix Win * lower Drone image * fix call * pytorch image * fix test * add dev image * add dev image * update image * drone volume * lint * update test notes * rename tests/models >> tests/base * group models * conftest * optim imports * typos * fix import * fix tests * install AMP * tests * fix import * Clean up * added module docstring * renamed variables to be more descriptive * Added missing docstrings and type annotations * Added gym to example requirements * Added note to changelog * updated example image * update types * rename script * Update CHANGELOG.md Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> * another rename * Disable validation when val_percent_check=0 (Lightning-AI#1251) * fix disable validation * add test * update changelog * update docs for val_percent_check * make "fast training" docs consistent * calling self.forward() -> self() (Lightning-AI#1211) * self.forward() -> self() * update changelog Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * Fix requirements-extra.txt Trains package to release version (Lightning-AI#1229) * Fix requirement-extra use released Trains 
package * Update README.md add Trains and links to the external Visualization section Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * Remove unnecessary parameters to super() in documentation and source code (Lightning-AI#1240) Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * update deprecation warning (Lightning-AI#1258) * update docs for progress bat values (Lightning-AI#1253) * lower timeouts for inactive issues (Lightning-AI#1250) * update contrib list (Lightning-AI#1241) Co-authored-by: William Falcon <waf2107@columbia.edu> * Fix outdated docs (Lightning-AI#1227) * Fix typo (Lightning-AI#1224) * drop unused Tox (Lightning-AI#1242) * system info (Lightning-AI#1234) * system info * update big info * test script * update config * rename script * import path * Changed smoothing in tqdm to decrease variability of time remaining between training / eval (Lightning-AI#1194) * Example: Simple RL example using DQN/Lightning * DQN RL Agent using Lightning * Uses Iterable Dataset for Replay Buffer * Buffer is populated by agent as training is carried out, updating the dataset * Applied autopep8 fixes * * Updated line length from 120 to 110 * Update pl_examples/domain_templates/dqn.py simplify get_device method Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> * Update pl_examples/domain_templates/dqn.py Re-ordered imports Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> * Clean up * added module docstring * renamed variables to be more descriptive * Added missing docstrings and type annotations * Added gym to example requirements * Added note to changelog * update types * rename script * Update CHANGELOG.md Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com> * another rename Co-authored-by: Donal Byrne <Donal.Byrne@xperi.com> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: William Falcon <waf2107@columbia.edu> Co-authored-by: Adrian Wälchli 
<adrian.waelchli@students.unibe.ch> Co-authored-by: Jeremy Jordan <13970565+jeremyjordan@users.noreply.github.com> Co-authored-by: Martin.B <51887611+bmartinn@users.noreply.github.com> Co-authored-by: Tyler Yep <tyep@stanford.edu> Co-authored-by: Shunta Komatsu <59395084+skmatz@users.noreply.github.com> Co-authored-by: Jack Pertschuk <jackpertschuk@gmail.com>
DQN RL Agent using Lightning. The model uses an IterableDataset to wrap the ReplayBuffer, which provides mini-batches of past experiences to train on in each training_step. During each training_step, the agent also takes one step in the environment and adds the resulting experience to the ReplayBuffer inside the dataset.
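The buffer-plus-dataset idea described above can be sketched without any dependencies. This is an illustrative sketch only, not the code from the PR: the names `Experience`, `ReplayBuffer`, and `ReplayDataset` and their signatures are assumptions, and in a real Lightning example the wrapper would presumably subclass `torch.utils.data.IterableDataset` instead of being a plain iterable class.

```python
import random
from collections import deque, namedtuple
from typing import Deque, Iterator, List

# Hypothetical record for one environment transition.
Experience = namedtuple(
    "Experience", ["state", "action", "reward", "done", "new_state"]
)


class ReplayBuffer:
    """Fixed-size buffer; the oldest experiences are dropped when full."""

    def __init__(self, capacity: int) -> None:
        self.buffer: Deque[Experience] = deque(maxlen=capacity)

    def __len__(self) -> int:
        return len(self.buffer)

    def append(self, experience: Experience) -> None:
        self.buffer.append(experience)

    def sample(self, batch_size: int) -> List[Experience]:
        # Sample without replacement, capped at the current buffer size.
        k = min(batch_size, len(self.buffer))
        return random.sample(list(self.buffer), k)


class ReplayDataset:
    """Iterable wrapper that re-samples the buffer on every pass.

    Assumption: with PyTorch installed, this class would subclass
    torch.utils.data.IterableDataset so a DataLoader can consume it.
    """

    def __init__(self, buffer: ReplayBuffer, sample_size: int) -> None:
        self.buffer = buffer
        self.sample_size = sample_size

    def __iter__(self) -> Iterator[Experience]:
        for experience in self.buffer.sample(self.sample_size):
            yield experience


# Toy usage: the "agent" appends transitions while training iterates.
buffer = ReplayBuffer(capacity=100)
for step in range(150):
    buffer.append(
        Experience(state=step, action=step % 2, reward=1.0,
                   done=False, new_state=step + 1)
    )

dataset = ReplayDataset(buffer, sample_size=8)
batch = list(dataset)
```

Because the dataset samples from the buffer each time it is iterated, experiences appended between passes become available to later batches without rebuilding the dataset.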
Before submitting
What does this PR do?
Fixes #713
Provides a basic domain example of using Lightning for Reinforcement Learning
PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues there's a high chance it will not be merged.
Did you have fun?
Make sure you had fun coding 🙃