Clarify CleanRL is a non-modular library #200

Merged (4 commits, Jun 18, 2022)
Changes from 1 commit
Clarify CleanRL is a non-modular library
vwxyzjn committed Jun 10, 2022
commit b0d00df8b926617651638923e622b51c0b477305
5 changes: 4 additions & 1 deletion README.md
@@ -16,7 +16,8 @@ CleanRL is a Deep Reinforcement Learning library that provides high-quality sing


* 📜 Single-file implementation
* *Every detail about an algorithm is put into the algorithm's own file.* It is therefore easier to fully understand an algorithm and do research with.
* *Every detail about an algorithm variant is put into a single standalone file.*
* For example, our `ppo_atari.py` only has 340 lines of code but contains all implementation details on how PPO works with Atari games, so it is a great reference implementation to read for folks who do not wish to read an entire modular library.
* 📊 Benchmarked Implementation (7+ algorithms and 34+ games at https://benchmark.cleanrl.dev)
* 📈 Tensorboard Logging
* 🪛 Local Reproducibility via Seeding
@@ -28,6 +29,8 @@ You can read more about CleanRL in our [technical paper](https://arxiv.org/abs/2

Good luck have fun :rocket:

⚠️ **NOTE**: CleanRL is *not* a modular library and therefore it is not meant to be imported. At the cost of duplicate code, we make all implementation details of a DRL algorithm variant easy to understand, so CleanRL comes with its own pros and cons. You should consider using CleanRL if you want to 1) understand all implementation details of an algorithm's varaint or 2) do quick prototypes.

## Get started

Prerequisites:
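The README bullets above mention local reproducibility via seeding and TensorBoard logging. As a rough, hypothetical sketch of what that plumbing can look like when it lives directly inside a single training file (this is not taken from CleanRL's source; the names and the logged value are made up for illustration):

```python
# Hypothetical sketch, not CleanRL's actual code: roughly what seeding and
# TensorBoard logging look like when kept in-line in one training script.
import random

import numpy as np
import torch
from torch.utils.tensorboard import SummaryWriter

seed = 1
random.seed(seed)                           # Python's RNG
np.random.seed(seed)                        # NumPy's RNG
torch.manual_seed(seed)                     # PyTorch's RNG
torch.backends.cudnn.deterministic = True   # trade some speed for reproducibility

writer = SummaryWriter(f"runs/example__seed{seed}")
for global_step in range(10):
    episodic_return = float(global_step)    # placeholder metric for the sketch
    writer.add_scalar("charts/episodic_return", episodic_return, global_step)
writer.close()
```

Because everything sits in one file, changing any of these details is a direct edit rather than a subclass or plugin.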
22 changes: 12 additions & 10 deletions docs/index.md
@@ -13,19 +13,21 @@

CleanRL is a Deep Reinforcement Learning library that provides high-quality single-file implementation with research-friendly features. The implementation is clean and simple, yet we can scale it to run thousands of experiments using AWS Batch. The highlight features of CleanRL are:


* Single-file Implementation
* **Every detail about an algorithm is put into the algorithm's own file.** Therefore, it's easier for you to fully understand an algorithm and do research with it.
* Benchmarked Implementation on 7+ algorithms and 34+ games
* Tensorboard Logging
* Local Reproducibility via Seeding
* Videos of Gameplay Capturing
* Experiment Management with [Weights and Biases](https://wandb.ai/site)
* Cloud Integration with Docker and AWS
* 📜 Single-file implementation
* *Every detail about an algorithm variant is put into a single standalone file.*
* For example, our `ppo_atari.py` only has 340 lines of code but contains all implementation details on how PPO works with Atari games, so it is a great reference implementation to read for folks who do not wish to read an entire modular library.
* 📊 Benchmarked Implementation (7+ algorithms and 34+ games at https://benchmark.cleanrl.dev)
* 📈 Tensorboard Logging
* 🪛 Local Reproducibility via Seeding
* 🎮 Videos of Gameplay Capturing
* 🧫 Experiment Management with [Weights and Biases](https://wandb.ai/site)
* 💸 Cloud Integration with docker and AWS

You can read more about CleanRL in our [technical paper](https://arxiv.org/abs/2111.08819) and [documentation](https://docs.cleanrl.dev/).

Good luck have fun 🚀
Good luck have fun :rocket:

⚠️ **NOTE**: CleanRL is *not* a modular library and therefore it is not meant to be imported. At the cost of duplicate code, we make all implementation details of a DRL algorithm variant easy to understand, so CleanRL comes with its own pros and cons. You should consider using CleanRL if you want to 1) understand all implementation details of an algorithm's varaint or 2) do quick prototypes.
Contributor

varaint -> variant

I'm not sure that "do quick prototypes" makes sense here. Running `from cleanrl import PPO` would be quick. Reading the algorithm and copy-pasting it into my code is slow.

Owner Author

I think "doing prototypes" is not really a well-defined notion, as there are many types of prototypes. Being able to prototype quickly largely depends on the use case.

While something like `from stable_baselines3 import PPO` is quick, prototyping advanced features that SB3 does not support can be more difficult, as discussed in #197 with the invalid action masking example.

Maybe I can clarify it as "do prototypes that can't be achieved by just combining components in modular DRL libraries"? I am really unsure what the phrasing should be.
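To make the contrast in this thread concrete, here is a minimal sketch of the two workflows (my own illustration, not part of the PR; the `stable_baselines3` lines use its documented `PPO` API, while the single-file side is left as a comment because there is nothing to import):

```python
# Illustrative sketch of the two workflows (not part of this PR).

# Modular library: quick to get running, but new behaviour has to fit the
# library's extension points (callbacks, wrappers, subclassing).
from stable_baselines3 import PPO

model = PPO("MlpPolicy", "CartPole-v1", verbose=0)
model.learn(total_timesteps=10_000)

# Single-file library: there is nothing to import. You copy e.g. ppo_atari.py
# into your project and edit the training loop directly, for instance to apply
# an invalid-action mask to the logits before the action is sampled.
```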

Contributor

"if you want to prototype advanced features that SB3 does not support" Keep in mind that 95% of people just want something that works, not advanced features. But in any case, this PR is good and I think you should make any changes that result in it being merged.


## Citing CleanRL
