feature(ekiefl): add pooltool env and related configs (#227)
* Add SumToThree pooltool env
* Woops
* Update datatypes and add single inference mode
* Move core into pooltool
* Add some speed and memory profiling for env debug
* Trying to get CNNs working
* Patch #172
* Setup first experiment
* Fix up sumtothreeimage
* Update obs space to be float
* Move image_representation into fork
  - It was in pooltool ai-framework branch
  - By moving it here, main branch of pooltool can be used
* Start a README
* Begin test suite for sum_to_three_env
* Add tests for datatypes
* Finish test suite for sum_to_three_env
* rename tests -> characterize
* Delete
* Increase to 300,000 replay buffer
* Finish README
* Fix image link
* Link the discussion page
* Update pooltool API calls to 0.3.0
* Switch to dataclasses
  - attrs is not standard library, best not to impose my standards
  - Also had some docs
* Progress on documentation and variable naming
* Finish docs for datatypes.py
* Data structure changes
  - Additionally, move reward function into reward module and add options to select different rewards via cfg
* Parameterize action space bounds
  - Remove clunky class methods
* Add a module docstring
* Finish docstrings for sum_to_three coordinate environment
* rm pooltool __init__.py
  - LSP was getting confused with the `import pooltool` statement
* Add pytest
* Add pooltool-billiards
* Add docs for reward space
* Add tests for grayscale conversion, add docs
* Add module doc for reward.py
* Add docs for image_representation
* Fix image env
* Update info about px parameter
* Add serialize/deserialize methods for RenderConfig
* Three things:
  - move px to RenderConfig
  - serialize/deserialization methods for RenderConfig
  - Mimic the refactor in cts env to the image env
* Use channels in renderconfig
* Buff image_representation visualization
  - Add an animation
* Start consolidation
* More consolidation between observation types
* consolidate image and coordinate observation types
* Remove old file
* Add default config
* Single source state setting
* Add tests
* Unused
* Add default render config option
  - Store as attribute
* Add speed test script
* Small changes
* Add sum to three to feature table
* Update pooltool README
* Move observation/ and reward.py into utils.py
* polish(pu): polish sum_to_three configs
* feature(pu): add sum_to_three_vector_obs_sac_config.py and polish related config names
* polish(pu): polish sum_to_three configs
* polish(pu): polish pooltool configs

---------

Co-authored-by: dyyoungg <yangdeyu@sensetime.com>
Co-authored-by: 蒲源 <2402552459@qq.com>
Co-authored-by: 蒲源 <48008469+puyuan1996@users.noreply.github.com>
1 parent 540bdcb, commit 39dfa3c. Showing 28 changed files with 2,375 additions and 4 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -6,4 +6,6 @@ bsuite | |
minigrid | ||
moviepy | ||
pycolab | ||
line_profiler | ||
pytest | ||
pooltool-billiards>=0.3.1 | ||
line_profiler |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,123 @@ | ||
# Billiards RL | ||
|
||
Welcome to the documentation for billiards simulation within the LightZero framework. Billiards offers an intriguing learning environment for reinforcement learning due to its continuous action space, turn-based play, and the need for long-term planning and strategy formulation. | ||
|
||
## Pooltool | ||
|
||
Pooltool is a general purpose billiards simulator crafted specifically for science and engineering applications (learn more [here](https://github.com/ekiefl/pooltool)). It has been incorporated into LightZero to create diverse learning environments for billiards games. | ||
|
||
## Testing your installation | ||
|
||
Pooltool comes pre-installed with LightZero. If you are using a custom setup, follow the _pip_ install instructions [here](https://pooltool.readthedocs.io/en/latest/getting_started/install.html#install-option-1-pip). | ||
|
||
Verify that pooltool can be imported from your Python environment: | ||
|
||
```bash | ||
python -c "import pooltool; print(pooltool.__version__)" | ||
``` | ||
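The same check can be done from Python, including a comparison against the `>=0.3.1` lower bound listed for `pooltool-billiards` in the requirements change. This is a minimal sketch: the helper names (`parse_version`, `meets_minimum`, `check_pooltool`) are illustrative and not part of pooltool or LightZero, and the version parser only handles plain dotted numeric versions.

```python
from importlib import metadata

MIN_VERSION = "0.3.1"  # lower bound taken from the requirements entry


def parse_version(text):
    """Turn '0.3.1' into (0, 3, 1); anything after the leading digits of a piece is ignored."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum=MIN_VERSION):
    """Return True when the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


def check_pooltool():
    """Return None if the distribution is not installed, else whether it is recent enough."""
    try:
        installed = metadata.version("pooltool-billiards")
    except metadata.PackageNotFoundError:
        return None
    return meets_minimum(installed)


if __name__ == "__main__":
    print(check_pooltool())
```

A result of `None` means the `pooltool-billiards` distribution was not found in the current environment.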
|
||
Further test your installation by opening the interactive interface: | ||
|
||
```bash | ||
# Unix | ||
run_pooltool | ||
|
||
# Windows | ||
run_pooltool.bat | ||
``` | ||
|
||
(For instructions on how to play, check out the [Getting Started tutorial](https://pooltool.readthedocs.io/en/latest/getting_started/interface.html)) | ||
|
||
## Supported Games | ||
|
||
The following games are currently supported or planned: | ||
|
||
1. **Sum to Three**: A simplified billiards game designed to make learning easier for agents. | ||
2. **Standard Billiards Games** (planned for future updates): Including 8-ball, 9-ball, and snooker. | ||
|
||
The rest of the document provides details for each supported game. | ||
|
||
## Game 1: Sum to Three | ||
|
||
Standard billiards games like 8-ball, 9-ball, and snooker have complex rulesets which make learning more difficult. | ||
|
||
In contrast, _sum to three_ is a fictitious billiards game with a simple ruleset. | ||
|
||
### Rules | ||
|
||
1. The game is played on a table with no pockets | ||
1. There are 2 balls: a cue ball and an object ball | ||
1. The player must hit the object ball with the cue ball | ||
1. The player scores a point if the total number of ball-cushion collisions in the shot is exactly 3 | ||
1. The player takes 10 shots, and their final score is the number of points they achieve | ||
|
||
For example, this is a successful shot because there are three ball-cushion collisions: | ||
|
||
<img src="../../assets/pooltool/3hits.gif" width="600" /> | ||
|
||
This is an unsuccessful shot because there are four ball-cushion collisions: | ||
|
||
<img src="../../assets/pooltool/4hits.gif" width="600" /> | ||
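The rules and the two example shots above can be sketched as a small scoring routine. This is an illustrative sketch, not the environment's actual reward code: the function names are hypothetical, and treating a shot that misses the object ball as scoring zero is an assumed reading of rule 3.

```python
SHOTS_PER_GAME = 10        # rule 5: the player takes 10 shots
TARGET_CUSHION_HITS = 3    # rule 4: exactly three ball-cushion collisions


def shot_scores(cushion_hits, hit_object_ball=True):
    """Return True if a single shot earns a point.

    Assumption: a shot that fails to contact the object ball never scores.
    """
    return bool(hit_object_ball) and cushion_hits == TARGET_CUSHION_HITS


def game_score(shots):
    """Sum the points over one game.

    `shots` is a list of (cushion_hits, hit_object_ball) pairs, one per shot.
    """
    if len(shots) != SHOTS_PER_GAME:
        raise ValueError(f"expected {SHOTS_PER_GAME} shots, got {len(shots)}")
    return sum(1 for hits, contact in shots if shot_scores(hits, contact))


# One game: a 3-hit success, a 4-hit failure, a miss, then seven successes.
example_game = [(3, True), (4, True), (3, False)] + [(3, True)] * 7
print(game_score(example_game))  # prints 8
```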
|
||
### Observation / Action Spaces | ||
|
||
Continuous and discrete observation spaces are supported. The continuous observation space uses the coordinates of the two balls as the observation. The discrete observation space is based on configurable image-based feature planes. | ||
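To make the distinction concrete, here is a rough sketch of the two kinds of observation built from the same ball positions. Everything in it is an assumption for illustration: the function names, the flat `[cue_x, cue_y, object_x, object_y]` layout, the table size, and the single binary plane are not taken from the environment's code, whose feature planes are configurable.

```python
def vector_observation(cue_xy, object_xy):
    """Continuous observation: the two ball positions flattened into one vector."""
    return [cue_xy[0], cue_xy[1], object_xy[0], object_xy[1]]


def image_observation(cue_xy, object_xy, table_size, grid=(8, 16)):
    """Discrete observation: one binary plane marking the cell under each ball centre.

    `table_size` is (width, length) and `grid` is (columns, rows); both are
    hypothetical parameters chosen for this sketch.
    """
    width, length = table_size
    cols, rows = grid
    plane = [[0] * cols for _ in range(rows)]
    for x, y in (cue_xy, object_xy):
        # Clamp so a ball exactly on the far edge still lands in the last cell.
        col = min(int(x / width * cols), cols - 1)
        row = min(int(y / length * rows), rows - 1)
        plane[row][col] = 1
    return plane


cue, obj = (0.5, 0.5), (0.5, 1.5)
print(vector_observation(cue, obj))
for row in image_observation(cue, obj, table_size=(1.0, 2.0)):
    print("".join(str(cell) for cell in row))
```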
|
||
In general, when an agent strikes a cue ball, the cue stick is described by 5 continuous parameters: | ||
|
||
``` | ||
V0 : positive float | ||
With what initial velocity does the cue strike the ball? | ||
phi : float (degrees) | ||
The direction you strike the ball | ||
theta : float (degrees) | ||
How elevated is the cue from the playing surface, in degrees? | ||
a : float | ||
How much side english should be put on? -1 being rightmost side of ball, +1 being | ||
leftmost side of ball | ||
b : float | ||
How much vertical english should be put on? -1 being bottom-most side of ball, +1 being | ||
topmost side of ball | ||
``` | ||
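As a rough illustration, the five parameters can be grouped into a small validated container. The class below is a hypothetical sketch and not pooltool's own cue object; the default values for `theta`, `a`, and `b` (a level cue striking the centre of the ball) are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CueStrike:
    """Hypothetical container for the five cue parameters described above."""

    V0: float           # strike speed, must be positive
    phi: float          # strike direction, in degrees
    theta: float = 0.0  # cue elevation above the playing surface, in degrees (assumed default)
    a: float = 0.0      # side english in [-1, 1] (assumed default: centre)
    b: float = 0.0      # vertical english in [-1, 1] (assumed default: centre)

    def __post_init__(self):
        if self.V0 <= 0:
            raise ValueError("V0 must be a positive float")
        for name in ("a", "b"):
            value = getattr(self, name)
            if not -1.0 <= value <= 1.0:
                raise ValueError(f"{name} must lie in [-1, 1], got {value}")


strike = CueStrike(V0=1.5, phi=90.0)
print(strike)
```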
|
||
Since sum to three is a simple game, only a reduced action space with 2 parameters is supported: | ||
|
||
1. V0: The speed of the cue stick. Increasing this means the cue ball travels further | ||
1. cut angle: The angle at which the cue ball hits the object ball | ||
|
||
For example, in this shot, the cut angle is -70 (hitting the left side of the object ball): | ||
|
||
<img src="../../assets/pooltool/largecut.gif" width="600" /> | ||
|
||
By contrast, in this shot the cut angle is 0 (head-on collision): | ||
|
||
<img src="../../assets/pooltool/nocut.gif" width="600" /> | ||
|
||
The action parameters are bounded to ranges suited to the table dimensions: [0.3, 3] for speed (V0) and [-70, 70] for cut angle. | ||
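One common way to use such bounds is to let the policy emit values in [-1, 1] and rescale them linearly to the physical ranges. The sketch below assumes that convention; the function names are hypothetical, and the environment may apply the bounds differently.

```python
V0_BOUNDS = (0.3, 3.0)            # speed range stated above
CUT_ANGLE_BOUNDS = (-70.0, 70.0)  # cut angle range stated above


def clip(value, low, high):
    """Clamp value into [low, high]."""
    return max(low, min(high, value))


def rescale(unit_value, low, high):
    """Map a value in [-1, 1] linearly onto [low, high], clipping out-of-range inputs."""
    unit_value = clip(unit_value, -1.0, 1.0)
    return low + (unit_value + 1.0) * 0.5 * (high - low)


def to_action(unit_v0, unit_cut):
    """Convert a normalized (speed, cut angle) pair to bounded physical values."""
    return rescale(unit_v0, *V0_BOUNDS), rescale(unit_cut, *CUT_ANGLE_BOUNDS)


print(to_action(0.0, 0.0))   # midpoint of both ranges
print(to_action(5.0, -5.0))  # out-of-range inputs are clipped to the bounds
```

Clipping before rescaling keeps every emitted action inside the stated bounds even if the policy output drifts outside [-1, 1].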
|
||
### Experiments | ||
|
||
You can conduct experiments using different observation spaces: | ||
|
||
1. **Continuous Observation Space Experiment**: | ||
- Run the experiment with: | ||
```bash | ||
python ./zoo/pooltool/sum_to_three/config/sum_to_three_config.py | ||
``` | ||
- Results will be saved in `./data_pooltool_sampled_efficientzero/vector-obs`. | ||
|
||
2. **Discrete Observation Space Experiment**: | ||
- Run the experiment with: | ||
```bash | ||
python ./zoo/pooltool/sum_to_three/config/sum_to_three_image_config.py | ||
``` | ||
- Modify the feature plane information by editing `./zoo/pooltool/sum_to_three/config/feature_plane_config.json`. View the usage example in `./zoo/pooltool/image_representation.py` for details about the feature plane content. | ||
- Results will be saved in `./data_pooltool_sampled_efficientzero/image-obs`. | ||
|
||
### Results | ||
|
||
TODO(puyuan1996) | ||
|
||
## Game 2: 8-ball / 9-ball / 3-cushion / snooker | ||
|
||
What billiards game would you like to see next? |