From 90d8ed7e374f9be56c7193bde7e2bd29b9042e0b Mon Sep 17 00:00:00 2001
From: Victor Hugo Cadillo Gutierrez <45715531+vcadillog@users.noreply.github.com>
Date: Wed, 6 Nov 2019 09:33:05 -0500
Subject: [PATCH] Update README.md

---
 README.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index 4af2953..a0053a9 100644
--- a/README.md
+++ b/README.md
@@ -62,9 +62,9 @@ Testing in not observed enviroments:
 ### About the files of the repository:
 * The Main.py file contains the train and test functions for the model.
- 1. The train function saves the weights of the model every 1000 timesteps, also creates summary files to visualize the change of the average total reward, the average of the x position and the max value of x position.
+ 1. The train function saves the weights of the model every 1000 timesteps, also creates summary files to visualize the change of the average total reward, the average of the x position and the max value of x position. The load of weights is True by default.
- 2. The test function loads the weights of the model and test in the selected levels with deterministic actions, the train do stochastic actions to avoid reaching a local optimal; and creates in MP4 videos of how the agent did as many of defined numbers of test was selected.
+ 2. The test function loads the weights of the model and test in the selected levels with deterministic actions, the train do stochastic actions to encourage to the agent to explore and avoid getting stucked in a local optimal; and creates in MP4 videos of how the agent did as many of defined numbers of test was selected.
 * The Common_constants.py file contains all the parameters needed for tune the algorithm, it transfer the parameters across the other files, also calls the Enviroment.py file to create the enviroment.
@@ -122,4 +122,4 @@ Testing in not observed enviroments:
 https://github.com/Kautenja/gym-super-mario-bros
 ### What to do now?
-* Implement meta learning and train in multiple enviroments for a more generalized actor.
+* Implement joint PPO and train in multiple enviroments for a more generalized actor.