From 302191f248aeee886b7f2e1b3848c3c5315e2719 Mon Sep 17 00:00:00 2001
From: Victor Hugo Cadillo Gutierrez <45715531+vcadillog@users.noreply.github.com>
Date: Wed, 6 Nov 2019 01:48:40 -0500
Subject: [PATCH] Update README.md

---
 README.md | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/README.md b/README.md
index 9dd1bda..c7ecc44 100644
--- a/README.md
+++ b/README.md
@@ -42,10 +42,17 @@ The second level of the second world
 ```
 To change the enviroments, modify the Enviroments.py file.
 
+
 Eight actors were trained in the first level of Mario, and this is how it learned to finish it.
 
 ![alt text](https://github.com/vcadillog/PPO-Mario-Bros-Tensorflow-2/blob/master/images/mario.gif)
 
+The plots below show how the average reward evolved over the time steps. The model was trained in four sessions because of connection issues. The reward is not the raw output of Kautenja's implementation; it was scaled beforehand for this model. All the data preprocessing is in the Datapreprocessing.py file.
+
+![alt text](https://github.com/vcadillog/PPO-Mario-Bros-Tensorflow-2/blob/master/images/log1.PNG)![alt text](https://github.com/vcadillog/PPO-Mario-Bros-Tensorflow-2/blob/master/images/log2.PNG)![alt text](https://github.com/vcadillog/PPO-Mario-Bros-Tensorflow-2/blob/master/images/log3.PNG)![alt text](https://github.com/vcadillog/PPO-Mario-Bros-Tensorflow-2/blob/master/images/log4.PNG)
+
+In the logs directory you can find two more plots, for the average X_position and Max_X_position.
+
 Testing in not observed enviroments:
 
 ![alt text](https://github.com/vcadillog/PPO-Mario-Bros-Tensorflow-2/blob/master/images/test_2.gif) ![alt text](https://github.com/vcadillog/PPO-Mario-Bros-Tensorflow-2/blob/master/images/test_3.gif) ![alt text](https://github.com/vcadillog/PPO-Mario-Bros-Tensorflow-2/blob/master/images/test_4.gif)