From 4d776f8efba2d12d7e4c1a3752d469d8eda06b52 Mon Sep 17 00:00:00 2001
From: Victor Hugo Cadillo Gutierrez <45715531+vcadillog@users.noreply.github.com>
Date: Wed, 6 Nov 2019 01:56:51 -0500
Subject: [PATCH] Update README.md

---
 README.md | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index c7ecc44..49b1517 100644
--- a/README.md
+++ b/README.md
@@ -58,6 +58,7 @@ Testing in not observed enviroments:
 ![alt text](https://github.com/vcadillog/PPO-Mario-Bros-Tensorflow-2/blob/master/images/test_2.gif)
 ![alt text](https://github.com/vcadillog/PPO-Mario-Bros-Tensorflow-2/blob/master/images/test_3.gif)
 ![alt text](https://github.com/vcadillog/PPO-Mario-Bros-Tensorflow-2/blob/master/images/test_4.gif)
 ### This code was inspired from:
+
 * [1] Proximal Policy Optimization Algorithms.
 https://arxiv.org/pdf/1707.06347.pdf
@@ -66,7 +67,7 @@ Testing in not observed enviroments:
 https://arxiv.org/pdf/1804.03720.pdf
 
 
-* [3] The implementation in tensorflow 1 of "coreystaten".
+* [3] The Pong (Atari) implementation in TensorFlow 1 by "coreystaten".
 
 
 https://github.com/coreystaten/deeprl-ppo
@@ -74,7 +75,7 @@ Testing in not observed enviroments:
 https://github.com/jakegrigsby/supersonic/tree/master/supersonic
 
 
-* [5] OpenAI Baselines of Atari and Retro wrappers.
+* [5] OpenAI Baselines Atari and Retro wrappers, used for preprocessing.
 
 
 https://github.com/openai/baselines/tree/master/baselines
@@ -82,5 +83,5 @@ Testing in not observed enviroments:
 https://github.com/Kautenja/gym-super-mario-bros
 
 
-### To do:
+### What to do now?
 * Implement meta learning and train in multiple enviroments for a more generalized actor.