A simple PPO implementation that fixes a few errors in the original. Now runs on PyTorch 1.5.0. Credit: code modified from sweetice's original version.
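The core of PPO is the clipped surrogate objective. As a rough orientation for readers of the code, here is a minimal plain-Python sketch of that objective for a single sample (the function name and signature are illustrative, not taken from this repository; the actual code applies the same formula to PyTorch tensors over a batch):

```python
import math

def ppo_clip_loss(log_prob_new, log_prob_old, advantage, clip_eps=0.2):
    """Clipped surrogate loss for one sample, written as a loss to minimize.

    ratio = pi_new(a|s) / pi_old(a|s), computed from log-probabilities.
    The objective takes the minimum of the unclipped and clipped
    surrogates, which bounds how far a single update can move the policy.
    """
    ratio = math.exp(log_prob_new - log_prob_old)
    unclipped = ratio * advantage
    # Clamp the ratio to [1 - clip_eps, 1 + clip_eps] before scaling
    clipped = max(min(ratio, 1.0 + clip_eps), 1.0 - clip_eps) * advantage
    # Pessimistic (minimum) surrogate, negated so gradient descent maximizes it
    return -min(unclipped, clipped)
```

For example, when the new and old log-probabilities are equal the ratio is 1 and the loss reduces to minus the advantage; when the ratio drifts outside the clip range, the clipped term caps the incentive to move further.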