Learning to Walk

In this paper we test the state-of-the-art reinforcement learning algorithm Proximal Policy Optimisation (PPO) in the robotic control domain for its ability to transfer between similar but distinct tasks. We use the two environments of OpenAI's Bipedal Walker, with the aim of reducing training time and improving performance relative to training from scratch on both tasks.

Experiments

Our experiments show that weight sharing with all layers transferred can increase the initial level of performance, whether transferring to a more complex or a simpler task. We also show that training on multiple tasks can significantly increase performance on complex environments.
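As a rough illustration of what "weight sharing with all layers transferred" can look like, the sketch below copies every layer of a source policy into a freshly initialised target policy. It is not the code used in these experiments: the layer names, hidden sizes, and the helpers `init_policy` and `transfer_weights` are hypothetical, and plain nested lists stand in for real network tensors. It relies on both Bipedal Walker environments having a 24-dimensional observation and a 4-dimensional action, so all layer shapes match across the two tasks.

```python
import copy
import random

# Hypothetical layer layout. The input (24) and output (4) sizes match the
# Bipedal Walker observation and action dimensions; the hidden size of 64
# is an assumption made only for this sketch.
LAYER_SHAPES = {
    "hidden_1": (24, 64),
    "hidden_2": (64, 64),
    "action_mean": (64, 4),
}


def init_policy(seed):
    """Create randomly initialised weights, one matrix per layer."""
    rng = random.Random(seed)
    return {
        name: [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]
        for name, (rows, cols) in LAYER_SHAPES.items()
    }


def transfer_weights(source, target, layers=None):
    """Copy the named layers from source into target (all layers by default)."""
    names = list(source) if layers is None else list(layers)
    for name in names:
        src = source[name]
        dst = target[name]
        # Refuse to copy between layers whose shapes differ.
        if len(src) != len(dst) or len(src[0]) != len(dst[0]):
            raise ValueError(f"shape mismatch in layer {name!r}")
        # Deep copy so later training on the target leaves the source intact.
        target[name] = copy.deepcopy(src)
    return names


# Stand-in for a policy already trained with PPO on the source task.
source_policy = init_policy(seed=0)
# Fresh policy for the target task, before any training on it.
target_policy = init_policy(seed=1)

transferred = transfer_weights(source_policy, target_policy)
```

Passing a subset such as `layers=["hidden_1"]` would transfer only the early layers, which is one way to compare partial transfer against the all-layers setting described above.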

Results

Our results suggest that similar methods could be used to prepare an agent for a very complex task by first training it on a simpler one, thereby speeding up learning of the complex task and increasing the performance level.
