Learning to Walk

In this paper we test the state-of-the-art reinforcement learning algorithm Proximal Policy Optimisation (PPO) in the robotic control domain for its ability to transfer between similar but distinct tasks. We use the two environments of OpenAI's Bipedal Walker, with the aim of reducing training times and improving performance on both tasks compared with training from scratch.

Experiments

Our experiments show that weight sharing with all layers transferred can increase the initial level of performance, whether the transfer is to a more complex task or to a simpler one. We also show that training on multiple tasks can significantly increase performance on complex environments.
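As a rough illustration of what "weight sharing with all layers transferred" can look like, the sketch below copies every layer of a source policy into a freshly initialised target policy before training continues on the new task. It is not the code used in the experiments: the dictionary-based policy, the helper names `init_policy` and `transfer_weights`, and the hidden layer sizes are all hypothetical. The input and output sizes (24 and 4) are assumed to match the Bipedal Walker observation and action dimensions, which is what allows every layer to be reused unchanged.

```python
import copy
import random


def init_policy(layer_sizes, seed):
    """Build a toy MLP policy as {layer_name: {"W": rows, "b": biases}}.

    Hypothetical stand-in for a real PPO policy network.
    """
    rng = random.Random(seed)
    policy = {}
    for i, (n_in, n_out) in enumerate(zip(layer_sizes[:-1], layer_sizes[1:])):
        policy[f"layer{i}"] = {
            "W": [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                  for _ in range(n_out)],
            "b": [0.0] * n_out,
        }
    return policy


def transfer_weights(source, target, layers=None):
    """Return a copy of `target` with the named layers taken from `source`.

    layers=None transfers every layer, which corresponds to the
    "all layers transferred" setting. Neither input is modified.
    """
    names = list(source) if layers is None else list(layers)
    result = copy.deepcopy(target)
    for name in names:
        src, dst = source[name], result[name]
        same_shape = (len(src["W"]) == len(dst["W"])
                      and len(src["W"][0]) == len(dst["W"][0]))
        if not same_shape:
            raise ValueError(f"shape mismatch in {name}")
        result[name] = copy.deepcopy(src)
    return result


# Assumed sizes: 24 observations in, two hidden layers of 64, 4 actions out.
SIZES = [24, 64, 64, 4]
source_policy = init_policy(SIZES, seed=0)  # stands in for a policy trained on the source task
target_policy = init_policy(SIZES, seed=1)  # fresh policy for the target task

# Warm start: every layer comes from the source policy.
warm_start = transfer_weights(source_policy, target_policy)
```

Passing a subset such as `layers=["layer0"]` would transfer only the first layer and leave the rest at their fresh initialisation, which is one way to compare partial against full transfer.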

Results

Our results suggest that similar methods could be used to prepare an agent for a very complex task by first training it on a simpler one, speeding up learning on the complex task and raising the level of performance it reaches.