Bipedal Walker PPO Agent

This project implements a reinforcement learning agent trained to walk using Proximal Policy Optimization (PPO) in the BipedalWalker-v3 environment from OpenAI Gym. The agent learns bipedal locomotion from scratch through interaction with a continuous control environment, using reward signals to guide policy updates.

Objective

Train a simulated bipedal robot to walk autonomously using deep reinforcement learning, with no imitation learning, predefined motion policies, or manual control logic.

Overview

Environment: OpenAI Gym BipedalWalker-v3
Algorithm: PPO (Proximal Policy Optimization)
Framework: Stable-Baselines3 (PyTorch)
Observation Space: 24 continuous inputs
- Includes hull angle and velocities, joint angles and speeds, leg ground contacts, and lidar-based terrain sensing
Action Space: 4-dimensional continuous vector, each component in [-1, 1] (torques to hips and knees)
Training: 1M+ timesteps using a reward function encouraging forward motion and penalizing inefficient torque usage
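
To make the observation and action spaces above more concrete, here is a minimal sketch of how the 24 inputs break down. The ordering reflects my reading of the BipedalWalker-v3 source and should be checked against the installed version. The label names, `split_observation`, and `clip_action` are illustrative helpers, not part of this repository or of the environment's API.

```python
# Assumed layout of the 24-dimensional BipedalWalker-v3 observation.
# Labels are illustrative; the environment itself returns a flat array.
OBSERVATION_LAYOUT = [
    ("hull_angle", 1),
    ("hull_angular_velocity", 1),
    ("velocity_x", 1),
    ("velocity_y", 1),
    ("leg1_hip_angle", 1),
    ("leg1_hip_speed", 1),
    ("leg1_knee_angle", 1),
    ("leg1_knee_speed", 1),
    ("leg1_ground_contact", 1),
    ("leg2_hip_angle", 1),
    ("leg2_hip_speed", 1),
    ("leg2_knee_angle", 1),
    ("leg2_knee_speed", 1),
    ("leg2_ground_contact", 1),
    ("lidar", 10),
]

OBS_DIM = sum(size for _, size in OBSERVATION_LAYOUT)  # 14 scalars + 10 lidar readings


def split_observation(obs):
    """Return a dict mapping each label to its slice of a flat observation."""
    if len(obs) != OBS_DIM:
        raise ValueError(f"expected {OBS_DIM} values, got {len(obs)}")
    parts = {}
    start = 0
    for name, size in OBSERVATION_LAYOUT:
        parts[name] = list(obs[start:start + size])
        start += size
    return parts


def clip_action(action, low=-1.0, high=1.0):
    """Clamp a 4-element action (hip and knee of each leg) to the valid range."""
    if len(action) != 4:
        raise ValueError(f"expected 4 values, got {len(action)}")
    return [min(max(float(a), low), high) for a in action]
```

A helper like `split_observation` is mostly useful for debugging, for example to print the lidar readings separately while watching the agent.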

Dependencies

pip install -r requirements.txt

Training

python train.py

Testing

To run a trained agent and visualize walking behavior:

python test.py
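
As an illustration of what running a trained agent involves, the sketch below rolls out one episode and returns the total reward. It is not the contents of test.py. It assumes the Gymnasium step/reset API and Stable-Baselines3 2.x, and `run_episode`, `watch_trained_agent`, and the model path are hypothetical names.

```python
def run_episode(env, policy, max_steps=1600):
    """Roll out one episode and return the total reward.

    `env` follows the Gymnasium API (reset returns (obs, info), step returns
    a 5-tuple); `policy` is any callable mapping an observation to an action.
    `max_steps` is a safety cap chosen for this sketch.
    """
    obs, _info = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = policy(obs)
        obs, reward, terminated, truncated, _info = env.step(action)
        total_reward += float(reward)
        if terminated or truncated:
            break
    return total_reward


def watch_trained_agent(model_path="walker-models/ppo_bipedalwalker"):
    """Load a saved PPO model and render one episode (hypothetical path)."""
    # Imported lazily so run_episode can be used without the RL stack.
    import gymnasium as gym
    from stable_baselines3 import PPO

    env = gym.make("BipedalWalker-v3", render_mode="human")
    model = PPO.load(model_path)

    def policy(obs):
        # deterministic=True uses the mean action instead of sampling.
        action, _state = model.predict(obs, deterministic=True)
        return action

    try:
        return run_episode(env, policy)
    finally:
        env.close()
```

Keeping the rollout loop separate from model loading means the same function can score a random or scripted policy as a baseline.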

Demo
