PyTorch implementations of algorithms from "Reinforcement Learning: An Introduction" by Sutton and Barto, along with various RL research papers.
Updated Aug 14, 2025 - Python
TraderNet-CRv2 - Combining Deep Reinforcement Learning with Technical Analysis and Trend Monitoring on Cryptocurrency Markets
💡 Grasp - Pick-and-place with a robotic hand 👨🏻‍💻
A complete collection of well-known deep RL algorithms implemented in the most popular Gymnasium environments
End-to-end RL trading framework with PPO agent, self-attention neural network, custom Gym environment, and advanced backtesting.
LibreGrabbe 16-DOF Robot Hand
Comparative research platform for Deep Reinforcement Learning and heuristic controllers in autonomous racing. Benchmarks DRL (PPO) agents against deterministic baselines in Unity MiniKart, with full reproducibility, human-like evaluation, and performance logs.
This repository implements a Proximal Policy Optimization (PPO) agent that learns to play Super Mario Bros using TensorFlow/Keras and OpenAI Gym. Features CNNs for vision, Actor-Critic architecture, and parallel environments. Train your own Mario master or run a pre-trained one!
A project for trading the S&P 500 with PPO
Reinforcement learning–based controller for balancing an inverted pendulum using Proximal Policy Optimization (PPO). Supports configurable mass, length, and gravity settings (Earth, lunar, microgravity) with automated training logs, reward visualization, and performance analysis.
An RL-based model that uses the PPO algorithm and an OpenAI Gym environment to play the popular Super Mario game.
Developed an AWS DeepRacer model using Python & the PPO algorithm, leveraging TensorFlow to train & fine-tune a deep reinforcement learning model. Designed a custom reward function & optimized hyperparameters to improve policy learning & navigation performance. Utilized AWS infrastructure for scalable training & deployment.
AI-powered production line optimization using reinforcement learning (PPO).
Training a lunar lander to land using the OpenAI Gym library and the Stable Baselines3 PPO reinforcement learning algorithm
How close can LoRA get to full fine-tuning (FullFT) in terms of learning speed, performance, and compute tradeoffs? And under what conditions?
This Legal Document Analyzer is a proof-of-concept NLP project demonstrating the potential of transformers for legal document summarization.
This repository explores Reinforcement Learning (RL) through hands-on implementations of key algorithms and environments. It demonstrates how agents learn by interacting with environments, optimizing rewards, and adapting to tasks ranging from Atari games to autonomous driving and custom simulations.
Stable Baselines3
2D orbital rocket sim with PPO in PyTorch. Models thrust, drag, gravity, fuel; agent learns efficient ascent. Includes telemetry & visualization