This repo contains implementations of algorithms such as Q-learning, SARSA, TD, and policy gradient
Updated Dec 8, 2019 - Python
Notebooks covering temporal difference methods using OpenAI Gym
Topics in Machine Learning @ IIIT Hyderabad (Fall 2021)
Temporal Difference methods - A simple implementation of the SARSA algorithm applied to OpenAI Gym's "CliffWalking" environment.
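For readers unfamiliar with SARSA, here is a minimal tabular sketch on a cliff-walking grid. It is not the code from the repository listed above. The `step` function is a hand-rolled stand-in for the environment (assuming the usual 4x12 layout, -1 per step, and -100 plus a reset to the start for falling off the cliff), not the Gym API. The hyperparameters `alpha`, `gamma`, and `epsilon` are illustrative assumptions.

```python
import random

ROWS, COLS = 4, 12
START, GOAL = (3, 0), (3, 11)
ACTIONS = [(-1, 0), (0, 1), (1, 0), (0, -1)]  # up, right, down, left


def step(state, action):
    """Hypothetical stand-in for the environment: returns (next_state, reward, done)."""
    r = min(max(state[0] + ACTIONS[action][0], 0), ROWS - 1)
    c = min(max(state[1] + ACTIONS[action][1], 0), COLS - 1)
    if r == ROWS - 1 and 0 < c < COLS - 1:  # stepped into the cliff
        return START, -100, False
    return (r, c), -1, (r, c) == GOAL


def epsilon_greedy(q, state, epsilon, rng):
    """Pick a random action with probability epsilon, else a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    values = q[state]
    return values.index(max(values))


def sarsa(episodes=500, alpha=0.5, gamma=1.0, epsilon=0.1, seed=0):
    """Tabular SARSA; all hyperparameter defaults are assumed values."""
    rng = random.Random(seed)
    q = {(r, c): [0.0] * len(ACTIONS) for r in range(ROWS) for c in range(COLS)}
    for _ in range(episodes):
        state = START
        action = epsilon_greedy(q, state, epsilon, rng)
        done = False
        while not done:
            next_state, reward, done = step(state, action)
            next_action = epsilon_greedy(q, next_state, epsilon, rng)
            # On-policy target: bootstrap from the action actually chosen next.
            target = reward + (0.0 if done else gamma * q[next_state][next_action])
            q[state][action] += alpha * (target - q[state][action])
            state, action = next_state, next_action
    return q
```

Calling `sarsa()` returns a dictionary mapping each grid cell to a list of four action values. Because the update bootstraps from the action the epsilon-greedy policy actually takes next, the learned values reflect the cost of exploratory slips near the cliff, which is what distinguishes SARSA from Q-learning's greedy target.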