In this chapter you will learn about Markov Decision Processes (MDPs) and the Bellman equations in more detail.
This chapter includes the following example:
- Student MDP: link
In this example you will practice with MDPs: you will learn how to calculate the state-value function for a given policy and how to calculate the action-value function.
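
As a rough illustration of what calculating the two value functions involves, here is a minimal sketch of iterative policy evaluation. It is not the Student MDP itself: the two-state, two-action MDP, its rewards, the discount factor and the names `P`, `R`, `PI`, `GAMMA`, `q_from_v` and `evaluate_policy` are all assumptions made up for this sketch.

```python
# Hypothetical toy MDP (not the Student MDP): 2 states, 2 actions.
GAMMA = 0.5                      # assumed discount factor
STATES = [0, 1]
ACTIONS = [0, 1]

# P[s][a][s2]: probability of landing in s2 after taking action a in state s.
# Here action 0 always leads to state 0 and action 1 always leads to state 1.
P = [[[1.0, 0.0], [0.0, 1.0]],
     [[1.0, 0.0], [0.0, 1.0]]]

# R[s][a]: expected immediate reward for taking action a in state s.
R = [[0.0, 1.0],
     [0.0, 2.0]]

# PI[s][a]: the policy being evaluated (uniform random in this sketch).
PI = [[0.5, 0.5],
      [0.5, 0.5]]


def q_from_v(v):
    """Action values from state values: q(s,a) = R(s,a) + gamma * sum_s2 P(s2|s,a) v(s2)."""
    return [[R[s][a] + GAMMA * sum(P[s][a][s2] * v[s2] for s2 in STATES)
             for a in ACTIONS]
            for s in STATES]


def evaluate_policy(sweeps=200):
    """Repeat the Bellman expectation backup v(s) = sum_a pi(a|s) q(s,a)."""
    v = [0.0 for _ in STATES]
    for _ in range(sweeps):
        q = q_from_v(v)
        v = [sum(PI[s][a] * q[s][a] for a in ACTIONS) for s in STATES]
    return v


v_pi = evaluate_policy()
q_pi = q_from_v(v_pi)
print("v_pi:", v_pi)
print("q_pi:", q_pi)
```

For this toy problem the backup converges to roughly `v_pi = [1.25, 1.75]`, and averaging each row of `q_pi` under the policy gives back `v_pi`, which is a quick consistency check between the two value functions.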
Here you can find all exercises of this chapter:
In these exercises you will practice with Markov Reward Processes (MRPs) and with the Linear Programming solution of Markov Decision Processes.
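
For orientation, the two ideas behind these exercises can be written in their standard textbook form. The notation here is an assumption and may differ from the one used in the exercises: $P$ is the transition matrix, $R$ the reward vector, $\gamma < 1$ the discount factor and $\mu(s) > 0$ arbitrary positive state weights.

```latex
% MRP: the Bellman equation is linear in v, so it can be solved directly.
v = R + \gamma P v
\quad\Longrightarrow\quad
v = (I - \gamma P)^{-1} R

% MDP: the optimal state-value function solves the linear program
\min_{v} \; \sum_{s} \mu(s)\, v(s)
\quad \text{subject to} \quad
v(s) \;\ge\; R(s,a) + \gamma \sum_{s'} P(s' \mid s, a)\, v(s')
\quad \text{for all } s, a
```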
The activity of this chapter is:
- Gridworld: link
In this activity you will learn how to formalize a classic RL environment (Gridworld) composed of good states and bad states. The objective is to solve the environment by finding the state-value function of every state using the Bellman equations.
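
To give a feel for the kind of computation involved, here is a minimal sketch that evaluates a uniform random policy on a small grid. It is not the Gridworld of the activity: the 3x3 layout, the position of the good and bad terminal cells, the rewards of +1 and -1, the discount factor and every name in the code are assumptions chosen for illustration.

```python
# Hypothetical 3x3 Gridworld (layout, rewards and discount are assumptions).
GAMMA = 0.9
ROWS, COLS = 3, 3
GOOD, BAD = (0, 2), (2, 2)                    # terminal "good" and "bad" cells
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]    # up, down, left, right


def step(state, move):
    """Deterministic transition: moving off the grid leaves the agent in place."""
    r, c = state[0] + move[0], state[1] + move[1]
    if not (0 <= r < ROWS and 0 <= c < COLS):
        r, c = state
    nxt = (r, c)
    reward = 1.0 if nxt == GOOD else -1.0 if nxt == BAD else 0.0
    return nxt, reward


def evaluate_random_policy(tol=1e-10):
    """Apply the Bellman expectation backup until the values stop changing."""
    states = [(r, c) for r in range(ROWS) for c in range(COLS)]
    v = {s: 0.0 for s in states}
    while True:
        new_v = {}
        for s in states:
            if s in (GOOD, BAD):
                new_v[s] = 0.0                # terminal states have value 0
                continue
            # Each of the four moves is taken with probability 1/4.
            new_v[s] = sum(0.25 * (reward + GAMMA * v[nxt])
                           for nxt, reward in (step(s, m) for m in MOVES))
        delta = max(abs(new_v[s] - v[s]) for s in states)
        v = new_v
        if delta < tol:
            return v


values = evaluate_random_policy()
for r in range(ROWS):
    print(" ".join(f"{values[(r, c)]:6.3f}" for c in range(COLS)))
```

Because this assumed layout is mirror-symmetric between the good and the bad cell, the top row comes out positive, the bottom row is its negative, and the middle row is zero, which makes the result easy to sanity-check by eye.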