In today's lecture we enter new territory...
A territory where function approximation (aka supervised machine learning) meets good old Reinforcement Learning.
And this is how Deep RL is born.
We will solve OpenAI's Cart Pole environment using parametric Q-learning.
Today's lesson is split into 3 parts.
1. Parametric Q-learning
2. Deep Q-learning
3. Hyperparameter search
Make sure you have Python >= 3.7. Otherwise, update it.
-
Pull the code from GitHub and cd into the 01_taxi folder:
$ git clone https://github.com/Paulescu/hands-on-rl.git
$ cd hands-on-rl/01_taxi
-
Make sure you have the virtualenv tool in your Python installation:
$ pip3 install virtualenv
-
Create a virtual environment and activate it:
$ virtualenv -p python3 venv
$ source venv/bin/activate
From this point onwards commands run inside the virtual environment.
-
Install the dependencies and the code from the src folder in editable mode, so you can experiment with the code:
$ (venv) pip install -r requirements.txt
$ (venv) export PYTHONPATH="."
-
Open the notebooks, either with good old Jupyter or JupyterLab:
$ (venv) jupyter notebook
$ (venv) jupyter lab
If both launch commands fail, try these:
$ (venv) jupyter notebook --NotebookApp.use_redirect_file=False
$ (venv) jupyter lab --NotebookApp.use_redirect_file=False
-
Play and learn. And do the homework.
Parametric Q-learning
- Explore the environment
- Random agent baseline
- Linear Q agent with bad hyperparameters
- Linear Q agent with good hyperparameters
- Homework
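Before opening the notebook, it helps to see what "parametric" means here: instead of a table of Q-values, we learn the weights of a function that maps states to action values. Below is a minimal sketch of a linear Q agent, assuming CartPole's 4-dimensional state and 2 actions. The class name and hyperparameter values are my own illustrative choices, not the course's exact implementation:

```python
import numpy as np

class LinearQAgent:
    """Q(s, a) = w[a] . s  -- one weight vector per discrete action."""

    def __init__(self, n_features=4, n_actions=2, lr=0.01, gamma=0.99, epsilon=0.1):
        self.w = np.zeros((n_actions, n_features))
        self.lr, self.gamma, self.epsilon = lr, gamma, epsilon
        self.n_actions = n_actions

    def q_values(self, state):
        # one Q-value per action, computed from the learned weights
        return self.w @ state

    def act(self, state):
        # epsilon-greedy: explore with probability epsilon, else act greedily
        if np.random.rand() < self.epsilon:
            return np.random.randint(self.n_actions)
        return int(np.argmax(self.q_values(state)))

    def update(self, state, action, reward, next_state, done):
        # TD target: r + gamma * max_a' Q(s', a'); no bootstrap at terminal states
        target = reward + (0.0 if done else self.gamma * np.max(self.q_values(next_state)))
        td_error = target - self.q_values(state)[action]
        # semi-gradient step: the gradient of Q w.r.t. w[action] is the state itself
        self.w[action] += self.lr * td_error * state
```

The training loop then just alternates `act` and `update` on transitions collected from the environment.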
Deep Q-learning
- Crash course on neural networks
- Deep Q agent with bad hyperparameters
- Deep Q agent with good hyperparameters
- Homework
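In the deep version, the linear model is swapped for a neural network that maps the state to one Q-value per action. A minimal sketch in PyTorch, assuming CartPole's dimensions; the layer sizes here are illustrative, not necessarily the course's architecture:

```python
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    """Small MLP: 4-dim CartPole state in, 2 action values out."""

    def __init__(self, state_dim=4, n_actions=2, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, state):
        return self.net(state)  # shape: (batch, n_actions)

q_net = QNetwork()
state = torch.randn(1, 4)            # a batch containing one CartPole state
q_values = q_net(state)              # shape: (1, 2)
action = int(q_values.argmax(dim=1)) # greedy action
```

Acting greedily is the same `argmax` as before; only the function computing Q has changed.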
Hyperparameter search
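Hyperparameter search boils down to: sample a configuration, train and evaluate an agent with it, keep the best one. A toy sketch of random search with a stand-in objective (in the lesson, the objective would be the trained agent's average episode reward; the search space below is a hypothetical example):

```python
import random

# Samplers for each hyperparameter (illustrative ranges, not the course's)
search_space = {
    "lr": lambda: 10 ** random.uniform(-4, -1),    # log-uniform learning rate
    "epsilon": lambda: random.uniform(0.01, 0.3),  # exploration rate
}

def evaluate(config):
    # Stand-in objective; in practice, train an agent with `config`
    # and return its average episode reward.
    return -abs(config["lr"] - 0.01) - abs(config["epsilon"] - 0.1)

best_config, best_score = None, float("-inf")
for _ in range(50):
    config = {name: sample() for name, sample in search_space.items()}
    score = evaluate(config)
    if score > best_score:
        best_config, best_score = config, score
```

Dedicated libraries automate exactly this loop, with smarter sampling and early stopping of bad trials.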
Do you wanna become a PRO in Machine Learning?
→ Subscribe to the datamachines newsletter.
→ Follow me on Medium.