To adapt to and solve different environments, the agents need to be versatile.
*Models in bold solved the environment; models in italics did not.
- (dq_dnn.py) Double Q-learning with a Deep Q-Network using Boltzmann Q-Policy with a large Experience Replay.
- (dq_dnn.py) Double Q-learning with a Deep Q-Network using Decaying Epsilon Q-Policy with a large Experience Replay.
- (q_nn.py) Q-learning with a neural network using Epsilon Q-Policy.
- (q_table.py) Q-learning with a table using Epsilon Q-Policy (other policies available in the code).
- (dq_dnn.py) Double Q-learning with a Deep Q-Network using Decaying Epsilon Q-Policy with a large Experience Replay.
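
The policies and update rules named in the list above can be sketched roughly as follows. This is a minimal, standard-library-only illustration and not the code from `dq_dnn.py`, `q_nn.py`, or `q_table.py`: every function name, the decay schedule, and all default hyperparameters here are assumptions chosen for the example.

```python
import math
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Epsilon Q-Policy: random action with probability epsilon, else greedy."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def decayed_epsilon(step, start=1.0, end=0.05, decay=0.995):
    """Decaying epsilon: exponential decay from start, floored at end.

    The schedule and its defaults are hypothetical, not taken from the repo.
    """
    return max(end, start * decay ** step)


def boltzmann_probs(q_values, temperature=1.0):
    """Boltzmann Q-Policy probabilities: a softmax over the Q-values."""
    highest = max(q_values)  # subtract the max for numerical stability
    exps = [math.exp((q - highest) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def boltzmann(q_values, temperature=1.0, rng=random):
    """Sample an action index according to the Boltzmann probabilities."""
    probs = boltzmann_probs(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


def q_learning_update(q_table, s, a, r, s_next, done, alpha=0.1, gamma=0.99):
    """Tabular Q-learning: move Q(s, a) toward r + gamma * max_a' Q(s', a')."""
    target = r if done else r + gamma * max(q_table[s_next])
    q_table[s][a] += alpha * (target - q_table[s][a])


def double_q_target(r, done, q_online_next, q_target_next, gamma=0.99):
    """Double Q-learning target for one transition.

    The online estimates pick the next action; the target estimates score it.
    """
    if done:
        return r
    best = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return r + gamma * q_target_next[best]
```

With `epsilon=0.0` the epsilon policy is purely greedy, and a higher `temperature` pushes the Boltzmann policy toward uniform random exploration. In the network-based agents, `q_online_next` and `q_target_next` would presumably come from two copies of the network evaluated on the next state, with transitions sampled from the experience replay.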