RL algorithm implementations

Clean NumPy implementations of the RL algorithms presented in Sutton and Barto's textbook Reinforcement Learning: An Introduction. As the algorithms get more involved, I'll switch to JAX. Feedback welcome: find me on the vantech Discord.