Mirror Descent Policy Optimization
reinforcement-learning
deep-learning
deep-reinforcement-learning
deep-learning-algorithms
sac
trpo
deep-rl
ppo
deep-learning-ai
policy-optimization
stable-baselines
model-free-rl
mirror-descent
mdpo
Updated Oct 31, 2020 - Python