This repository contains code for the FL-RL project. The complete code and notebooks will be published soon.
The goal of this project is to implement and test PyTorch implementations of meta-reinforcement learning approaches from scratch.
Requirements:

- Python 3.8
- PyTorch
- Gym v0.21.0
- Learn2Learn v0.2.0
- Cherry-RL v0.2.0
- mujoco-py
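The dependencies above can be installed with pip. The PyPI package names and version pins below are an assumption based on the list above; mujoco-py additionally requires the MuJoCo binaries to be installed on the system.

```shell
# Assumed PyPI package names matching the versions listed above.
pip install torch gym==0.21.0 learn2learn==0.2.0 cherry-rl==0.2.0
# mujoco-py needs a local MuJoCo installation before this will build.
pip install mujoco-py
```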
Implemented approaches:

- Trust Region Policy Optimization (TRPO)
- MAML A2C
- MAML TRPO
- MAML PPO
- ANIL
- SNAIL
- Reptile
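Of the approaches above, Reptile has the simplest meta-update, so it serves as a compact illustration of the inner-loop/outer-loop structure they all share: adapt to a sampled task with a few gradient steps, then move the meta-parameters toward the adapted solution. The sketch below is a toy 1-D regression version, not code from this repository; the task family and hyperparameters are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)

def task_grad(theta, a):
    """Gradient of the mean squared error for a toy regression task y = a * x."""
    x = rng.uniform(-1.0, 1.0, size=32)
    residual = theta * x - a * x
    return float(np.mean(2.0 * residual * x))

def reptile(theta, meta_steps=200, inner_steps=5, inner_lr=0.1, meta_lr=0.5):
    """Reptile meta-training: inner-loop SGD on a sampled task,
    then pull the meta-parameters a fraction of the way toward
    the adapted parameters."""
    for _ in range(meta_steps):
        a = rng.uniform(-2.0, 2.0)        # sample a task (slope to fit)
        phi = theta
        for _ in range(inner_steps):      # inner loop: adapt to the task
            phi -= inner_lr * task_grad(phi, a)
        theta += meta_lr * (phi - theta)  # outer loop: Reptile meta-update
    return theta
```

Because the tasks are drawn symmetrically around a slope of zero, the meta-parameters drift toward the point from which all tasks are quickly reachable; MAML-style methods replace this difference-based update with an explicit gradient through the inner-loop adaptation.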
Results after training for 300 iterations:
- TRPO-MAML trained model output
- After 3 TRPO update steps on the forward task
- After 3 TRPO update steps on the backward task
Results after training for 900 iterations:
- TRPO-MAML trained model output
- After 5 TRPO update steps on the forward task
- After 5 TRPO update steps on the backward task
References:

- Finn, C., Abbeel, P. & Levine, S. (2017). Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks. ICML 2017.
- Raghu, A., Raghu, M., Bengio, S. & Vinyals, O. (2019). Rapid Learning or Feature Reuse? Towards Understanding the Effectiveness of MAML.
- Mishra, N., Rohaninejad, M., Chen, X. & Abbeel, P. (2018). A Simple Neural Attentive Meta-Learner. ICLR 2018.
- Schulman, J., Levine, S., Moritz, P., Jordan, M. I. & Abbeel, P. (2015). Trust Region Policy Optimization. ICML 2015.
- Schulman, J., Wolski, F., Dhariwal, P., Radford, A. & Klimov, O. (2017). Proximal Policy Optimization Algorithms.