Scheduling Parameters with ReLax (TRPO step KL divergence)
This repository demonstrates parameter scheduling in ReLax, using the TRPO step KL divergence as the example. The plot below shows the theoretical (scheduled) step KL divergence versus the actual one (derived by estimating the Fisher-vector product) for the TRPO-GAE algorithm. The schedule is sub-optimal in terms of training performance and was built for demonstration purposes only.
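To make the idea of a scheduled step KL divergence concrete, here is a minimal sketch in plain Python. It does not use the ReLax API: `linear_schedule`, `scale_step_to_kl`, `quadratic_kl`, the toy `FISHER` matrix and the start/end values are all hypothetical names and numbers chosen for illustration. It assumes the usual quadratic approximation of the KL divergence, 0.5 * s^T F s, where F is accessed only through a Fisher-vector product.

```python
import math


def linear_schedule(start, end, n_steps):
    """Return f(t) that moves linearly from start to end over n_steps, then holds end."""
    def value(t):
        frac = min(max(t / n_steps, 0.0), 1.0)
        return start + frac * (end - start)
    return value


def quadratic_kl(step, fvp):
    """Quadratic KL estimate 0.5 * s^T F s, using only a Fisher-vector product."""
    return 0.5 * sum(s * fs for s, fs in zip(step, fvp(step)))


def scale_step_to_kl(direction, fvp, target_kl):
    """Rescale a search direction so its quadratic KL estimate equals target_kl."""
    quad = sum(d * fd for d, fd in zip(direction, fvp(direction)))  # s^T F s
    beta = math.sqrt(2.0 * target_kl / quad)
    return [beta * d for d in direction]


# Toy symmetric positive-definite "Fisher matrix" (assumption, illustration only).
FISHER = [[2.0, 0.5], [0.5, 1.0]]


def toy_fvp(v):
    """Fisher-vector product for the toy matrix above."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in FISHER]


# Hypothetical schedule: shrink the step KL from 0.01 to 0.001 over 100 iterations.
kl_schedule = linear_schedule(start=0.01, end=0.001, n_steps=100)

for iteration in (0, 50, 100):
    target = kl_schedule(iteration)
    step = scale_step_to_kl([1.0, -1.0], toy_fvp, target)
    print(iteration, target, quadratic_kl(step, toy_fvp))
```

In this toy setting the estimated KL of the rescaled step matches the scheduled value by construction. In a real TRPO run the Fisher-vector product is itself estimated from samples and the step may be shortened by a line search, so the actual step KL can deviate from the schedule.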