@@ -19,8 +19,8 @@ obstructs the path to the goal._
To see this in action, observe the two learning curves below. Each displays the reward
over time for an agent trained using PPO with the same set of training hyperparameters.
- The difference is that the agent on the left was trained using the full-height wall
- version of the task, and the right agent was trained using the curriculum version of
+ The difference is that one agent was trained using the full-height wall
+ version of the task, and the other agent was trained using the curriculum version of
the task. As you can see, without using curriculum learning the agent has a lot of
difficulty. We think that by using well-crafted curricula, agents trained using
reinforcement learning will be able to accomplish tasks otherwise much more difficult.