
Commit a083a15

Author: Chris Elion
remove Use Heuristic from docs (#3568)
1 parent: e882293

5 files changed: +13 −11 lines

docs/Getting-Started-with-Balance-Ball.md

Lines changed: 1 addition & 1 deletion
@@ -84,7 +84,7 @@ The Ball3DAgent subclass defines the following methods:
   negative reward for dropping the ball. An Agent is also marked as done when it
   drops the ball so that it will reset with a new ball for the next simulation
   step.
-* agent.Heuristic() - When the `Use Heuristic` checkbox is checked in the Behavior
+* agent.Heuristic() - When the `Behavior Type` is set to `Heuristic Only` in the Behavior
   Parameters of the Agent, the Agent will use the `Heuristic()` method to generate
   the actions of the Agent. As such, the `Heuristic()` method returns an array of
   floats. In the case of the Ball 3D Agent, the `Heuristic()` method converts the

docs/Learning-Environment-Best-Practices.md

Lines changed: 3 additions & 2 deletions
@@ -8,8 +8,9 @@
   lessons which progressively increase in difficulty are presented to the agent
   ([learn more here](Training-Curriculum-Learning.md)).
 * When possible, it is often helpful to ensure that you can complete the task by
-  using a heuristic to control the agent. To do so, check the `Use Heuristic`
-  checkbox on the Agent and implement the `Heuristic()` method on the Agent.
+  using a heuristic to control the agent. To do so, set the `Behavior Type`
+  to `Heuristic Only` on the Agent's Behavior Parameters, and implement the
+  `Heuristic()` method on the Agent.
 * It is often helpful to make many copies of the agent, and give them the same
   `Behavior Name`. In this way the learning process can get more feedback
   information from all of these agents, which helps it train faster.

docs/Learning-Environment-Create-New.md

Lines changed: 2 additions & 2 deletions
@@ -380,8 +380,8 @@ What this code means is that the heuristic will generate an action corresponding
 to the values of the "Horizontal" and "Vertical" input axis (which correspond to
 the keyboard arrow keys).

-In order for the Agent to use the Heuristic, You will need to check the `Use Heuristic`
-checkbox in the `Behavior Parameters` of the RollerAgent.
+In order for the Agent to use the Heuristic, You will need to set the `Behavior Type`
+to `Heuristic Only` in the `Behavior Parameters` of the RollerAgent.


 Press **Play** to run the scene and use the arrows keys to move the Agent around
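
For reference, the heuristic this hunk alludes to maps keyboard input onto the agent's two continuous actions. A minimal sketch, assuming the `float[]`-returning `Heuristic()` signature these docs describe (later ML-Agents releases changed it to a `void` method that fills an action buffer) and the pre-1.0 `MLAgents` namespace:

using UnityEngine;
using MLAgents;  // namespace varies by ML-Agents release

public class RollerAgent : Agent
{
    // Read the arrow keys via the "Horizontal" and "Vertical" input axes
    // and return them as the agent's two continuous actions.
    public override float[] Heuristic()
    {
        var action = new float[2];
        action[0] = Input.GetAxis("Horizontal");
        action[1] = Input.GetAxis("Vertical");
        return action;
    }
}

With `Behavior Type` set to `Heuristic Only`, these values are used directly as the Agent's actions at each decision step.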

docs/Learning-Environment-Design-Agents.md

Lines changed: 5 additions & 4 deletions
@@ -17,10 +17,11 @@ discover the optimal decision-making policy.
 The Policy class abstracts out the decision making logic from the Agent itself so
 that you can use the same Policy in multiple Agents. How a Policy makes its
 decisions depends on the kind of Policy it is. You can change the Policy of an
-Agent by changing its `Behavior Parameters`. If you check `Use Heuristic`, the
-Agent will use its `Heuristic()` method to make decisions which can allow you to
-control the Agent manually or write your own Policy. If the Agent has a `Model`
-file, it Policy will use the neural network `Model` to take decisions.
+Agent by changing its `Behavior Parameters`. If you set `Behavior Type` to
+`Heuristic Only`, the Agent will use its `Heuristic()` method to make decisions
+which can allow you to control the Agent manually or write your own Policy. If
+the Agent has a `Model` file, it Policy will use the neural network `Model` to
+take decisions.

 ## Decisions

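`Behavior Type` is normally flipped in the Inspector, but the same setting lives on the Agent's `BehaviorParameters` component, so it can also be changed from a script. A hypothetical sketch, assuming the component exposes a public `BehaviorType` property (the property name, casing, and namespace differ across ML-Agents releases):

using UnityEngine;
using MLAgents.Policies;  // assumed namespace; check your ML-Agents release

public class HeuristicSwitch : MonoBehaviour
{
    void Start()
    {
        // Route this Agent's decisions through its Heuristic() method
        // instead of a trained model or an external trainer.
        var parameters = GetComponent<BehaviorParameters>();
        parameters.BehaviorType = BehaviorType.HeuristicOnly;  // assumed property name
    }
}
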
docs/Learning-Environment-Examples.md

Lines changed: 2 additions & 2 deletions
@@ -106,8 +106,8 @@ If you would like to contribute environments, please see our
 * Goal: The agents must hit the ball so that the opponent cannot hit a valid
   return.
 * Agents: The environment contains two agent with same Behavior Parameters.
-  After training you can check the `Use Heuristic` checkbox on one of the Agents
-  to play against your trained model.
+  After training you can set the `Behavior Type` to `Heuristic Only` on one of the Agent's
+  Behavior Parameters to play against your trained model.
 * Agent Reward Function (independent):
   * +1.0 To the agent that wins the point. An agent wins a point by preventing
     the opponent from hitting a valid return.
