Commit 78bb8c7

documentation touchups (#4099)
* doc updates: the Getting Started page now uses a consistent run-id; re-order the Create New docs to have less back-and-forth between Unity and a text editor
* add a link explaining decisions where we tell the reader to modify its parameter
1 parent 8f0c8c7 · commit 78bb8c7

File tree

2 files changed (+31, -29 lines)


docs/Getting-Started.md

Lines changed: 1 addition & 1 deletion
````diff
@@ -236,7 +236,7 @@ If you've quit the training early using `Ctrl+C` and want to resume training,
 run the same command again, appending the `--resume` flag:
 
 ```sh
-mlagents-learn config/ppo/3DBall.yaml --run-id=firstRun --resume
+mlagents-learn config/ppo/3DBall.yaml --run-id=first3DBallRun --resume
 ```
 
 Your trained model will be at `results/<run-identifier>/<behavior_name>.nn` where
````

docs/Learning-Environment-Create-New.md

Lines changed: 30 additions & 28 deletions
````diff
@@ -269,7 +269,7 @@ component, `rBody`, using the `Rigidbody.AddForce` function:
 Vector3 controlSignal = Vector3.zero;
 controlSignal.x = action[0];
 controlSignal.z = action[1];
-rBody.AddForce(controlSignal * speed);
+rBody.AddForce(controlSignal * forceMultiplier);
 ```
 
 #### Rewards
````

````diff
@@ -313,14 +313,14 @@ With the action and reward logic outlined above, the final version of the
 `OnActionReceived()` function looks like:
 
 ```csharp
-public float speed = 10;
+public float forceMultiplier = 10;
 public override void OnActionReceived(float[] vectorAction)
 {
     // Actions, size = 2
     Vector3 controlSignal = Vector3.zero;
     controlSignal.x = vectorAction[0];
     controlSignal.z = vectorAction[1];
-    rBody.AddForce(controlSignal * speed);
+    rBody.AddForce(controlSignal * forceMultiplier);
 
     // Rewards
     float distanceToTarget = Vector3.Distance(this.transform.localPosition, Target.localPosition);
````

````diff
@@ -340,33 +340,9 @@ public override void OnActionReceived(float[] vectorAction)
 }
 ```
 
-Note the `speed` class variable is defined before the function. Since `speed` is
+Note the `forceMultiplier` class variable is defined before the function. Since `forceMultiplier` is
 public, you can set the value from the Inspector window.
 
-## Final Editor Setup
-
-Now, that all the GameObjects and ML-Agent components are in place, it is time
-to connect everything together in the Unity Editor. This involves changing some
-of the Agent Component's properties so that they are compatible with our Agent
-code.
-
-1. Select the **RollerAgent** GameObject to show its properties in the Inspector
-   window.
-1. Add the `Decision Requester` script with the Add Component button from the
-   RollerAgent Inspector.
-1. Change **Decision Period** to `10`.
-1. Drag the Target GameObject from the Hierarchy window to the RollerAgent
-   Target field.
-1. Add the `Behavior Parameters` script with the Add Component button from the
-   RollerAgent Inspector.
-1. Modify the Behavior Parameters of the Agent :
-   - `Behavior Name` to _RollerBall_
-   - `Vector Observation` > `Space Size` = 8
-   - `Vector Action` > `Space Type` = **Continuous**
-   - `Vector Action` > `Space Size` = 2
-
-Now you are ready to test the environment before training.
-
 ## Testing the Environment
 
 It is always a good idea to first test your environment by controlling the Agent
````

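Aside, on the note kept in this hunk: in Unity, any public serializable field on a component (an `Agent` is ultimately a `MonoBehaviour`) shows up in the Inspector, which is why `forceMultiplier` can be tuned without touching code. A minimal sketch of that behavior, with a class name invented purely for illustration:

```csharp
// Minimal illustration (not part of this commit): public fields are serialized and
// shown in the Unity Inspector; [SerializeField] exposes a private field the same way.
using UnityEngine;

public class ForceMultiplierExample : MonoBehaviour
{
    public float forceMultiplier = 10f;              // visible and editable in the Inspector
    [SerializeField] float tunedInEditorOnly = 10f;  // private in code, still editable in the Inspector
}
```
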
````diff
@@ -392,6 +368,30 @@ the platform. Make sure that there are no errors displayed in the Unity Editor
 Console window and that the Agent resets when it reaches its target or falls
 from the platform.
 
+## Final Editor Setup
+
+Now, that all the GameObjects and ML-Agent components are in place, it is time
+to connect everything together in the Unity Editor. This involves changing some
+of the Agent Component's properties so that they are compatible with our Agent
+code.
+
+1. Select the **RollerAgent** GameObject to show its properties in the Inspector
+   window.
+1. Add the `Decision Requester` script with the Add Component button from the
+   RollerAgent Inspector.
+1. Change **Decision Period** to `10`. For more information on decisions, see [the Agent documentation](Learning-Environment-Design-Agents.md#decisions)
+1. Drag the Target GameObject from the Hierarchy window to the RollerAgent
+   Target field.
+1. Add the `Behavior Parameters` script with the Add Component button from the
+   RollerAgent Inspector.
+1. Modify the Behavior Parameters of the Agent :
+   - `Behavior Name` to _RollerBall_
+   - `Vector Observation` > `Space Size` = 8
+   - `Vector Action` > `Space Type` = **Continuous**
+   - `Vector Action` > `Space Size` = 2
+
+Now you are ready to test the environment before training.
+
 ## Training the Environment
 
 The process is the same as described in the
````

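The Behavior Parameters values in the relocated section above are meant to match the agent code written earlier in the tutorial. As a rough sketch (an assumption based on the RollerBall example; the observation code is defined elsewhere in Learning-Environment-Create-New.md, not in this diff), the 8-value observation space comes from the target position, the agent position, and the agent's x/z velocity, and the 2 continuous actions are the x/z force components shown in `OnActionReceived()`:

```csharp
// Sketch only: how the Inspector settings map to the agent code (ML-Agents Release ~3 era API).
// Reward, reset, and heuristic logic from the tutorial are omitted here.
using Unity.MLAgents;
using Unity.MLAgents.Sensors;
using UnityEngine;

public class RollerAgent : Agent
{
    public Transform Target;
    public float forceMultiplier = 10;
    Rigidbody rBody;

    void Start()
    {
        rBody = GetComponent<Rigidbody>();
    }

    // 3 (target position) + 3 (agent position) + 2 (x/z velocity) = 8
    // -> Behavior Parameters > Vector Observation > Space Size = 8
    public override void CollectObservations(VectorSensor sensor)
    {
        sensor.AddObservation(Target.localPosition);
        sensor.AddObservation(this.transform.localPosition);
        sensor.AddObservation(rBody.velocity.x);
        sensor.AddObservation(rBody.velocity.z);
    }

    // 2 continuous values -> Vector Action > Space Type = Continuous, Space Size = 2
    public override void OnActionReceived(float[] vectorAction)
    {
        Vector3 controlSignal = Vector3.zero;
        controlSignal.x = vectorAction[0];
        controlSignal.z = vectorAction[1];
        rBody.AddForce(controlSignal * forceMultiplier);
    }
}
```

If the Space Size values in the Inspector drift from what the code actually observes or consumes, ML-Agents will typically report a size mismatch at runtime, so the two are best kept in sync.
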
````diff
@@ -427,6 +427,8 @@ behaviors:
     summary_freq: 10000
 ```
 
+Hyperparameters are explained in [the training configuration file documentation](Training-Configuration-File.md)
+
 Since this example creates a very simple training environment with only a few
 inputs and outputs, using small batch and buffer sizes speeds up the training
 considerably. However, if you add more complexity to the environment or change
````
