Develop one to one documentation #2742

Merged: 7 commits, Oct 21, 2019

2 changes: 1 addition & 1 deletion docs/Background-TensorFlow.md
@@ -17,7 +17,7 @@ performing computations using data flow graphs, the underlying representation of
deep learning models. It facilitates training and inference on CPUs and GPUs in
a desktop, server, or mobile device. Within the ML-Agents toolkit, when you
train the behavior of an agent, the output is a TensorFlow model (.nn) file
that you can then embed within a Learning Brain. Unless you implement a new
that you can then associate with an Agent. Unless you implement a new
algorithm, the use of TensorFlow is mostly abstracted away and behind the
scenes.

53 changes: 22 additions & 31 deletions docs/Basic-Guide.md
@@ -35,26 +35,20 @@ inside Unity. In this section, we will use the pre-trained model for the
1. In the **Project** window, go to the `Assets/ML-Agents/Examples/3DBall/Scenes` folder
and open the `3DBall` scene file.
2. In the **Project** window, go to the `Assets/ML-Agents/Examples/3DBall/Prefabs` folder.
Expand `Game` and click on the `Platform` prefab. You should see the `Platform` prefab in the **Inspector** window.
Expand `3DBall` and click on the `Agent` prefab. You should see the `Agent` prefab in the **Inspector** window.

**Note**: The platforms in the `3DBall` scene were created using the `Platform` prefab. Instead of updating all 12 platforms individually, you can update the `Platform` prefab instead.
**Note**: The platforms in the `3DBall` scene were created using the `3DBall` prefab. Instead of updating all 12 platforms individually, you can update the `3DBall` prefab instead.

![Platform Prefab](images/platform_prefab.png)

3. In the **Project** window, drag the **3DBallLearning** Brain located in
`Assets/ML-Agents/Examples/3DBall/Brains` into the `Brain` property under `Ball 3D Agent (Script)` component in the **Inspector** window.
3. In the **Project** window, drag the **3DBallLearning** Model located in
`Assets/ML-Agents/Examples/3DBall/TFModels` into the `Model` property under `Ball 3D Agent (Script)` component in the **Inspector** window.

![3dball learning brain](images/3dball_learning_brain.png)

4. You should notice that each `Platform` under each `Game` in the **Hierarchy** windows now contains **3DBallLearning** as `Brain`. __Note__ : You can modify multiple game objects in a scene by selecting them all at
4. You should notice that each `Agent` under each `3DBall` in the **Hierarchy** window now contains **3DBallLearning** as `Model`. __Note__: You can modify multiple game objects in a scene by selecting them all at
once using the search bar in the Scene Hierarchy.
5. In the **Project** window, click on the **3DBallLearning** Brain located in
`Assets/ML-Agents/Examples/3DBall/Brains`. You should see the properties in the **Inspector** window.
6. In the **Project** window, open the `Assets/ML-Agents/Examples/3DBall/TFModels`
folder.
7. Drag the `3DBallLearning` model file from the `Assets/ML-Agents/Examples/3DBall/TFModels`
folder to the **Model** field of the **3DBallLearning** Brain in the **Inspector** window. __Note__ : All of the brains should now have `3DBallLearning` as the TensorFlow model in the `Model` property
8. Select the **InferenceDevice** to use for this model (CPU or GPU).
8. Select the **InferenceDevice** to use for this model (CPU or GPU) on the Agent.
_Note: CPU is faster for the majority of ML-Agents toolkit generated models_
9. Click the **Play** button and you will see the platforms balance the balls
using the pre-trained model.
@@ -73,22 +67,19 @@ if you want to [use an executable](Learning-Environment-Executable.md) or to
More information and documentation is provided in the
[Python API](Python-API.md) page.
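
As a rough illustration only, stepping the Editor from Python might look like the sketch below. The import path and the exact `reset()`/`step()` signatures differ between ML-Agents releases, so treat every call here as an assumption to verify against the [Python API](Python-API.md) page.

```python
# Hedged sketch, not copied from the Python API docs: the import path and the
# reset()/step() signatures are assumptions that vary across releases.
from mlagents.envs import UnityEnvironment

env = UnityEnvironment(file_name=None)  # file_name=None attaches to the Editor in Play mode
info = env.reset(train_mode=False)      # dict of per-behavior info, keyed by name
behavior_name = list(info.keys())[0]

for _ in range(10):
    n_agents = len(info[behavior_name].agents)
    # Zero-valued continuous actions (3DBall's action size of 2 is assumed),
    # just to advance the simulation.
    info = env.step({behavior_name: [[0.0, 0.0]] * n_agents})

env.close()
```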

## Training the Brain with Reinforcement Learning
## Training the Model with Reinforcement Learning

### Setting up the environment for training

To set up the environment for training, you will need to specify which agents are contributing
to the training and which Brain is being trained. You can only perform training with
a `Learning Brain`.

Each platform agent needs an assigned `Learning Brain`. In this example, each platform agent was created using a prefab. To update all of the brains in each platform agent at once, you only need to update the platform agent prefab. In the **Project** window, go to the `Assets/ML-Agents/Examples/3DBall/Prefabs` folder. Expand `Game` and click on the `Platform` prefab. You should see the `Platform` prefab in the **Inspector** window. In the **Project** window, drag the **3DBallLearning** Brain located in `Assets/ML-Agents/Examples/3DBall/Brains` into the `Brain` property under `Ball 3D Agent (Script)` component in the **Inspector** window.

**Note**: The Unity prefab system will modify all instances of the agent properties in your scene. If the agent does not synchronize automatically with the prefab, you can hit the Revert button in the top of the **Inspector** window.

**Note:** Assigning a Brain to an agent (dragging a Brain into the `Brain` property of

the agent) means that the Brain will be making decision for that agent. If the Agent uses a
LearningBrain either Python controls the Brain or the model on the Brain does.
In order to set up the Agents for training, you will need to edit the
`Behavior Name` under `Behavior Parameters` in the Agent Inspector window.
The `Behavior Name` is used to group agents by behavior. Note that Agents
sharing the same `Behavior Name` must be agents of the same type using the
same `Behavior Parameters`. You can make sure all your agents have the same
`Behavior Parameters` using Prefabs.
The `Behavior Name` corresponds to the name of the model that will be
generated by the training process and is used to select the hyperparameters
from the training configuration file.
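
For illustration only (this lookup happens inside `mlagents-learn`, not in user code), the way a `Behavior Name` keys into `config/trainer_config.yaml` can be sketched as follows; the `3DBallLearning` section name and the presence of a `default` section are assumptions based on the example configuration shipped with the toolkit.

```python
# Illustrative sketch only: how a Behavior Name selects a hyperparameter
# section from the training configuration. Not part of the ML-Agents API.
import yaml  # PyYAML

with open("config/trainer_config.yaml") as f:
    config = yaml.safe_load(f)

behavior_name = "3DBallLearning"  # must match the Behavior Name on the Agent
# Assumed layout: a "default" section overridden by a per-behavior section.
hyperparameters = {**config.get("default", {}), **config.get(behavior_name, {})}
print(hyperparameters.get("max_steps"))
```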

### Training the environment

@@ -216,22 +207,22 @@ INFO:mlagents.trainers: first-run-0: 3DBallLearning: Step: 10000. Mean Reward: 2
### After training

You can press Ctrl+C to stop the training, and your trained model will be at
`models/<run-identifier>/<brain_name>.nn` where
`<brain_name>` is the name of the Brain corresponding to the model.
`models/<run-identifier>/<behavior_name>.nn` where
`<behavior_name>` is the name of the `Behavior Name` of the agents corresponding to the model.
(**Note:** There is a known bug on Windows that causes the saving of the model to
fail when you terminate the training early; it's recommended to wait until Step
has reached the max_steps parameter you set in trainer_config.yaml.) This file
corresponds to your model's latest checkpoint. You can now embed this trained
model into your Learning Brain by following the steps below, which is similar to
model into your Agents by following the steps below, which are similar to
the steps described
[above](#running-a-pre-trained-model).

1. Move your model file into
`UnitySDK/Assets/ML-Agents/Examples/3DBall/TFModels/`.
2. Open the Unity Editor, and select the **3DBall** scene as described above.
3. Select the **3DBallLearning** Learning Brain from the Scene hierarchy.
4. Drag the `<brain_name>.nn` file from the Project window of
the Editor to the **Model** placeholder in the **3DBallLearning**
3. Select the **3DBall** prefab Agent object.
4. Drag the `<behavior_name>.nn` file from the Project window of
the Editor to the **Model** placeholder in the **Ball3DAgent**
inspector window.
5. Press the :arrow_forward: button at the top of the Editor.
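
As a convenience, here is a small hedged sketch of locating the exported file from step 1; the `first-run-0` run identifier, the `3DBallLearning` behavior name, and the destination folder are simply the example values used earlier in this guide.

```python
# Hedged helper, not part of ML-Agents: locate the exported .nn file for a run.
from pathlib import Path

run_id = "first-run-0"            # example run identifier used earlier in this guide
behavior_name = "3DBallLearning"  # matches the Behavior Name of the trained agents
model_path = Path("models") / run_id / f"{behavior_name}.nn"

if model_path.exists():
    print(f"Copy {model_path} into UnitySDK/Assets/ML-Agents/Examples/3DBall/TFModels/")
else:
    print(f"No exported model found at {model_path}")
```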

4 changes: 2 additions & 2 deletions docs/Creating-Custom-Protobuf-Messages.md
@@ -165,7 +165,7 @@ In Python, the custom field would be accessed like:
```python
...
result = env.step(...)
result[brain_name].custom_observations[0].customField
result[behavior_name].custom_observations[0].customField
```

where `brain_name` is the name of the brain attached to the agent.
where `behavior_name` is the `Behavior Name` property of the Agent.
2 changes: 1 addition & 1 deletion docs/FAQ.md
@@ -44,7 +44,7 @@ UnityAgentsException: The Communicator was unable to connect. Please make sure t

There may be a number of possible causes:

* _Cause_: There may be no agent in the scene with a LearningBrain
* _Cause_: There may be no agent in the scene
* _Cause_: On OSX, the firewall may be preventing communication with the
environment. _Solution_: Add the built environment binary to the list of
exceptions on the firewall by following
4 changes: 2 additions & 2 deletions docs/Feature-Memory.md
@@ -9,7 +9,7 @@ It is now possible to give memories to your agents. When training, the agents
will be able to store a vector of floats to be used next time they need to make
a decision.

![Brain Inspector](images/ml-agents-LSTM.png)
![Inspector](images/ml-agents-LSTM.png)

Deciding what the agents should remember in order to solve a task is not easy to
do by hand, but our training algorithms can learn to keep track of what is
@@ -19,7 +19,7 @@ important to remember with
## How to use

When configuring the trainer parameters in the `config/trainer_config.yaml`
file, add the following parameters to the Brain you want to use.
file, add the following parameters to the Behavior you want to use.

```json
use_recurrent: true
116 changes: 48 additions & 68 deletions docs/Getting-Started-with-Balance-Ball.md
@@ -32,7 +32,7 @@ and Unity, see the [installation instructions](Installation.md).

An agent is an autonomous actor that observes and interacts with an
_environment_. In the context of Unity, an environment is a scene containing an
Academy and one or more Brain and Agent objects, and, of course, the other
Academy and one or more Agent objects, and, of course, the other
entities that an agent interacts with.

![Unity Editor](images/mlagents-3DBallHierarchy.png)
@@ -45,7 +45,7 @@ window. The Inspector shows every component on a GameObject.

The first thing you may notice after opening the 3D Balance Ball scene is that
it contains not one, but several agent cubes. Each agent cube in the scene is an
independent agent, but they all share the same Brain. 3D Balance Ball does this
independent agent, but they all share the same Behavior. 3D Balance Ball does this
to speed up training since all twelve agents contribute to training in parallel.

### Academy
@@ -82,68 +82,16 @@ The 3D Balance Ball environment does not use these functions — each Agent rese
itself when needed — but many environments do use these functions to control the
environment around the Agents.

### Brain

As of v0.6, a Brain is a Unity asset and exists within the `UnitySDK` folder. These brains (ex. **3DBallLearning.asset**) are loaded into each Agent object (ex. **Ball3DAgents**). A Brain doesn't store any information about an Agent, it just
routes the Agent's collected observations to the decision making process and
returns the chosen action to the Agent. All Agents can share the same
Brain, but would act independently. The Brain settings tell you quite a bit about how
an Agent works.

You can create new Brain assets by selecting `Assets ->
Create -> ML-Agents -> Brain`. There are 3 types of Brains.
The **Learning Brain** is a Brain that uses a trained neural network to make decisions.
When Unity is connected to Python, the external process will be controlling the Brain.
The external process that is training the neural network will take over decision making for the agents
and ultimately generate a trained neural network. You can also use the
**Learning Brain** with a pre-trained model.
The **Heuristic** Brain allows you to hand-code the Agent logic by extending
the Decision class.
Finally, the **Player** Brain lets you map keyboard commands to actions, which
can be useful when testing your agents and environment. You can also implement your own type of Brain.

In this tutorial, you will use the **Learning Brain** for training.

#### Vector Observation Space

Before making a decision, an agent collects its observation about its state in
the world. The vector observation is a vector of floating point numbers which
contain relevant information for the agent to make decisions.

The Brain instance used in the 3D Balance Ball example uses the **Continuous**
vector observation space with a **State Size** of 8. This means that the feature
vector containing the Agent's observations contains eight elements: the `x` and
`z` components of the agent cube's rotation and the `x`, `y`, and `z` components
of the ball's relative position and velocity. (The observation values are
defined in the Agent's `CollectObservations()` function.)

#### Vector Action Space

An Agent is given instructions from the Brain in the form of *actions*.
ML-Agents toolkit classifies actions into two types: the **Continuous** vector
action space is a vector of numbers that can vary continuously. What each
element of the vector means is defined by the Agent logic (the PPO training
process just learns what values are better given particular state observations
based on the rewards received when it tries different values). For example, an
element might represent a force or torque applied to a `Rigidbody` in the Agent.
The **Discrete** action vector space defines its actions as tables. An action
given to the Agent is an array of indices into tables.

The 3D Balance Ball example is programmed to use both types of vector action
space. You can try training with both settings to observe whether there is a
difference. (Set the `Vector Action Space Size` to 4 when using the discrete
action space and 2 when using continuous.)

### Agent

The Agent is the actor that observes and takes actions in the environment. In
the 3D Balance Ball environment, the Agent components are placed on the twelve
"Agent" GameObjects. The base Agent object has a few properties that affect its
behavior:

* **Brain** — Every Agent must have a Brain. The Brain determines how an Agent
makes decisions. All the Agents in the 3D Balance Ball scene share the same
Brain.
* **Behavior Parameters** — Every Agent must have a Behavior. The Behavior
determines how an Agent makes decisions. More on Behavior Parameters in
the next section.
* **Visual Observations** — Defines any Camera objects used by the Agent to
observe its environment. 3D Balance Ball does not use camera observations.
* **Max Step** — Defines how many simulation steps can occur before the Agent
@@ -162,22 +110,54 @@ The Ball3DAgent subclass defines the following methods:
training generalizes to more than a specific starting position and agent cube
attitude.
* agent.CollectObservations() — Called every simulation step. Responsible for
collecting the Agent's observations of the environment. Since the Brain
instance assigned to the Agent is set to the continuous vector observation
collecting the Agent's observations of the environment. Since the Behavior
Parameters of the Agent are set to use a vector observation
space with a state size of 8, the `CollectObservations()` must call
`AddVectorObs` such that vector size adds up to 8.
`AddVectorObs` such that vector size adds up to 8.
* agent.AgentAction() — Called every simulation step. Receives the action chosen
by the Brain. The Ball3DAgent example handles both the continuous and the
discrete action space types. There isn't actually much difference between the
two state types in this environment — both vector action spaces result in a
by the Policy. Each vector action results in a
small change in the agent cube's rotation at each step. The `AgentAction()` function
assigns a reward to the Agent; in this example, an Agent receives a small
positive reward for each step it keeps the ball on the agent cube's head and a larger,
negative reward for dropping the ball. An Agent is also marked as done when it
drops the ball so that it will reset with a new ball for the next simulation
step.
* agent.Heuristic() - When the `Use Heuristic` checkbox is checked in the Behavior
Parameters of the Agent, the Agent will use its `Heuristic()` method to generate
its actions. The `Heuristic()` method returns an array of
floats. In the case of the Ball 3D Agent, the `Heuristic()` method converts the
keyboard inputs into actions.


#### Behavior Parameters : Vector Observation Space

Before making a decision, an agent collects its observation about its state in
the world. The vector observation is a vector of floating point numbers which
contain relevant information for the agent to make decisions.

The Behavior Parameters of the 3D Balance Ball example use a **Space Size** of 8.
This means that the feature
vector containing the Agent's observations contains eight elements: the `x` and
`z` components of the agent cube's rotation and the `x`, `y`, and `z` components
of the ball's relative position and velocity. (The observation values are
defined in the Agent's `CollectObservations()` function.)

#### Behavior Parameters : Vector Action Space

An Agent is given instructions in the form of a float array of *actions*.
ML-Agents toolkit classifies actions into two types: the **Continuous** vector
action space is a vector of numbers that can vary continuously. What each
element of the vector means is defined by the Agent logic (the training
process just learns what values are better given particular state observations
based on the rewards received when it tries different values). For example, an
element might represent a force or torque applied to a `Rigidbody` in the Agent.
The **Discrete** action vector space defines its actions as tables. An action
given to the Agent is an array of indices into tables.

The 3D Balance Ball example is programmed to use the continuous action
space with a `Space Size` of 2.
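
To make the two sizes concrete, here is a small illustrative sketch (plain NumPy, not the ML-Agents API) of the array shapes implied by these Behavior Parameters for the twelve agents in the scene; the mapping of the two action values to rotation axes is an assumption.

```python
# Illustrative only: array shapes implied by the 3D Balance Ball settings.
import numpy as np

num_agents = 12  # twelve agent cubes in the scene

# Vector observation Space Size 8: cube rotation (x, z) plus the ball's
# relative position (x, y, z) and velocity (x, y, z).
observations = np.zeros((num_agents, 8), dtype=np.float32)

# Continuous vector action Space Size 2 (assumed: one value per rotation axis).
actions = np.random.uniform(-1.0, 1.0, size=(num_agents, 2)).astype(np.float32)

print(observations.shape, actions.shape)  # (12, 8) (12, 2)
```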

## Training the Brain with Reinforcement Learning
## Training with Reinforcement Learning

Now that we have an environment, we can perform the training.

@@ -272,11 +252,11 @@ From TensorBoard, you will see the summary statistics:

![Example TensorBoard Run](images/mlagents-TensorBoard.png)

## Embedding the Trained Brain into the Unity Environment (Experimental)
## Embedding the Model into the Unity Environment

Once the training process completes and saves the model
(denoted by the `Saved Model` message), you can add it to the Unity project and
use it with Agents having a **Learning Brain**.
use it with compatible Agents (the Agents that generated the model).
__Note:__ Do not just close the Unity Window once the `Saved Model` message appears.
Either wait for the training process to close the window or press Ctrl+C at the
command-line prompt. If you close the window manually, the `.nn` file
@@ -285,6 +265,6 @@ containing the trained model is not exported into the ml-agents folder.
### Embedding the trained model into Unity

To embed the trained model into Unity, follow the later part of [Training the
Brain with Reinforcement
Learning](Basic-Guide.md#training-the-brain-with-reinforcement-learning) section
Model with Reinforcement
Learning](Basic-Guide.md#training-the-model-with-reinforcement-learning) section
of the Basic Guide page.
10 changes: 5 additions & 5 deletions docs/Glossary.md
@@ -6,13 +6,13 @@
environment.
* **Agent** - Unity Component which produces observations and takes actions in
the environment. An Agent's actions are determined by decisions produced by a
linked Brain.
* **Brain** - Unity Asset which makes decisions for the agents linked to it.
* **Decision** - The specification produced by a Brain for an action to be
Policy.
* **Policy** - The decision making mechanism, typically a neural network model.
* **Decision** - The specification produced by a Policy for an action to be
carried out given an observation.
* **Editor** - The Unity Editor, which may include any pane (e.g. Hierarchy,
Scene, Inspector).
* **Environment** - The Unity scene which contains Agents, Academy, and Brains.
* **Environment** - The Unity scene which contains Agents and the Academy.
* **FixedUpdate** - Unity method called each time the game engine is
stepped. ML-Agents logic should be placed here.
* **Frame** - An instance of rendering the main camera for the display.
@@ -31,4 +31,4 @@
* **External Coordinator** - ML-Agents class responsible for communication with
outside processes (in this case, the Python API).
* **Trainer** - Python class which is responsible for training a given
Brain. Contains TensorFlow graph which makes decisions for Learning Brain.
group of Agents.