1 to 1 Brain to Agent #2729

vincentpierre · 2019-10-14T22:07:02Z

This is a work in progress
In this PR :

Deleted all Brain Objects
Moved the BrainParameters into the Agent
Gave the Agent a Heuristic method (see Balance Ball for example)
Modified the Communicator and ModelRunner : Put can only take one agent at a time
Made the IBrain Interface with RequestDecision and DecideAction methods
Renamed Brain to Policy

No changes made to Python
[Design Doc](https://docs.google.com/document/d/1hBhBxZ9lepGF4H6fc6Hu6AW7UwOmnyX3trmgI3HpOmo/edit#)

To do before ready :

Changes to documentation: Develop one to one documentation #2742
~~Editing all the example scenes~~
~~Edit comments in code~~

UnitySDK/Assets/ML-Agents/.editorconfig

vincentpierre · 2019-10-14T22:37:43Z

UnitySDK/Assets/ML-Agents/Examples/3DBall/Scripts/Ball3DAgent.cs

@@ -68,6 +65,15 @@ public override void AgentReset()
        SetResetParameters();
    }



This is what would replace the Heuristic and Player Brains
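
A minimal sketch of the idea, not the code from this PR: the agent itself exposes a Heuristic method that returns an action vector from hand-written logic, which is the role the Heuristic and Player Brains used to fill. `AgentSketch`, `BallAgentSketch`, and the `Horizontal`/`Vertical` fields are hypothetical stand-ins (a real agent would read Unity input instead), and the order and sign of the two actions are assumptions.

```csharp
// Hypothetical stand-in for the Agent base class; not the ML-Agents API.
public abstract class AgentSketch
{
    // Returns one action vector computed by hand-written logic.
    // The default implementation returns a single zero action.
    public virtual float[] Heuristic()
    {
        return new float[1];
    }
}

public class BallAgentSketch : AgentSketch
{
    // Stand-ins for player input axes, each expected in [-1, 1].
    public float Horizontal;
    public float Vertical;

    public override float[] Heuristic()
    {
        // Two continuous actions; the layout and the sign flip are assumed.
        return new[] { -Horizontal, Vertical };
    }
}
```

With this shape, switching between player control and a scripted behavior only means overriding one method on the agent, with no separate Brain asset involved.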

surfnerd · 2019-10-14T23:07:03Z

UnitySDK/Assets/ML-Agents/Scripts/Agent.cs

+                m_BrainFactoryParameters.brainParameters,
+                model,
+                m_BrainFactoryParameters.inferenceDevice,
+                behaviorName);


It feels strange to me that the agent knows about and can construct all of the concrete IBrain implementations. Seems like an information leak between these two objects.

That is an interesting point. Should I move this logic into a Factory or is the problem deeper than that?

I think a Factory is sufficient.

I think the instantiation of brains should happen external to the Agent class.

Maybe this is necessary for now until we get to the end of the refactor.

Made factory now but we should discuss this point further.
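
To make the factory idea concrete, here is a rough sketch under stated assumptions. Every name in it (`IPolicy`, `PolicySettings`, `PolicyFactory`, the three policy classes) is hypothetical and only mirrors the discussion. The one ordering rule taken from this PR is that the heuristic has priority over training; the rest of the selection order is an assumption.

```csharp
using System;

public interface IPolicy
{
    void RequestDecision();
    void DecideAction();
}

// Runs a hand-written function supplied by the agent.
public sealed class HeuristicPolicy : IPolicy
{
    private readonly Func<float[]> m_Heuristic;
    public float[] LastAction { get; private set; }

    public HeuristicPolicy(Func<float[]> heuristic) { m_Heuristic = heuristic; }

    public void RequestDecision() { LastAction = m_Heuristic(); }
    public void DecideAction() { }
}

// Placeholder for a policy driven by an external trainer.
public sealed class RemotePolicy : IPolicy
{
    public void RequestDecision() { }
    public void DecideAction() { }
}

// Placeholder for a policy that runs a trained model locally.
public sealed class InferencePolicy : IPolicy
{
    public string ModelName { get; }
    public InferencePolicy(string modelName) { ModelName = modelName; }
    public void RequestDecision() { }
    public void DecideAction() { }
}

public sealed class PolicySettings
{
    public bool UseHeuristic;
    public bool CommunicatorConnected;
    public string ModelName;          // null when no model is assigned
    public Func<float[]> Heuristic;
}

public static class PolicyFactory
{
    // Only this method knows the concrete policy types.
    public static IPolicy Create(PolicySettings settings)
    {
        if (settings.UseHeuristic) return new HeuristicPolicy(settings.Heuristic);
        if (settings.CommunicatorConnected) return new RemotePolicy();
        if (settings.ModelName != null) return new InferencePolicy(settings.ModelName);
        return new HeuristicPolicy(settings.Heuristic); // fallback when nothing else applies
    }
}
```

The caller only ever holds an `IPolicy`; whether that caller should be the Agent or something outside it is the open question in this thread.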

surfnerd · 2019-10-14T23:08:03Z

UnitySDK/Assets/ML-Agents/Scripts/Agent.cs

@@ -621,7 +640,7 @@ void SendInfoToBrain()
            m_Info.maxStepReached = m_MaxStepReached;
            m_Info.id = m_Id;

-            brain.SubscribeAgentForDecision(this);
+            m_Brain.RequestDecision(this);


Would be great if the agent only had to call RequestDecision (much better name than SubscribeAgentForDecision 😂) and DecideAction.

Isn't that what is going on in this PR?
Is there something I am missing to get there ?

Sorry, my comment lacked context. I like the fact that this is happening here. I feel like this is the only interaction Agent should have with a brain IMHO. I don't think Agent should be instantiating Brains itself.

Interesting. I don't see how this would happen. Since Agent and brains are tightly coupled in this PR, I was thinking the UI to create the brain should live on the Agent inspector. The factory was created so that a brain could be created anywhere, and I assumed it would be created in the Agent. If something must pass the Brain to the Agent, we need to think about what that something is and how the user will parameterize it.

Not saying this is a requirement for this PR to get approved, but there are a couple of red flags that pop out to me. Even though the factory 'hides' different classes of the IPolicy implementation, the remnants of their inner workings still remain.
Agents still know about:

* NNModels
* Heuristic functions
* Inference Device
* Whether or not the communicator is connected
* Whether or not to use a Heuristic brain (which is like a meta leak)

This is all information being leaked into the agent, when in theory, it could just care about the IPolicy it is interacting with and nothing else.

In terms of ways to get this done, it may be controversial to say, but if IPolicy implementations were components (MonoBehaviours), you could easily parameterize and instantiate them without having a factory or any of this information in Agent. Agent could then call GetComponent<IPolicy>() to retrieve the concrete Policy without knowing any of the details of its implementation. I know we've talked about moving away from Unity dependencies, but to be honest, deriving from MonoBehaviour and being able to add it as a component feels better to me.
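
The component-based lookup could look roughly like the sketch below. It does not use Unity: `ComponentHost` is a hypothetical stand-in for a GameObject's component list, and its `GetComponent<T>()` only imitates the shape of the Unity call. `ComponentAgentSketch` and `CountingPolicy` are also made up for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

public interface IPolicy
{
    void RequestDecision();
    void DecideAction();
}

// Hypothetical stand-in for the object that owns the components.
public sealed class ComponentHost
{
    private readonly List<object> m_Components = new List<object>();

    public void AddComponent(object component) { m_Components.Add(component); }

    // Returns the first attached component assignable to T, or null if none.
    public T GetComponent<T>() where T : class
    {
        return m_Components.OfType<T>().FirstOrDefault();
    }
}

// The agent depends on the interface only; it never names a concrete policy.
public sealed class ComponentAgentSketch
{
    private readonly IPolicy m_Policy;

    public ComponentAgentSketch(ComponentHost host)
    {
        m_Policy = host.GetComponent<IPolicy>();
    }

    public void Step()
    {
        m_Policy.RequestDecision();
        m_Policy.DecideAction();
    }
}

// Trivial policy that counts calls, used to show the wiring.
public sealed class CountingPolicy : IPolicy
{
    public int Requests;
    public int Decisions;
    public void RequestDecision() { Requests++; }
    public void DecideAction() { Decisions++; }
}
```

In this arrangement the model, inference device, and heuristic flag would be parameters of the policy component rather than fields on the agent.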

UnitySDK/Assets/ML-Agents/Scripts/IBrain.cs

UnitySDK/Assets/ML-Agents/Scripts/ICommunicator.cs

UnitySDK/Assets/ML-Agents/Scripts/InferenceBrain/ModelRunner.cs

UnitySDK/Assets/ML-Agents/Scripts/Agent.cs

chriselion · 2019-10-15T00:15:37Z

Yeah, I think the overall approach makes sense. It's slightly counterintuitive that the ModelRunner's or Communicator's DecideBatch() gets called (# of agents) times and only the first one does any work, but I don't have a better way to do it.

vincentpierre · 2019-10-15T00:21:16Z

> Yeah, I think the overall approach makes sense. It's slightly counterintuitive that the ModelRunner's or Communicator's DecideBatch() gets called (# of agents) times and only the first one does any work, but I don't have a better way to do it.

Feels weird too and I realize I did not explicitly call this out in the PR comment. Hopefully, this will be temporary and after the Observation and Action refactor, we will be able to return actual action values rather than void in DecideAction.
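
The behavior being discussed can be illustrated with a small sketch. This is not the ModelRunner or Communicator code: `BatchRunnerSketch` and `SharedPolicySketch` are hypothetical, agents are reduced to integer ids, and the "work" is just counting. Each policy puts one agent into a shared runner, every policy then calls DecideBatch, and only the first call finds pending agents.

```csharp
using System.Collections.Generic;

// Hypothetical shared runner that batches agents from many policies.
public sealed class BatchRunnerSketch
{
    private readonly List<int> m_Pending = new List<int>();

    public int BatchesRun { get; private set; }
    public int LastBatchSize { get; private set; }

    // Queues exactly one agent per call.
    public void Put(int agentId) { m_Pending.Add(agentId); }

    // Processes everything queued so far, then clears the queue.
    // Later calls in the same step find nothing pending and return early.
    public void DecideBatch()
    {
        if (m_Pending.Count == 0) return;
        LastBatchSize = m_Pending.Count;
        BatchesRun++;
        m_Pending.Clear();
    }
}

// One policy per agent, all pointing at the same runner.
public sealed class SharedPolicySketch
{
    private readonly BatchRunnerSketch m_Runner;
    private readonly int m_AgentId;

    public SharedPolicySketch(BatchRunnerSketch runner, int agentId)
    {
        m_Runner = runner;
        m_AgentId = agentId;
    }

    public void RequestDecision() { m_Runner.Put(m_AgentId); }

    // Returns nothing, matching the void DecideAction mentioned above.
    public void DecideAction() { m_Runner.DecideBatch(); }
}
```

With three policies sharing one runner, three DecideAction calls in a step produce a single batch of size three; the other two calls are no-ops.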

vincentpierre · 2019-10-15T18:42:56Z

Will put this on hold until #2731 is good to go


surfnerd · 2019-10-16T22:30:17Z

Overall it looks good. I'd like to discuss the instantiation of Policies more, but that shouldn't block this from going in.

surfnerd · 2019-10-17T17:56:14Z

nit: Could we move all of the concrete Policies to a 'Policy' folder?


chriselion · 2019-10-21T19:35:18Z

UnitySDK/Assets/ML-Agents/Scripts/Academy.cs

@@ -193,7 +194,7 @@ bool IsCommunicatorOn

        // Signals to all the Brains at each environment step so they can decide
        // actions for their agents.
-        public event System.Action BrainDecideAction;
+        public event System.Action DecideAction;


nit: update "Brains " in comment above

chriselion · 2019-10-21T19:35:45Z

UnitySDK/Assets/ML-Agents/Scripts/Academy.cs

@@ -524,7 +534,7 @@ void EnvironmentStep()

            using (TimerStack.Instance.Scoped("BrainDecideAction"))
            {
-                BrainDecideAction?.Invoke();
+                DecideAction?.Invoke();


nit: Update timer name ("BrainDecideAction")

chriselion

Looks good. Hopefully it doesn't conflict too much with my changes...


vincentpierre · 2019-10-23T00:06:06Z

Would this be causing problems on develop now?

I think this was broken by this change. The checks for visual obs resolution were ignored, no?

surfnerd · 2019-10-23T00:35:59Z

Can you merge the latest develop in to get the Yamato pipelines to run?

vincentpierre requested review from surfnerd and chriselion October 14, 2019 22:07

vincentpierre self-assigned this Oct 14, 2019

vincentpierre commented Oct 14, 2019

UnitySDK/Assets/ML-Agents/.editorconfig Outdated

vincentpierre commented Oct 14, 2019

vincentpierre added 2 commits October 14, 2019 15:40

Removing editorconfig

43217b6

Updating BallanceBall scene

c4c3826

surfnerd reviewed Oct 14, 2019

UnitySDK/Assets/ML-Agents/Scripts/IBrain.cs Outdated

surfnerd reviewed Oct 14, 2019

UnitySDK/Assets/ML-Agents/Scripts/IBrain.cs Outdated

surfnerd reviewed Oct 14, 2019

UnitySDK/Assets/ML-Agents/Scripts/IBrain.cs Outdated

surfnerd reviewed Oct 14, 2019

UnitySDK/Assets/ML-Agents/Scripts/ICommunicator.cs Outdated

grammar mistake

9478281

chriselion reviewed Oct 14, 2019

UnitySDK/Assets/ML-Agents/Scripts/InferenceBrain/ModelRunner.cs

Clearing the Agents of the Model runner

23edabd

chriselion reviewed Oct 15, 2019

UnitySDK/Assets/ML-Agents/Scripts/Agent.cs Outdated

vincentpierre added 9 commits October 15, 2019 15:02

Added Documentation on IBrain

def23e3

Modified comments on GiveModel

324758d

Introduced a factory

356c23c

Split Learning Brain in two

b339247

Changes to walljump

31766e7

Fixing the Unit tests

2776d05

Renaming the Brain to Policy

c47c349

Heuristic now has priority over training

e832960

Edited code comments

0bb25d6

Fixing bugs

4bfb530

vincentpierre marked this pull request as ready for review October 16, 2019 21:14

vincentpierre added 2 commits October 16, 2019 15:19

Resolving conflicts

49152e4

Develop one to one scene edits (#2744)

2f12007

* Delting all brains, setting Behavior Parameters
* Removing learning from all the configs
* Forgot one agent
* Removing leftover brains
* Dead meta file
* Adding Heuristics

vincentpierre added 3 commits October 17, 2019 11:10

Moving the policies in a separate folder

e15500d

Missing meta file

4c01bee

Develop one to one with component (#2753)

3ca1516

* Made Policy Factory a component
* Broke the Heuristic
* fix bug
* BugFix
* Reimplemented Heuristic
* Changing all of the prefabs
* forgotten file
* Fixing under which conditions the Heuristic policy is used
* Dispose of brain

chriselion reviewed Oct 21, 2019

chriselion approved these changes Oct 21, 2019

vincentpierre added 5 commits October 21, 2019 14:24

Removing references to Brain in the Academy comments;

87c535d

Develop one to one documentation (#2742)

03d6712

* initial changes to the documentation
* More documentation changes, not done.
* More documentation changes
* More docs
* Changed the images
* addressing comments
* Adding one line to the migrating doc

resolving conflicts

0b47fd7

Removing warning in the Agent Inspector when Vis Obs is used

c0014e6

Fixing C# unit tests

90fac1f

surfnerd approved these changes Oct 23, 2019

Merge branch 'develop' into develop-one-to-one

6715a14

vincentpierre merged commit 22afeef into develop Oct 23, 2019

vincentpierre deleted the develop-one-to-one branch October 23, 2019 00:52

beluis3d mentioned this pull request Nov 27, 2019

Zero Brains (Agents/Policies) in getting_started notebook #2992

Closed

github-actions bot locked as resolved and limited conversation to collaborators May 17, 2021
