-
Notifications
You must be signed in to change notification settings - Fork 4.3k
1 to 1 Brain to Agent #2729
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
1 to 1 Brain to Agent #2729
Conversation
This is a work in progess In this PR : - Deleted all Brain Objects - Moved the BrainParameters into the Agent - Gave the Agent a Heuristic method (see Balance Ball for example) - Modified the Communicator and ModelRunner : Put can only take one agent at a time - Made the IBrain Interface with RequestDecision and DecideAction method No changes made to Python [Design Doc](https://docs.google.com/document/d/1hBhBxZ9lepGF4H6fc6Hu6AW7UwOmnyX3trmgI3HpOmo/edit#)
@@ -68,6 +65,15 @@ public override void AgentReset() | |||
SetResetParameters(); | |||
} | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is what would replace the Heuristic and Player Brains
m_BrainFactoryParameters.brainParameters, | ||
model, | ||
m_BrainFactoryParameters.inferenceDevice, | ||
behaviorName); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It feels strange to me that the agent knows about and can construct all of the concrete IBrain implementations. Seems like an information leak between these two objects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is an interesting point. Should I move this logic into a Factory or is the problem deeper than that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a Factory is sufficient.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the instantiation of brains should happen external to the Agent class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this is necessary for now until we get to the end of the refactor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made factory now but we should discuss this point further.
@@ -621,7 +640,7 @@ void SendInfoToBrain() | |||
m_Info.maxStepReached = m_MaxStepReached; | |||
m_Info.id = m_Id; | |||
|
|||
brain.SubscribeAgentForDecision(this); | |||
m_Brain.RequestDecision(this); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would be great if the agent only had to call RequestDecision
(much better name than SubscribeAgentForDecision
😂) and DecideAction
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't that what is going on in this PR?
Is there something I am missing to get there ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, my comment lacked context. I like the fact that this is happening here. I feel like this is the only interaction Agent should have with a brain IMHO. I don't think Agent should be instantiating Brains itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting. I don't see how this would happen, since Agent and brains would be tightly coupled in this PR, I was thinking we should have the UI to create the brain live on the Agent inspector. The creation of the factory was to enable the creation of a brain anywhere so I assumed creating in the Agent. If something must pass the Brain to the Agent, we need to think of what that something is and how the user will parameterize it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not saying this is a requirement for this PR to get approved, but there are a couple of red flags that pop out to me. Even though the factory 'hides' different classes of the IPolicy implementation, the remnants of their inner workings still remain.
Agents still know about:
- NNModels
- Heuristic functions
- Inference Device
- Whether or not the communicator is connected
- Whether or not to use a Heuristic brain (which is like a meta leak)
This is all information being leaked into the agent, when in theory, it could just care about the IPolicy it is interacting with and nothing else.
In terms of ways to get this done, it may be controversial to say, but if IPolicy's were components (MonoBehaviours) you could easily parameterize them and instantiate them without having a factory or any of this information in Agent. Agent could then call GetComponent<IPolicy>();
to retrieve the concrete Policy without knowing any of the details of its implementation. I know we've talked about moving away from Unity Dependencies, but to be honest, deriving from MonoBehaviour and being able to add it as a component feels better to me.
Yeah, I think the overall approach makes sense. It's slightly counterintuitive that the ModelRunner or Communicator 's DecideBatch() get called (# of agents) times and only the first one does any work, but I don't have a better way to do it. |
Feels weird too and I realize I did not explicitly call this out in the PR comment. Hopefully, this will be temporary and after the Observation and Action refactor, we will be able to return actual action values rather than void in DecideAction. |
Will put this on hold until #2731 is good to go |
* Delting all brains, setting Behavior Parameters * Removing learning from all the configs * Forgot one agent * Removing leftover brains * Dead meta file * Adding Heuristics
Overall it looks good. I'd like to discuss the instantiation of Policies more, but that shouldn't block this from going in. |
* Made Policy Factory a component * Broke the Heuristic * fix bug * BugFix * Reimplemented Heuristic * Changing all of the prefabs * forgotten file * Fixing under which conditions the Heuristic policy is used * Dispose of brain
@@ -193,7 +194,7 @@ bool IsCommunicatorOn | |||
|
|||
// Signals to all the Brains at each environment step so they can decide | |||
// actions for their agents. | |||
public event System.Action BrainDecideAction; | |||
public event System.Action DecideAction; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: update "Brains " in comment above
@@ -524,7 +534,7 @@ void EnvironmentStep() | |||
|
|||
using (TimerStack.Instance.Scoped("BrainDecideAction")) | |||
{ | |||
BrainDecideAction?.Invoke(); | |||
DecideAction?.Invoke(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Update timer name ("BrainDecideAction")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Hopefully it doesn't conflict too much with my changes...
* initial changes to the documentation * More documentation changes, not done. * More documentation changes * More docs * Changed the images * addressing comments * Adding one line to the migrating doc
I think this was broken by this change. The checks for visual obs resolution was ignored no? |
can you merge the latest develop in to get the Yamato pipelines to run? |
This is a work in progress
In this PR :
No changes made to Python
Design Doc
To do before ready :
Editing all the example scenesEdit comments in code