Trim some public fields on the Agent #3269

vincentpierre · 2020-01-22T01:47:54Z

Removed the following fields from the Agent:

~~SetRewards()~~
agentParameters.maxStep --> maxStep
~~IsDone()~~
~~IsMaxStepReached()~~
Info
ResetReward()
GetReward()
GetValueEstimates()
UpdateValueAction()
VectorAction.Value

UnitySDK/Assets/ML-Agents/Examples/3DBall/Scripts/Ball3DAgent.cs

surfnerd · 2020-01-22T18:46:11Z

~~Before approving this, I'd like to see some of the training results with these changes. I'm not sure how modifying the reward function in this way will change things.~~

chriselion · 2020-01-22T18:53:22Z

UnitySDK/Assets/ML-Agents/Scripts/Agent.cs

-        /// <returns>
-        /// <c>true</c>, if max step reached was reached, <c>false</c> otherwise.
-        /// </returns>
-        public bool IsMaxStepReached()


These seem like they would be useful for users to have access to (although maybe reworked as properties).

I don't see a use case where the user knowing when the Agent is Done or reached max step would be useful. Do you have an example?

I do not have an example, but I'm sure some users will after we remove it, and it seems like very little overhead to keep.

IsMaxStepReached can be derived from GetStepCount and maxStep, so that might be OK to remove. But I think keeping it as a method hides the internals better (i.e. otherwise you have to know that maxStep <= 0 means IsMaxStepReached is never true).
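To make the derivation concrete, here is a minimal sketch of how such a check could be computed from the step count and maxStep. `StepCounter` and `Step()` are hypothetical stand-ins written for illustration, not the ML-Agents `Agent` class, and the exact comparison used by the real implementation is an assumption.

```csharp
using System;

// Hypothetical stand-in for the relevant Agent state; not the ML-Agents class.
public class StepCounter
{
    // Assumed convention from the discussion: maxStep <= 0 means "no step limit".
    public int maxStep;

    int m_StepCount;

    public int GetStepCount() => m_StepCount;

    // Advances the counter by one step.
    public void Step() => m_StepCount++;

    // Derived check: can never be true when maxStep <= 0.
    public bool IsMaxStepReached() => maxStep > 0 && m_StepCount >= maxStep;
}
```

Keeping the check behind a method means callers do not need to remember the `maxStep <= 0` special case themselves.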

For Done(), it's generally really frustrating to be able to set something but not read its value.

I don't see a use case where the user knowing when the Agent is Done or reached max step would be useful. Do you have an example?

@vincentpierre I don't think the argument "I can't think of a reason to have it" is sufficient to remove the accessor of a state that is directly modifiable by the user. We use it internally in Agent for a number of reasons. One of the thousands of users that we have may use the IsDone accessor for reasons we haven't thought of, whether it's custom UI or an event trigger for something else. I'm not sure how they might use it, but it feels really strange to me to let users modify the state of Agent without being able to ask what that state is.

I would argue that, in general, if we are allowing a user to modify a state, we should also allow them to access that state.

A concrete example of why we should be able to read it is right in our own tests:

ml-agents/UnitySDK/Assets/ML-Agents/Editor/Tests/MLAgentsEditModeTest.cs

Lines 23 to 26 in 106e5b4

    public bool IsDone()
    {
        return (bool)typeof(Agent).GetField("m_Done", BindingFlags.Instance | BindingFlags.NonPublic).GetValue(this);
    }

Yes, we are testing the inner workings of the Agent in the tests. I think that is okay; I suppose m_Done can be marked as an internal field on the Agent rather than private.
I think what we need to have access to for testing and for using the tool are very different things.

UnitySDK/Assets/ML-Agents/Scripts/Agent.cs

surfnerd · 2020-01-22T18:55:41Z

@andrewcoh @ervteng can you take a look and leave feedback? Arguing for or against the removal of SetReward is outside my area of expertise.

ervteng · 2020-01-22T19:02:56Z

I'm not in support of removing SetReward(). Removing it makes things a little less confusing but introduces many cases that are harder to implement. For instance, say you have an incremental reward for velocity/survival, but if the Agent hits a wall or falls off the platform, you want to kill it and give a -1 reward.

So you put the SetReward(-1) and Done() in the collider OnTriggerEnter. Without SetReward or any way to read the current reward, you'll have no idea how much incremental reward was gathered by the time you hit the callback, and the Agent gets some unknown reward > -1.
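The scenario above can be sketched with a small model of per-step reward bookkeeping. `RewardTracker` is a hypothetical class written for illustration, not the ML-Agents `Agent`; the bookkeeping inside `SetReward` is an assumption about how overriding a step reward could work.

```csharp
using System;

// Hypothetical minimal model of reward bookkeeping; not the ML-Agents Agent class.
public class RewardTracker
{
    float m_Reward;            // reward gathered during the current step
    float m_CumulativeReward;  // reward gathered over the whole episode
    bool m_Done;

    public float Reward => m_Reward;
    public float CumulativeReward => m_CumulativeReward;
    public bool IsDone => m_Done;

    // Adds to whatever has already been gathered this step.
    public void AddReward(float increment)
    {
        m_Reward += increment;
        m_CumulativeReward += increment;
    }

    // Overrides the step reward, discarding earlier increments from this step.
    public void SetReward(float reward)
    {
        m_CumulativeReward += reward - m_Reward;
        m_Reward = reward;
    }

    // Marks the episode as finished.
    public void Done() => m_Done = true;
}
```

With this model, calling `AddReward(0.1f)` a few times and then `SetReward(-1f)` followed by `Done()` leaves `Reward` at exactly -1. Using `AddReward(-1f)` in the same place would instead leave the step reward at -1 plus whatever had been gathered, which is the unknown value described above.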

awjuliani · 2020-01-22T19:29:09Z

Agree with @ervteng. SetReward() serves a very specific purpose that is of value in a number of different kinds of games and environments.

vincentpierre · 2020-01-22T20:24:54Z

I reverted my changes to SetReward.

andrewcoh · 2020-01-22T20:28:39Z

Seems like it could be a little late but I also agree with @ervteng. SetReward enables implementation of reward functions that would not be possible otherwise. Should we name it something more explicit like SetFullStepReward so that it's less confusing?

ervteng · 2020-01-22T20:45:50Z

Seems like it could be a little late but I also agree with @ervteng. SetReward enables implementation of reward functions that would not be possible otherwise. Should we name it something more explicit like SetFullStepReward so that it's less confusing?

Some other possibilities:

Change name to something like SetIncrementalReward()
Remove SetReward, add method to reset reward to 0 (ResetIncrementalReward()?)
Add method to get current reward (GetIncrementalReward?)
Remove SetReward, allow passing a final reward to Done()

surfnerd · 2020-01-22T20:47:08Z

UnitySDK/Assets/ML-Agents/Scripts/Agent.cs

-        /// <summary>
-        /// Resets the step reward and possibly the episode reward for the agent.
-        /// </summary>
-        public void ResetReward()


The places where this was removed don't account for the case where m_Done is true.

The reset logic is really confusing. We have:

ResetIfDone
_AgentReset
AgentReset
ResetData
ForceReset

Just by looking, I have no idea which is done when or why.

This issue may not be in the scope of this PR, but it feels like something that should be addressed.

They do. I reset the cumulative reward in ResetData and ResetReward.

UnitySDK/Assets/ML-Agents/Scripts/Agent.cs

vincentpierre · 2020-01-22T22:09:10Z

I will not rename SetReward in this PR. I think we can do that later

andrewcoh · 2020-01-22T22:55:56Z

Remove SetReward, allow passing a final reward to Done()

@ervteng I think there are situations other than termination states where we might want the functionality of SetReward

vincentpierre · 2020-01-22T23:13:37Z

On the question of IsDone(), I think of it more like an event than a change of state. I think it is okay to prevent the user from seeing whether the Agent is Done, since it is an internal state that is only relevant to the Policy, not to the game mechanics.
I am in general more in favor of removing APIs that are not used most of the time. If there is really a use case for knowing if an Agent is done and has yet to reset, I am curious...

On the SetReward topic, I think you are unanimous that it is useful (so I won't fight too long), but I think it introduces a lot of confusion (that I don't think renaming would entirely solve). I personally think adding a reward should be an irrevocable action.

surfnerd · 2020-01-23T17:52:37Z

I think it is okay to prevent the user from seeing whether the Agent is Done, since it is an internal state that is only relevant to the Policy, not to the game mechanics.

AgentInfo has a public done member on it. It's not really removing the API, just hiding it in one place arbitrarily and passing it to another as part of our public API.

vincentpierre · 2020-01-23T18:13:51Z

AgentInfo has a public done member on it. It's not really removing the API, just hiding it in one place arbitrarily and passing it to another as part of our public API.

I made the AgentInfo property of the Agent private. So yes I am hiding it. What part of the Public API are we passing it to? AgentReset()? I think it is okay, it is more of a delayed event.

surfnerd · 2020-01-23T21:50:30Z

I made the AgentInfo property of the Agent private. So yes I am hiding it. What part of the Public API are we passing it to? AgentReset()? I think it is okay, it is more of a delayed event.

AgentInfo is passed to RequestDecision

ml-agents/UnitySDK/Assets/ML-Agents/Scripts/Policy/IPolicy.cs

Line 21 in 0f93186

    
           void RequestDecision(AgentInfo info, List<ISensor> sensors, Action<AgentAction> action);

AgentInfo has a public done field.

ml-agents/UnitySDK/Assets/ML-Agents/Scripts/Agent.cs

Line 36 in 0f93186

public bool done;

chriselion

Looks good. Don't forget to update the migration guide (but that can be another PR).

vincentpierre requested review from surfnerd and chriselion January 22, 2020 01:47

vincentpierre self-assigned this Jan 22, 2020

surfnerd reviewed Jan 22, 2020

UnitySDK/Assets/ML-Agents/Examples/3DBall/Scripts/Ball3DAgent.cs (outdated)

chriselion reviewed Jan 22, 2020

UnitySDK/Assets/ML-Agents/Scripts/Agent.cs

surfnerd requested review from ervteng and andrewcoh January 22, 2020 18:55

Triming some of the methods of the agent but left SetReward

106e5b4

vincentpierre force-pushed the develop-trim01 branch from 6f54cac to 106e5b4 Compare January 22, 2020 20:24

surfnerd reviewed Jan 22, 2020

chriselion reviewed Jan 22, 2020

UnitySDK/Assets/ML-Agents/Scripts/Agent.cs

vincentpierre added 2 commits January 23, 2020 10:58

Fixing bugs

aa3b0ca

modifying the environments

5a66fb1

Reintroducing IsDone and IsMaxStepReached

5184238

surfnerd approved these changes Jan 23, 2020

chriselion approved these changes Jan 23, 2020

vincentpierre added 2 commits January 24, 2020 10:04

Updating the Migrating doc

5e4fce7

more details on the Migration

068ba0b

surfnerd approved these changes Jan 24, 2020

vincentpierre merged commit b8ebd41 into master Jan 24, 2020

delete-merged-branch bot deleted the develop-trim01 branch January 24, 2020 18:23

github-actions bot locked as resolved and limited conversation to collaborators May 16, 2021
