Changes to the LL-API - Refactor of “done” logic #3681
@@ -3,12 +3,12 @@
The aim of this API is to expose groups of similar Agents evolving in Unity
It seems like there is still a lot of discussion about "groups" here. If the concept no longer exists in the API, should we still be talking about it?
Left a bit of feedback, but didn't get into the details of the Agent processing.
If you want, feel free to comment out the gym yamato test for now, until you get it working with single-agent in the other PR.
Thanks, I incorporated some of these changes. I want this to be informally approved before I make another PR targeting this one with the docs changes.
I think I will wait for #3725 to be merged and then merge master into this one to fix the missing test.
* Edited the Documentation for the changes to the LLAPI
* Forgot the CHANGELOG
* Fixing a typo raised by #3731
* [skip ci] Update com.unity.ml-agents/CHANGELOG.md
* [skip ci] Update docs/Migrating.md
* [skip ci] Update docs/Python-API.md
* [skip ci] Update docs/Python-API.md
that will share the same policy or behavior. All Agents in a group have the same goal
and reward signals.
An Agent "Behavior" is a group of Agents identified by a `BehaviorName` that share the same
observations and action types (described in their `BehaviorSpec`). You can think about Agent
Do you mean "observation and action types" ?
I mean observations types. There can be multiple observations of different shapes.
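To make the point about multiple observations concrete, here is a minimal sketch of a behavior whose spec lists several observation shapes. `SketchBehaviorSpec`, its field names, the `"continuous"` label, and the behavior names are hypothetical stand-ins for illustration, not the classes or values from this PR.

```python
from typing import Dict, List, NamedTuple, Tuple


class SketchBehaviorSpec(NamedTuple):
    """Hypothetical stand-in for a BehaviorSpec (not the library class)."""

    observation_shapes: List[Tuple[int, ...]]  # one shape per observation
    action_type: str  # assumed label, e.g. "continuous" or "discrete"
    action_size: int


def group_by_behavior(agents: List[Tuple[int, str]]) -> Dict[str, List[int]]:
    """Group agent ids by the behavior name they are registered under."""
    groups: Dict[str, List[int]] = {}
    for agent_id, behavior_name in agents:
        groups.setdefault(behavior_name, []).append(agent_id)
    return groups


# One behavior with two observations of different shapes:
# a flat vector of 8 floats and an 84x84 RGB image.
specs = {"Walker": SketchBehaviorSpec([(8,), (84, 84, 3)], "continuous", 4)}

# Agents 0 and 1 share the "Walker" behavior; agent 2 uses another one.
agents = [(0, "Walker"), (1, "Walker"), (2, "Striker")]
groups = group_by_behavior(agents)
```

Every agent listed under a behavior name is expected to produce observations matching each shape in that behavior's spec.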
Both `DecisionSteps` and `TerminalSteps` contain information such as
the observations, the rewards and the agent identifiers.
`DecisionSteps` also contains action masks for the next action while `TerminalSteps`
contains the reason for termination (did the Agent reach its maximum step and was
"or was"?
Right now reaching max_step is the only way to interrupt an agent, so I meant `and`. Is that reasonable?
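The split described in the doc text could be sketched as follows. The two dataclasses are hypothetical stand-ins that mirror only what the diff mentions (observations, rewards, agent identifiers, action masks, and a max-step flag); the field names and the `summarize` helper are assumptions, not the API from this PR.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class SketchDecisionSteps:
    """Stand-in: agents that need an action on the next step."""

    obs: List[List[float]]
    reward: List[float]
    agent_id: List[int]
    action_mask: List[List[bool]]  # True marks an action that is masked out


@dataclass
class SketchTerminalSteps:
    """Stand-in: agents whose episode ended on this step."""

    obs: List[List[float]]
    reward: List[float]
    agent_id: List[int]
    max_step: List[bool]  # True if the agent was interrupted at max step


def summarize(
    decision: SketchDecisionSteps, terminal: SketchTerminalSteps
) -> Tuple[List[int], Dict[int, str]]:
    """Return the ids awaiting an action and why each finished agent ended."""
    awaiting_action = list(decision.agent_id)
    ended = {
        agent_id: ("max_step reached" if hit_max else "episode ended")
        for agent_id, hit_max in zip(terminal.agent_id, terminal.max_step)
    }
    return awaiting_action, ended


decision = SketchDecisionSteps(
    obs=[[0.1, 0.2], [0.3, 0.4]],
    reward=[0.0, 1.0],
    agent_id=[0, 1],
    action_mask=[[False, True], [False, False]],
)
terminal = SketchTerminalSteps(
    obs=[[0.5, 0.6]], reward=[-1.0], agent_id=[2], max_step=[True]
)
awaiting_action, ended = summarize(decision, terminal)
```

Keeping the two containers separate means a trainer only computes actions for the ids in the decision batch, while the terminal batch carries the final reward and the reason the episode stopped.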
@chriselion @awjuliani @andrewcoh @ervteng
com.unity.ml-agents/CHANGELOG.md
Outdated
@@ -12,6 +12,8 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
- The Jupyter notebooks have been removed from the repository.
- Introduced the `SideChannelUtils` to register, unregister and access side channels.
- `Academy.FloatProperties` was removed, please use `SideChannelUtils.GetSideChannel<FloatPropertiesChannel>()` instead.
- Removed the multi-agent gym option from the gym wrapper.
Maybe a line here telling people they should use the LL-API if they want multi-agent support for their custom trainers/research.
LGTM.
Proposed change(s)
Proposed changes to the API include:
Useful links (Github issues, JIRA tickets, ML-Agents forum threads etc.)
Design Document
Brainstorm document
Jira : MLA-793
Types of change(s)
[ ] Bug fix
[ ] Other (please describe)
Checklist
Other comments
The gym test is broken and needs to be replaced with a single-agent gym environment.
The documentation needs to be updated once this PR is approved.