@@ -149,8 +149,6 @@ A `DecisionSteps` has the following fields :
 `env.step()`).
 - `reward` is a float vector of length batch size. Corresponds to the
   rewards collected by each agent since the last simulation step.
-- `done` is an array of booleans of length batch size. Is true if the
-  associated Agent was terminated during the last simulation step.
 - `agent_id` is an int vector of length batch size containing unique
   identifier for the corresponding Agent. This is used to track Agents
   across simulation steps.
@@ -174,8 +172,6 @@ A `DecisionStep` has the following fields:
 (Each array has one less dimension than the arrays in `DecisionSteps`)
 - `reward` is a float. Corresponds to the rewards collected by the agent
   since the last simulation step.
-- `done` is a bool. Is true if the Agent was terminated during the last
-  simulation step.
 - `agent_id` is an int and an unique identifier for the corresponding Agent.
 - `action_mask` is an optional list of one dimensional array of booleans.
   Only available in multi-discrete action space type.
@@ -197,8 +193,6 @@ A `TerminalSteps` has the following fields :
 `env.step()`).
 - `reward` is a float vector of length batch size. Corresponds to the
   rewards collected by each agent since the last simulation step.
-- `done` is an array of booleans of length batch size. Is true if the
-  associated Agent was terminated during the last simulation step.
 - `agent_id` is an int vector of length batch size containing unique
   identifier for the corresponding Agent. This is used to track Agents
   across simulation steps.
@@ -219,8 +213,6 @@ A `TerminalStep` has the following fields:
 (Each array has one less dimension than the arrays in `TerminalSteps`)
 - `reward` is a float. Corresponds to the rewards collected by the agent
   since the last simulation step.
-- `done` is a bool. Is true if the Agent was terminated during the last
-  simulation step.
 - `agent_id` is an int and an unique identifier for the corresponding Agent.
 - `max_step` is a bool. Is true if the Agent reached its maximum number of
   steps during the last simulation step.
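With `done` removed from all four classes, calling code presumably has to infer termination from which kind of step object an agent shows up in. The diff does not state this, so treat it as an assumption. The sketch below illustrates the idea with hypothetical stand-in classes (`DecisionStepSketch`, `TerminalStepSketch`) that carry only the fields listed in the hunks above; they are not the library's real types, and `describe` is an illustrative helper.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class DecisionStepSketch:
    """Hypothetical stand-in with the fields the diff keeps on `DecisionStep`."""
    reward: float
    agent_id: int


@dataclass
class TerminalStepSketch:
    """Hypothetical stand-in with the fields the diff keeps on `TerminalStep`."""
    reward: float
    agent_id: int
    max_step: bool


def describe(step: Union[DecisionStepSketch, TerminalStepSketch]) -> str:
    """Classify an agent's step without reading a `done` field.

    Assumption: the step's type carries the information `done` used to.
    """
    if isinstance(step, TerminalStepSketch):
        # `max_step` separates a step-limit cutoff from a normal termination.
        return "max_step" if step.max_step else "terminated"
    return "needs_action"


steps = [
    DecisionStepSketch(reward=0.5, agent_id=7),
    TerminalStepSketch(reward=1.0, agent_id=8, max_step=False),
    TerminalStepSketch(reward=0.0, agent_id=9, max_step=True),
]
status_by_agent = {step.agent_id: describe(step) for step in steps}
```

Here `status_by_agent` maps each `agent_id` to a label, which is one way a caller that previously branched on `done` could be adapted.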