-
-
Notifications
You must be signed in to change notification settings - Fork 58
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added a compatibility to an obs with multiple informations in aec env #221
Added a compatibility to an obs with multiple informations in aec env #221
Conversation
Hi @jimherefornonsense sorry for the delay, I will take a look at this tonight hopefully. Feel free to reach out on discord (link to the Farama discord here https://discord.gg/nhvKkYa6qX) or directly reach out/tag me if you'd like |
Looks like pre-commit failed so if you could run |
@elliottower Thanks for letting me know, I'm new to being a contributor. So should I open another PR once I pass all tests of pre-commit? |
No worries, I’ll just rerun the tests. And it’s always great to see new faces :) |
@@ -45,25 +45,49 @@ def _check_wrapper_params(self): | |||
def observation_space(self, agent): | |||
if self.change_obs_space_fn is None: | |||
space = self.env.observation_space(agent) | |||
modify_space = space | |||
if isinstance(modify_space, gymnasium.spaces.Dict): | |||
modify_space = modify_space["observation"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would rewrite this to check if observation is in the dict, and checking the original space for conciseness:
if isinstance(space, gymnasium.spaces.Dict) and “observation” in space.keys():
space = space[“observation”]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes sure, that's thoughtful!
Another thing is it would be good to test with a pettingzoo classic env with an action mask, if you could write a test for that it would be great. Or I can help out with that part if need be. My other thought is that if the observation space is being changed, the action mask should be fine, but if the action space is being changed then the observation action mask should be change as well. Granted, it’s not common to transform the action space but there is a lambda function for it so it might be worth considering. I think what would be good to do for now is have it throw a warning when you are changing the action space if there’s an action mask in the observation space. But I can add that myself if you prefer. |
@elliottower Yes, please do so if you don't mind. I think I can learn the coding standard from it :) |
Looked into it more and I think I'm going to do it a different way, because with your changes it is now impossible to use lambda_observation to change the action_mask, which is something I have actually done in the past (https://github.com/elliottower/ray/blob/4787c1b553b1ad64cbd4a45dccb4f3b3362042be/rllib/examples/self_play_with_pettingzoo.py#L186C10-L186C10, could have used the flatten_v0 wrapper though). I think it's too heavy handed to always modify the observation to get rid of the extra information or only allow them to perform lambda functions on the observation. Technically speaking, the entire dictionary is the observation, and a good example usage of the observation_lambda wrapper would actually be to remove the action_mask, and change the obs space to be a box. In order to allow users to do that and other custom behavior, we need to avoid these sorts of restrictions. I did add an example to the tests which uses the wrapper in this way, and you can also use what I said above as reference. I added tests for all of the wrappers with AEC envs and they pretty much all work without modification actually, besides obs_delay which required a check for the obs to be a dict instead of np.ndarray. Going to branch from your PR so you get credit because I think starting this code and pointing out that we should account for action masked AEC envs is a valuable contribution even if your code ends up not being used directly. |
Closing in favor of #227 (just because I don't have write permission to your fork of the repo, so I can't push my commits to this PR, simpler to just make a new one branched from this one) |
Extended support to AEC's observations of spaces.Dict.
Before, it only accepts an RGB array as the returned observation from the environment.
Now, it can take such an observation as
obs["observation": value, "action_mask": value]
.