Support for Asymmetric/Dictionary Observations in Brax SAC #627
Harrison-Chiu started this conversation in General
Hi Brax Team,
We are an undergraduate research team from National Sun Yat-sen University in Taiwan, working on a robotics project as part of our mechanical engineering program. We are using MuJoCo Playground to train quadruped robots and have been evaluating different reinforcement learning algorithms in this environment.
While PPO has been working well for us within MuJoCo Playground, we would like to benchmark SAC under the same setup. We are currently integrating Brax's SAC implementation through the Playground interface. However, we noticed that Brax's SAC does not yet support the two features named in the title, both of which are essential for our use case: dictionary observations, and asymmetric observations in which the actor and critic receive different inputs.
Since MuJoCo Playground relies on these features, it has been difficult for us to set up SAC in a way that is comparable to PPO. To move forward with our research, we have prototyped a modified version of the SAC training script that handles these requirements. Our solution works for our use case, but it is a relatively simple modification and likely not as robust or general as an official implementation.
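To make the request concrete, here is a minimal sketch of the kind of observation routing we mean, written with plain NumPy rather than Brax itself. The key names `state` and `privileged_state`, the helpers `split_observation` and `critic_input`, and all array sizes are assumptions chosen for illustration; this is not Brax's API.

```python
import numpy as np

# Hypothetical key names: which dictionary entry each network reads.
POLICY_OBS_KEY = "state"
VALUE_OBS_KEY = "privileged_state"


def split_observation(obs, policy_key=POLICY_OBS_KEY, value_key=VALUE_OBS_KEY):
    """Return (actor_input, critic_input) for a flat or dictionary observation."""
    if isinstance(obs, dict):
        # Asymmetric case: actor and critic read different entries.
        return obs[policy_key], obs[value_key]
    # Flat array: symmetric case, both networks see the same input.
    return obs, obs


def critic_input(obs, action):
    """Build a Q-network input by concatenating the critic observation and action."""
    _, critic_obs = split_observation(obs)
    return np.concatenate([critic_obs, action], axis=-1)


# Batch of 4 environments; the sizes below are arbitrary placeholders.
obs = {
    "state": np.zeros((4, 48)),
    "privileged_state": np.zeros((4, 123)),
}
action = np.zeros((4, 12))

actor_obs, _ = split_observation(obs)
q_in = critic_input(obs, action)
print(actor_obs.shape, q_in.shape)  # (4, 48) (4, 135)
```

In this sketch the policy only ever sees the `state` entry, while the Q-function is conditioned on the larger `privileged_state` entry plus the action. A flat array falls back to the symmetric behaviour, so existing environments would be unaffected.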
We are interested in knowing whether support for these features is planned for Brax's SAC implementation, and if so, what the expected direction or timeline might be.
We would greatly appreciate any information or guidance you could share regarding this. If there are early-stage implementations or design plans, we would be happy to follow or assist in testing them as part of our research.