This repository contains the source code for the algorithm described in this paper.
We propose a novel method for exploring the dynamics of physically based animated characters, and learning a task-agnostic action space that makes movement optimization easier. Like several previous works, we parameterize actions as target states, and learn a short-horizon goal-conditioned low-level control policy that drives the agent's state towards the targets. Our novel contribution is that with our exploration data, we are able to learn the low-level policy in a generic manner and without any reference movement data. Trained once for each agent or simulation environment, the policy improves the efficiency of optimizing both trajectories and high-level policies across multiple tasks and optimization algorithms. We also contribute novel visualizations that show how using target states as actions makes optimized trajectories more robust to disturbances; this manifests as wider optima that are easy to find. Due to its simplicity and generality, our proposed approach should provide a building block that can improve a large variety of movement optimization methods and applications.
- Python 3.5 or above
- cma
- glfw
- gym
- Keras
- mujoco-py
- numpy
- opencv-python
- pandas
- Pillow
- stable-baselines
- tensorflow
More detailed requirements are specified in `requirements.txt`.
- `NaiveExplorer.py`: The script for generating the exploration data using naive exploration
- `ContactExplorer.py`: The script for generating the exploration data using the proposed contact-based exploration algorithm
- `produce_llcs.py`: The script for training the LLCs using the exploration data
- `offline_trajectory_optimization.py`: The script for offline trajectory optimization using CMA-ES
- `online_trajectory_optimization.py`: The script for online trajectory optimization using a simplified version of Fixed-Depth Informed MCTS (FDI-MCTS)
- `RL_Trainer.py`: The script for reinforcement learning using PPO or SAC
- `RL_Renderer.py`: The script for rendering policies trained using PPO or SAC
- `LLC.py`: The script for implementing and training state-reaching LLCs
- `MLP.py`: Neural network helper class
- `logger.py`: The logger script, taken from the OpenAI Baselines repository
- `RenderTimer.py`: Helper script for real-time rendering
- `ExplorationData`: The folder containing the exploration data generated using the naive and contact-based exploration methods
- `Models`: The folder containing all the LLCs for the two exploration methods, four agents, and five horizon values (both in multi-target and single-target mode)
