This repository accompanies the Master's Thesis:
“Humanoid Locomotion via Primitive Composition: a Data-Driven Approach.”
The project explores robust bipedal locomotion for humanoid robots without relying on perfectly known analytical models. Instead, it composes a set of data-driven control primitives (policies) obtained via model-based reinforcement learning (MBRL). These primitives are treated as a "crowd" of candidate behaviors; a probabilistic crowdsourcing algorithm selects and blends them online inside the MuJoCo physics simulation environment.
Humanoid locomotion rollout:
demonstration.mp4
The approach targets adaptability in uncertain or dynamic environments by:
- Leveraging diverse pretrained or learned policies (primitive library)
- Maintaining uncertainty-aware state representations
- Dynamically scoring and selecting policies based on task-specific costs
- Enabling scalable extension to new tasks and control modalities
This work builds upon the TD-MPC2 framework by Nicklas Hansen, Hao Su, and Xiaolong Wang.
- Original TD-MPC2 repository.
- Policies used as primitives in this thesis were trained or adapted in a companion repository I maintain. That repository extends TD-MPC2 for Unitree H1 humanoid locomotion and contains the modified training pipeline and scripts used to produce the policies composed here. This repository focuses on the composition layer (crowdsourcing, selection logic, task integration); training is intentionally isolated for clarity and maintainability.
The TD-MPC2 components (world model, latent planning, scaling, buffers) form the backbone for generating the control primitives that are then composed by the crowdsourcing layer developed in this thesis. Proper credit to the original authors should be maintained in any derivative or extended work.
- Primitive Composition – Rather than a monolithic controller, multiple policy services compete/cooperate.
- Crowdsourcing Mechanism – A selection/composition loop evaluates expected performance under uncertainty.
- Model-Based RL Foundation – Primitives originate from TD-MPC2 components.
- Simulation Fidelity – MuJoCo enables realistic dynamics for rapid iteration and evaluation.
- Modularity – Clear boundaries between tasks, services, world models, and orchestration.
data_driven_legged_locomotion/
__main__.py # Entry: builds environment + runs crowdsourcing loop
TDMPC_crowdsourcing.py # High-level orchestration integrating TDMPC-like primitives
common/ # Crowdsourcing logic: services set, state space, probabilistic filtering
agents/ # Policy service wrappers + Hybrid / TDMPC agent code
tdmpc/ # World model, buffer, layers, math utils, scaling, parser
config*/ # YAML configs + pretrained checkpoints (*.pt)
tasks/ # Task-specific domains (pendulum, humanoid walking)
utils/ # Logging, CSV helpers, plotting notebook
demonstration.mp4 # Example rollout
requirements.txt # Python dependencies
Key folders:
- common/: Core abstractions (Crowdsourcing, ServiceSet, StatePF, StateSpace).
- agents/: Service implementations, hybrid controller logic, TDMPC components.
- agents/tdmpc/common/: World model components, scaling, buffers, math utilities.
- tasks/h1_walk/: Humanoid locomotion task and cost shaping utilities.
- utils/: Logging utilities (logging.py), CSV parsing.
- Primitive Generation: Model-based RL (TD-MPC2) produces candidate controllers with a learned world model and planning routine.
- Service Wrapping: Each controller is exposed as a service supplying action proposals and metadata.
- Crowdsourcing Loop:
  - Maintain probabilistic belief/state filtering (StatePF, StateSpace).
  - Score services via cost or predicted improvement.
  - Compose or select an action (e.g., weighted or arg-min selection) for the environment step.
- Adaptation: Service weights evolve with observed trajectories and performance.
- Evaluation: Executed in MuJoCo for physically plausible feedback.
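The scoring and selection step of the loop above can be illustrated with a minimal sketch. It is not the code in common/Crowdsourcing.py: the function name `select_action`, its parameters, and the softmax weighting are assumptions chosen for illustration. It takes one action proposal and one cost per service, then either picks the lowest-cost proposal (arg-min) or blends all proposals with weights that decrease as cost increases.

```python
import numpy as np


def select_action(proposals, costs, temperature=1.0, blend=True):
    """Hypothetical helper: combine action proposals from several services.

    proposals: array-like, shape (n_services, action_dim)
    costs:     array-like, shape (n_services,), lower is better
    Returns (action, weights), where weights has shape (n_services,).
    """
    proposals = np.asarray(proposals, dtype=float)
    costs = np.asarray(costs, dtype=float)

    if not blend:
        # Arg-min selection: all weight goes to the cheapest service.
        weights = np.zeros(len(costs))
        weights[np.argmin(costs)] = 1.0
    else:
        # Softmax over negative costs (an assumed weighting rule).
        # Subtracting the minimum cost keeps the exponentials stable.
        logits = -(costs - costs.min()) / temperature
        weights = np.exp(logits)
        weights /= weights.sum()

    # Weighted combination of the proposed actions.
    return weights @ proposals, weights
```

A lower `temperature` pushes the blend toward arg-min selection, while a higher one spreads weight more evenly across services.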
Tested with Python 3.11 (3.10+ likely compatible).
git clone https://github.com/my-rice/Humanoid-Locomotion-via-primitive-composition.git
cd Humanoid-Locomotion-via-primitive-composition
python -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt
MuJoCo dependencies: ensure system packages for OpenGL / EGL / GLFW are present. For headless servers:
export MUJOCO_GL=egl
Run the main crowdsourcing experiment:
python -m data_driven_legged_locomotion
Currently, environment composition and task selection are configured manually inside data_driven_legged_locomotion/__main__.py.
- Logging setup: utils/logging.py
- CSV utilities: utils/read_csv.py
Extend logging by adding handlers or structured metrics where services or crowdsourcing decisions are executed.
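As one way to add such a handler, the sketch below uses only Python's standard `logging` module to write one CSV-style row per crowdsourcing decision. The helper names `make_decision_logger` and `log_decision`, the logger name, and the column layout are assumptions for illustration; they are not taken from utils/logging.py.

```python
import logging


def make_decision_logger(path, name="crowdsourcing.decisions"):
    """Hypothetical helper: logger that appends one row per decision to `path`."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.FileHandler(path)
    # %(created)f is the record's Unix timestamp; it contains no comma,
    # so every row stays parseable as CSV.
    handler.setFormatter(logging.Formatter("%(created).3f,%(message)s"))
    logger.addHandler(handler)
    return logger


def log_decision(logger, step, service_index, cost):
    """Write one row: timestamp,step,service_index,cost."""
    logger.info("%d,%d,%.6f", step, service_index, cost)
```

Calling `log_decision(logger, step, idx, cost)` at the point where a service is selected would produce a file that a CSV reader can load for later analysis.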
Add a new primitive service:
- Implement a service wrapper in agents/ (extend existing hybrid or TDMPC base patterns).
- Register it in TDMPC_crowdsourcing.py or the orchestration logic in __main__.py.
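The two steps above might look roughly like the following sketch. The class `PrimitiveService`, its `propose` method, and the `SERVICE_REGISTRY` dictionary are hypothetical names introduced here; the actual base classes and registration mechanism in agents/ and TDMPC_crowdsourcing.py may differ, so treat this only as an outline of the pattern.

```python
from dataclasses import dataclass
from typing import Callable, Dict

import numpy as np


@dataclass
class PrimitiveService:
    """Hypothetical wrapper exposing a policy as an action-proposing service."""
    name: str
    policy: Callable[[np.ndarray], np.ndarray]  # maps observation -> action

    def propose(self, observation):
        # Normalise input and output to float arrays so the composition
        # layer can treat every service uniformly.
        obs = np.asarray(observation, dtype=float)
        return np.asarray(self.policy(obs), dtype=float)


# Hypothetical registry consulted by the orchestration code.
SERVICE_REGISTRY: Dict[str, PrimitiveService] = {}


def register(service):
    """Add a service to the registry, rejecting duplicate names."""
    if service.name in SERVICE_REGISTRY:
        raise ValueError(f"service {service.name!r} already registered")
    SERVICE_REGISTRY[service.name] = service
```

For example, `register(PrimitiveService("stand_still", lambda obs: np.zeros(3)))` would add a placeholder primitive that always proposes a zero action of an assumed dimension 3.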
Add a new task:
- Create a folder tasks/<your_task>/.
- Provide MuJoCo XML (if needed) + cost/reward shaping script.
- Integrate selection logic or cost tests similar to h1_walk/ and pendulum/.
Modify selection strategy:
- Adjust or extend scoring in common/Crowdsourcing.py.
Swap world model:
- Implement or modify components in agents/tdmpc/common/world_model.py and update configuration YAML.
- Stability (fall / termination rate)
- Goal achievement (task-specific success)
- Diversity utilization (entropy over selected services)
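Two of the metrics above are simple to compute from logged data. The sketch below assumes a list of selected service identifiers (one per control step) and a list of per-episode fall flags; the function names and input formats are illustrative assumptions, not part of this repository. Entropy is the Shannon entropy of the empirical selection frequencies, in nats.

```python
import math
from collections import Counter


def selection_entropy(selected):
    """Shannon entropy (nats) of how often each service was selected.

    0.0 means a single service was always chosen; log(n) means n services
    were used equally often.
    """
    total = len(selected)
    if total == 0:
        return 0.0
    counts = Counter(selected)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def fall_rate(episode_fell):
    """Fraction of episodes that ended in a fall (input: iterable of bools)."""
    flags = list(episode_fell)
    return sum(flags) / len(flags) if flags else 0.0
```

Goal achievement is task-specific, so it is left out of this sketch.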
Distributed under the terms specified in LICENSE.
For questions or collaboration, open an Issue or reach the thesis author (add email / profile).
This project delivers a modular framework that composes model-based RL primitives through a probabilistic crowdsourcing mechanism, enabling adaptive humanoid locomotion in MuJoCo while remaining extensible for new tasks, controllers, and research directions.
Composable data-driven primitives + uncertainty-aware selection → robust humanoid locomotion in simulation.