Adds new curriculum mdp that allows modification on any environment parameters #2777

ooctipus · 2025-06-26T00:27:30Z

Description

This PR adds two curriculum MDP terms, modify_term_cfg and modify_env_param, that can change any parameter in an env instance.

modify_env_param is the more general version: it can override any value that belongs to env, but it requires the user to know the full path to the value.

modify_term_cfg only works with manager terms, but it is more user-friendly because it simplifies path specification. For example, instead of writing "observation_manager.cfg.policy.joint_pos.noise", you write "observations.policy.joint_pos.noise", consistent with the Hydra override style.

Besides the path to the value, both terms also need modify_fn and modify_params, which tell the term how to modify the value.
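To make the mechanism concrete, here is a minimal, self-contained sketch of how a dotted address could be resolved and handed to a modify function. It is not the PR's implementation: apply_modification, the local NO_CHANGE sentinel, and the SimpleNamespace stand-in for env are all hypothetical, and the sketch only follows attribute access (no dict keys or list indices).

```python
from types import SimpleNamespace

# Hypothetical sentinel: a modify function returns it to leave the value untouched.
NO_CHANGE = object()


def apply_modification(root, address, modify_fn, modify_params, env_ids=None):
    """Resolve a dotted attribute path on `root`, call modify_fn, and write back the result."""
    *parents, leaf = address.split(".")
    owner = root
    for name in parents:
        owner = getattr(owner, name)
    current = getattr(owner, leaf)
    new_value = modify_fn(root, env_ids, current, **modify_params)
    if new_value is not NO_CHANGE:
        setattr(owner, leaf, new_value)
    return getattr(owner, leaf)


def scale(env, env_ids, data, factor):
    # Example modify function: multiply the current value by a factor.
    return data * factor


def keep(env, env_ids, data):
    # Example modify function that never changes anything.
    return NO_CHANGE


# Stand-in object tree; a real env would expose managers and configs instead.
noise = SimpleNamespace(n_min=-0.1, n_max=0.1)
env = SimpleNamespace(cfg=SimpleNamespace(noise=noise))

scaled = apply_modification(env, "cfg.noise.n_max", scale, {"factor": 2.0})
kept = apply_modification(env, "cfg.noise.n_min", keep, {})
```

The sentinel is compared by identity, so a modify function can still return None or 0 as a legitimate new value.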

Demo 1: difficulty-adaptive modification for all Python native data types

import torch

import isaaclab.envs.mdp as mdp
from isaaclab.envs import ManagerBasedRLEnv
from isaaclab.managers import CurriculumTermCfg as CurrTerm


# iv -> initial value, fv -> final value
def initial_final_interpolate_fn(env: ManagerBasedRLEnv, env_id, data, iv, fv, get_fraction):
    iv_, fv_ = torch.tensor(iv, device=env.device), torch.tensor(fv, device=env.device)
    # get_fraction is a Python expression string, evaluated here with `env` in scope
    fraction = eval(get_fraction)
    new_val = fraction * (fv_ - iv_) + iv_
    # cast the result back to the type of the value being replaced
    if isinstance(data, float):
        return new_val.item()
    elif isinstance(data, int):
        return int(new_val.item())
    elif isinstance(data, (tuple, list)):
        raw = new_val.tolist()
        # assume iv is sequence of all ints or all floats:
        is_int = isinstance(iv[0], int)
        casted = [int(x) if is_int else float(x) for x in raw]
        return tuple(casted) if isinstance(data, tuple) else casted
    else:
        raise TypeError(f"Does not support the type {type(data)}")

(float)

joint_pos_unoise_min_adr = CurrTerm(
    func=mdp.modify_term_cfg,
    params={
        "address": "observations.policy.joint_pos.noise.n_min",
        "modify_fn": initial_final_interpolate_fn,
        # outer single quotes so the inner "difficulty" does not end the string
        "modify_params": {"iv": 0.0, "fv": -0.1, "get_fraction": 'env.command_manager.get_command("difficulty")'},
    },
)

(tuple or list)

command_object_pose_xrange_adr = CurrTerm(
    func=mdp.modify_term_cfg,
    params={
        "address": "commands.object_pose.ranges.pos_x",
        "modify_fn": initial_final_interpolate_fn,
        # outer single quotes so the inner "difficulty" does not end the string
        "modify_params": {"iv": (-0.5, -0.5), "fv": (-0.75, -0.25), "get_fraction": 'env.command_manager.get_command("difficulty")'},
    },
)
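The demos above pass get_fraction as a string that the modify function evaluates with eval. One possible alternative, sketched here under the assumption that modify_params may hold arbitrary Python objects (the PR text does not state this explicitly), is to pass a callable instead. FakeEnv and interpolate are illustrative stand-ins, not Isaac Lab APIs, and torch is left out so the snippet runs on its own.

```python
class FakeEnv:
    """Hypothetical stand-in for an env that exposes a scalar difficulty."""

    difficulty = 0.5


def interpolate(env, env_ids, data, iv, fv, get_fraction):
    # get_fraction is a callable taking the env, so no eval() is needed.
    fraction = float(get_fraction(env))
    if isinstance(data, float):
        return iv + fraction * (fv - iv)
    if isinstance(data, (tuple, list)):
        values = [a + fraction * (b - a) for a, b in zip(iv, fv)]
        return tuple(values) if isinstance(data, tuple) else values
    raise TypeError(f"Does not support the type {type(data)}")


modify_params = {
    "iv": (-0.5, -0.5),
    "fv": (-0.75, -0.25),
    "get_fraction": lambda env: env.difficulty,
}
new_range = interpolate(FakeEnv(), None, (-0.5, -0.5), **modify_params)
new_scalar = interpolate(FakeEnv(), None, 0.0, iv=0.0, fv=-1.0, get_fraction=lambda env: env.difficulty)
```

A callable keeps the expression visible to linters and avoids evaluating arbitrary strings, at the cost of being harder to override from a plain-text config.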

Demo 3: overriding an entire term based on the env step counter rather than adaptively

def value_override(env: ManagerBasedRLEnv, env_id, data, new_val, num_steps):
    if env.common_step_counter > num_steps:
        return new_val
    return mdp.modify_term_cfg.NO_CHANGE

object_pos_curriculum = CurrTerm(
    func=mdp.modify_term_cfg,
    params={
        "address": "commands.object_pose",
        "modify_fn": value_override,
        # the key must match the `num_steps` argument of value_override
        "modify_params": {"new_val": <new_observation_term>, "num_steps": 120000},
    },
)

Demo 4: overriding a Tensor field within an arbitrary class that is not visible from term_cfg
(note that the 'address' is not as concise as with mdp.modify_term_cfg)

import isaaclab.utils.math as math_utils


def resample_bucket_range(env: ManagerBasedRLEnv, env_id, data, static_friction_range, dynamic_friction_range, restitution_range, num_steps):
    if env.common_step_counter > num_steps:
        range_list = [static_friction_range, dynamic_friction_range, restitution_range]
        ranges = torch.tensor(range_list, device="cpu")
        # one (static friction, dynamic friction, restitution) row per existing bucket
        new_buckets = math_utils.sample_uniform(ranges[:, 0], ranges[:, 1], (len(data), 3), device="cpu")
        return new_buckets
    return mdp.modify_env_param.NO_CHANGE

object_physics_material_curriculum = CurrTerm(
    func=mdp.modify_env_param,
    params={
        "address": "event_manager.cfg.object_physics_material.func.material_buckets",
        "modify_fn": resample_bucket_range,
        # the key must match the `num_steps` argument of resample_bucket_range
        "modify_params": {"static_friction_range": [0.5, 1.0], "dynamic_friction_range": [0.3, 1.0], "restitution_range": [0.0, 0.5], "num_steps": 120000},
    },
)
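For readers without Isaac Lab at hand, the resampling step in Demo 4 can be pictured with the standard library alone. This is only an illustration of the resulting shape (one row of static friction, dynamic friction, and restitution per bucket); sample_buckets is a hypothetical helper and uses Python's random module rather than math_utils.sample_uniform or torch.

```python
import random


def sample_buckets(num_buckets, ranges, seed=None):
    """Draw one value per (low, high) range for each bucket, uniformly at random."""
    rng = random.Random(seed)
    return [[rng.uniform(low, high) for low, high in ranges] for _ in range(num_buckets)]


# (static friction, dynamic friction, restitution) ranges taken from Demo 4
ranges = [(0.5, 1.0), (0.3, 1.0), (0.0, 0.5)]
buckets = sample_buckets(4, ranges, seed=0)
```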

Type of change

New feature (non-breaking change which adds functionality)

Checklist

- [x] I have run the pre-commit checks with ./isaaclab.sh --format
- [ ] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my feature works
- [x] I have updated the changelog and the corresponding version in the extension's config/extension.toml file
- [x] I have added my name to the CONTRIBUTORS.md or my name already exists there

ooctipus · 2025-06-26T00:31:08Z

@jtigue-bdai Feel free to view and provide some feedback

jtigue-bdai

Thanks for this @ooctipus, we don't currently have tests for mdp terms but do you think you could put together a unit test for this? Because it has the potential for touching so many things I think it would be good to get some unit tests for it.
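As a rough idea of what such a unit test could cover, the sketch below checks a step-threshold modify function in isolation, without Isaac Lab. The NO_CHANGE object is a local stand-in for mdp.modify_term_cfg.NO_CHANGE, the env is a SimpleNamespace, and the test names are made up; the actual test added in this PR lives in source/isaaclab/test/envs/test_modify_env_param_curr_term.py and may look quite different.

```python
from types import SimpleNamespace

# Local stand-in for mdp.modify_term_cfg.NO_CHANGE (assumption for this sketch).
NO_CHANGE = object()


def value_override(env, env_ids, data, new_val, num_steps):
    # Same logic as Demo 3: switch to new_val once the step counter passes num_steps.
    if env.common_step_counter > num_steps:
        return new_val
    return NO_CHANGE


def test_value_override_waits_for_threshold():
    env = SimpleNamespace(common_step_counter=10)
    assert value_override(env, None, "old", new_val="new", num_steps=100) is NO_CHANGE


def test_value_override_switches_after_threshold():
    env = SimpleNamespace(common_step_counter=101)
    assert value_override(env, None, "old", new_val="new", num_steps=100) == "new"
```

Both functions follow the pytest naming convention, so they would be collected automatically if saved in a test_*.py file.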

jtigue-bdai

Looks good Octi, just a rogue newline.

ooctipus · 2025-07-02T07:17:11Z

@kellyguo11 Documentation ready for viz

kellyguo11 · 2025-07-09T04:30:22Z

docs/source/how-to/curriculums.rst

+       params={
+           "term_name": "sparse_reward",
+           "weight": 0.5,
+           "num_steps": 100_000,


could we explain what the _000 means?
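For context on this question: the underscores in 100_000 are Python digit separators (PEP 515, available since Python 3.6). The parser ignores them, so they only affect readability. A quick check:

```python
# Underscores in numeric literals are purely visual.
num_steps = 100_000
same_value = num_steps == 100000

# int() also accepts underscores between digits when parsing strings.
parsed = int("100_000")

# The "_" format option writes the separators back out.
formatted = f"{num_steps:_}"
```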

ooctipus requested review from Mayankm96, jsmith-bdai and kellyguo11 as code owners June 26, 2025 00:27

jtigue-bdai reviewed Jun 26, 2025

jtigue-bdai mentioned this pull request Jun 27, 2025

Adds modify_environment_parameter to curriculums #2696

Closed

6 tasks

ooctipus requested a review from pascal-roth as a code owner June 27, 2025 19:34

ooctipus force-pushed the feat/modify_env_param_curriculum branch from e28803f to 8957e93 Compare June 27, 2025 19:35

jtigue-bdai reviewed Jun 27, 2025

source/isaaclab/test/envs/test_modify_env_param_curr_term.py

ooctipus force-pushed the feat/modify_env_param_curriculum branch from 8957e93 to 60a0b87 Compare June 27, 2025 21:27

jtigue-bdai approved these changes Jun 30, 2025

source/isaaclab/docs/CHANGELOG.rst

add feature: a curriculum config that can modify any parameter in ter…

fec96ca

…m_cfg and environment

ooctipus force-pushed the feat/modify_env_param_curriculum branch from 7342759 to fec96ca Compare July 2, 2025 06:47

kellyguo11 reviewed Jul 9, 2025

ooctipus and others added 4 commits July 8, 2025 23:54

Update docs/source/how-to/curriculums.rst

c107bdb

Co-authored-by: Kelly Guo <kellyg@nvidia.com> Signed-off-by: ooctipus <zhengyuz@nvidia.com>

Remove 's is implicit comment as it doesn't make sense

5cd2b4d

Signed-off-by: ooctipus <zhengyuz@nvidia.com>

Merge branch 'main' into feat/modify_env_param_curriculum

ff8f956

Signed-off-by: Kelly Guo <kellyg@nvidia.com>

format

db2225e

kellyguo11 changed the title ~~Added new curriculum mdp that allows modification on any environment parameters~~ Adds new curriculum mdp that allows modification on any environment parameters Jul 10, 2025

Merge branch 'main' into feat/modify_env_param_curriculum

f159a19

Signed-off-by: Kelly Guo <kellyg@nvidia.com>

kellyguo11 merged commit cee5027 into main Jul 12, 2025
3 of 4 checks passed

kellyguo11 deleted the feat/modify_env_param_curriculum branch July 12, 2025 01:14
