@ashwinvkNV ashwinvkNV commented Nov 19, 2025

rl-video-step-137600.mp4

Description

This PR introduces a new Gear Assembly manipulation task for sim-to-real training with the UR10e robot arm. This environment enables training policies for precise gear insertion tasks using reinforcement learning, with comprehensive sim-to-real transfer capabilities.

Summary of Changes

New Features

  • Gear Assembly Environment: Complete environment implementation for gear insertion tasks

    • Environment configuration (gear_assembly_env_cfg.py)
    • UR10e-specific joint position control configuration (joint_pos_env_cfg.py)
    • RSL-RL PPO training configuration (rsl_rl_ppo_cfg.py)
  • MDP Components: Task-specific observation, reward, termination, and event functions

    • mdp/events.py: Randomization and reset events for robust training
    • mdp/observations.py: State observation functions
    • mdp/rewards.py: Reward shaping for gear insertion
    • mdp/terminations.py: Episode termination conditions
  • Noise Models: Enhanced noise simulation for domain randomization

    • Added configurable noise models (noise_model.py, noise_cfg.py)
    • Integration with observation and action spaces for realistic sim-to-real transfer

Documentation

  • Sim-to-Real Training Walkthrough: Complete guide for training and deploying the gear assembly task
    • Step-by-step training instructions
    • Real robot deployment guidelines
    • Visual assets (GIFs and screenshots)

Core Enhancements

  • Training Script: Enhanced train.py with additional logging and configuration options
  • UR10e Robot Configuration: Updated universal_robots.py with gear assembly specific parameters
  • Reward System: Extended core reward functions in isaaclab/envs/mdp/rewards.py
  • RL Configuration: Updated RSL-RL integration (rl_cfg.py, setup.py)

Type of change

  • New feature (non-breaking change which adds functionality)
  • Documentation update

Checklist

  • I have read and understood the contribution guidelines
  • I have run the pre-commit checks with ./isaaclab.sh --format
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • I have updated the changelog and the corresponding version in the extension's config/extension.toml file
  • I have added my name to the CONTRIBUTORS.md or my name already exists there

Usage Example

# Train the gear assembly task
python scripts/reinforcement_learning/rsl_rl/train.py \
  --task Isaac-Deploy-GearAssembly-UR10e-2F140-ROS-Inference-v0 \
  --num_envs 256 \
  --headless

# Run inference with trained policy
python scripts/reinforcement_learning/rsl_rl/play.py \
  --task Isaac-Deploy-GearAssembly-UR10e-2F140-ROS-Inference-v0 \
  --num_envs 1 \
  --checkpoint <checkpoint_path>

@github-actions github-actions bot added documentation Improvements or additions to documentation asset New asset feature or request labels Nov 19, 2025

greptile-apps bot commented Nov 19, 2025

Greptile Summary

  • Introduces complete gear assembly sim-to-real environment for UR10e with PPO/LSTM training supporting 2F-140 and 2F-85 Robotiq grippers
  • Implements class-based MDP components with pre-cached tensors for efficient batch operations including dynamic gear type randomization, keypoint-based rewards, and IK-based grasp initialization
  • Adds ResetSampledNoiseModel for domain randomization that samples noise once per episode reset rather than every step
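The "sample once per reset" behaviour described in the last bullet can be illustrated with a minimal, framework-free sketch. This is not the PR's `ResetSampledNoiseModel`: the class name `ResetSampledNoise`, its constructor arguments, and the use of plain Python lists instead of tensors are assumptions made only so the idea runs on its own.

```python
import random


class ResetSampledNoise:
    """Hypothetical stand-in: one additive offset per environment, resampled only on reset."""

    def __init__(self, num_envs: int, std: float, seed: int = 0):
        self._rng = random.Random(seed)
        self._std = std
        # One cached offset per environment; zero until the first reset.
        self._offset = [0.0] * num_envs

    def reset(self, env_ids=None):
        """Draw a fresh offset for the environments that are being reset."""
        ids = range(len(self._offset)) if env_ids is None else env_ids
        for i in ids:
            self._offset[i] = self._rng.gauss(0.0, self._std)

    def __call__(self, data):
        """Add the cached offsets; nothing is sampled on this path."""
        return [x + o for x, o in zip(data, self._offset)]


noise = ResetSampledNoise(num_envs=3, std=0.01)
noise.reset()
first = noise([1.0, 2.0, 3.0])
second = noise([1.0, 2.0, 3.0])  # same offsets as `first` within the episode
noise.reset(env_ids=[1])         # only environment 1 starts a new episode
third = noise([1.0, 2.0, 3.0])
```

Within an episode every step sees the same bias, which resembles a fixed sensor calibration error more than per-step white noise does.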

Confidence Score: 4/5

  • Safe to merge with minor style improvements recommended
  • Well-structured implementation with comprehensive reward shaping, termination conditions, and domain randomization. Code follows IsaacLab patterns with class-based terms and proper tensor caching. Minor redundant operations in IK loop and temporary USD path workaround noted but non-critical.
  • source/isaaclab_tasks/isaaclab_tasks/manager_based/manipulation/deploy/mdp/events.py has redundant joint state reads in IK loop; source/isaaclab_tasks/isaaclab_tasks/manager_based/manipulation/deploy/gear_assembly/config/ur_10e/joint_pos_env_cfg.py uses temporary USD path pending bug fix

Important Files Changed

| Filename | Overview |
| --- | --- |
| source/isaaclab_tasks/isaaclab_tasks/manager_based/manipulation/deploy/mdp/events.py | Implements gear type randomization and IK-based robot grasp pose initialization with pre-cached tensors for efficient batch operations; IK loop reads joint state redundantly on each iteration (line 232) |
| source/isaaclab/isaaclab/utils/noise/noise_model.py | Adds ResetSampledNoiseModel class that samples noise only during reset and applies it consistently throughout the episode |
| source/isaaclab_tasks/isaaclab_tasks/manager_based/manipulation/deploy/gear_assembly/config/ur_10e/joint_pos_env_cfg.py | UR10e-specific configuration with 2F-140 and 2F-85 gripper support, gripper-specific joint setters, and IK-based grasp pose initialization; uses temporary USD path override (line 415) |

Sequence Diagram

sequenceDiagram
    participant User
    participant TrainingScript
    participant Environment
    participant GearTypeManager
    participant RobotIK
    participant PPOAgent
    participant RewardManager

    User->>TrainingScript: run train.py with task config
    TrainingScript->>Environment: create env with UR10e gear assembly config
    Environment->>GearTypeManager: initialize RandomizeGearType event
    GearTypeManager->>Environment: register as _gear_type_manager
    Environment->>Environment: setup scene with robot and 3 gear types
    
    loop Training Episodes
        Environment->>GearTypeManager: reset - randomize gear type
        GearTypeManager->>Environment: set active gear per env
        Environment->>RobotIK: SetRobotToGraspPose event
        RobotIK->>RobotIK: run IK to compute grasp pose
        RobotIK->>Environment: update robot joint positions
        Environment->>Environment: RandomizeGearsAndBasePose event
        
        loop Episode Steps
            Environment->>PPOAgent: get observation (joint pos/vel, gear shaft pose)
            PPOAgent->>Environment: return action (delta joint positions)
            Environment->>Environment: apply action and step simulation
            Environment->>RewardManager: compute keypoint distance rewards
            RewardManager->>Environment: return reward signal
            Environment->>Environment: check terminations (gear dropped, orientation)
        end
        
        Environment->>PPOAgent: collect episode data
        PPOAgent->>PPOAgent: update policy with PPO
    end
    
    TrainingScript->>User: save trained model checkpoint

@greptile-apps greptile-apps bot left a comment

22 files reviewed, 2 comments

Comment on lines +414 to +415
# TODO: @ashwinvk: Revert to default USD after https://jirasw.nvidia.com/browse/ISIM-4733 is resolved
usd_path="omniverse://isaac-dev.ov.nvidia.com/Projects/isaac_ros_gear_insertion/ur10e_default_2f85.usd",
Contributor

style: temporary USD path override pending ISIM-4733 resolution - ensure this is removed after the issue is fixed

)

# Update gear_type_indices
for i in range(env.num_envs):
Contributor

This loop gets called on every step of the MDP, but the values don't change often. Is there a way to pre-allocate this and update it only when it changes?
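One way to act on this suggestion, sketched without Isaac Lab: allocate the index buffer once and refresh only the entries for environments that were just reset. The `GearTypeCache` class, its `on_reset` method, and the gear names are hypothetical; in the actual environment the buffer would more likely be a `torch` tensor on the simulation device.

```python
GEAR_TYPES = ("gear_small", "gear_medium", "gear_large")  # hypothetical names


class GearTypeCache:
    """Hypothetical cache of per-environment gear type indices."""

    def __init__(self, num_envs: int):
        # Allocated once up front instead of being rebuilt on every MDP step.
        self.indices = [0] * num_envs

    def on_reset(self, env_ids, gear_names):
        """Refresh only the environments whose gear type was re-randomized."""
        for env_id, name in zip(env_ids, gear_names):
            self.indices[env_id] = GEAR_TYPES.index(name)


def per_step_term(cache: GearTypeCache):
    # Reward and observation terms read the buffer directly:
    # no Python loop over env.num_envs on the per-step path.
    return cache.indices


cache = GearTypeCache(num_envs=4)
cache.on_reset([0, 1, 2, 3], ["gear_small", "gear_large", "gear_medium", "gear_small"])
cache.on_reset([2], ["gear_large"])  # a single environment resets mid-rollout
current = per_step_term(cache)
```

The per-environment work then scales with the number of resets rather than with `num_envs` times the number of steps.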

)

# Update gear_type_indices
for i in range(env.num_envs):
Contributor

Same comment as above:

This loop gets called on every step of the MDP, even though it seems that the values don't change often. Is there a way to pre-allocate this and update it only when it changes?

return torch.sum(out_of_limits, dim=1)


def action_rate(env: ManagerBasedRLEnv) -> torch.Tensor:
Collaborator

does action_rate_l2 not work?

Contributor Author

@iakinola23 do you think it should be okay to use action_rate_l2 here?
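For context on the question, the existing `action_rate_l2` term in `isaaclab/envs/mdp/rewards.py` sums, per environment, the squared difference between the current and previous action. The sketch below reproduces that arithmetic on plain lists so the two terms can be compared by hand; the function name `action_rate_l2_sketch` is made up, and the body of the PR's own `action_rate` is not shown in this thread, so whether the two are equivalent remains open.

```python
def action_rate_l2_sketch(action, prev_action):
    """Per-environment sum of squared differences between consecutive actions."""
    return [
        sum((a - p) ** 2 for a, p in zip(env_action, env_prev))
        for env_action, env_prev in zip(action, prev_action)
    ]


# Two environments, two action dimensions each.
prev = [[0.0, 0.0], [1.0, 1.0]]
curr = [[0.5, -0.5], [1.0, 3.0]]
penalty = action_rate_l2_sketch(curr, prev)
```

If the custom term computes the same quantity, reusing the built-in one with a negative weight would avoid a duplicate function.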

noise_std_type: Literal["scalar", "log"] = "scalar"
"""The type of noise standard deviation for the policy. Default is scalar."""

state_dependent_std: bool = False
Collaborator

this might be a different PR.

Contributor Author

This param is used in the RSL-RL config for the sim-to-real env. Are you saying that it should be separated into a new PR?

prim_path="{ENV_REGEX_NS}/FactoryGearBase",
# TODO: change to common isaac sim directory
spawn=sim_utils.UsdFileCfg(
usd_path="omniverse://isaac-dev.ov.nvidia.com/Isaac/Props/Factory/gear_assets/factory_gear_base/factory_gear_base.usd",
Collaborator

Please import the Nucleus directory constant instead of writing the raw URL.

Contributor Author

Waiting on the sim team to upload these assets to the AWS server and I will replace it.
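Once the assets are uploaded, the change the reviewer is asking for might look roughly like the sketch below. `ISAAC_NUCLEUS_DIR` is exported by `isaaclab.utils.assets`; the relative path under it is an assumption copied from the current internal URL and may differ after the upload. The fallback string is a placeholder that only lets the snippet run outside an Isaac Lab installation.

```python
try:
    # Resolves to the configured asset root when run inside Isaac Lab.
    from isaaclab.utils.assets import ISAAC_NUCLEUS_DIR
except ImportError:
    # Placeholder for running this sketch without Isaac Lab installed.
    ISAAC_NUCLEUS_DIR = "<nucleus-root>/Isaac"

# Assumed relative location; confirm it once the sim team has uploaded the assets.
GEAR_BASE_USD = (
    f"{ISAAC_NUCLEUS_DIR}/Props/Factory/gear_assets/"
    "factory_gear_base/factory_gear_base.usd"
)
```

The config would then pass `usd_path=GEAR_BASE_USD` to `sim_utils.UsdFileCfg` in place of the hard-coded `omniverse://isaac-dev...` URL.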

Collaborator

please capture a high quality image with robot base. : )))

Contributor Author

done. Does this one work?

@kellyguo11 kellyguo11 left a comment

Could we avoid committing large .gif files to the repo directly? We can upload them to the server if needed and reference them from the docs.

prim_path="{ENV_REGEX_NS}/FactoryGearSmall",
# TODO: change to common isaac sim directory
spawn=sim_utils.UsdFileCfg(
usd_path="omniverse://isaac-dev.ov.nvidia.com/Isaac/Props/Factory/gear_assets/factory_gear_small/factory_gear_small.usd",
Contributor

we shouldn't use this path directly in code being merged to main, users will not have access to these. are the assets available on the S3 bucket?

Contributor Author

Yup, waiting on the sim team to upload the assets; I will update this before the PR is merged.

4 participants