Skip to content

Conversation

@hujc7
Copy link

@hujc7 hujc7 commented Jan 17, 2026

Description

Add warp env for inhand_manipulation

Fixes # (issue)

Type of change

  • New feature (non-breaking change which adds functionality)

Screenshots

Please attach before and after screenshots of the change if applicable.

Checklist

  • I have read and understood the contribution guidelines
  • I have run the pre-commit checks with ./isaaclab.sh --format
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • I have updated the changelog and the corresponding version in the extension's config/extension.toml file
  • I have added my name to the CONTRIBUTORS.md or my name already exists there

@github-actions github-actions bot added the isaac-lab Related to Isaac Lab team label Jan 17, 2026
hujc7 added 3 commits January 18, 2026 00:04
1. con success 10.67
2. used the same MJWarpSolver as torch env
@AntoineRichard AntoineRichard changed the title [WIP] Add Warp based inhand_manipulation env [Newton] Add Warp based inhand_manipulation env Jan 20, 2026
@hujc7
Copy link
Author

hujc7 commented Jan 20, 2026

Attaching performance data

Performance summary (torch → warp)

Metric Torch Warp Δ% (torch→warp)
Action processing mean (us, N=9600) 1456.55 36.98 -97.46%
Newton simulation mean (us, N=9600) 17591.24 19028.51 +8.17%
Post-processing mean (us, N=4800) 6522.74 171.14 -97.38%
Total step mean (us, N=4800) 45333.22 39187.63 -13.56%
Throughput (steps/s) 62826 68045 +8.31%
Iteration time (s) 1.04 0.96 -7.69%
Collection time (s) 0.801 0.699 -12.73%
Learning time (s) 0.242 0.264 +9.09%

Δ% is computed as ((\text{warp} - \text{torch}) / \text{torch} \times 100%), so negative time Δ% = less time (better).

allegro_warp.txt
allegro_torch.txt

@hujc7 hujc7 marked this pull request as ready for review January 20, 2026 18:03
@greptile-apps
Copy link
Contributor

greptile-apps bot commented Jan 20, 2026

Greptile Overview

Greptile Summary

This PR implements a Warp-accelerated in-hand manipulation environment for the Allegro Hand robot, enabling high-performance parallel simulation across 8192 environments using GPU kernels.

Key changes:

  • New InHandManipulationWarpEnv class with 20+ Warp kernels for parallel computation of observations, rewards, resets, and physics
  • Complete environment configuration (AllegroHandWarpEnvCfg) with Newton solver settings optimized for contact-rich manipulation
  • Gymnasium registration for Isaac-Repose-Cube-Allegro-Direct-Warp-v0 environment
  • Three RL framework configurations (RSL-RL, RL Games, SKRL) with matching hyperparameters for multi-framework compatibility
  • Task involves reorienting a cube to match a target pose using 16-DOF Allegro hand with reward shaping for position, orientation, and action penalty

Architecture:

  • Inherits from DirectRLEnvWarp base class
  • Uses Warp kernels for GPU-accelerated computations (actions, observations, rewards, resets)
  • Implements both "full" (124-dim) and "openai" observation modes
  • Includes non-finite value sanitization to prevent NaN propagation

Minor issues found:

  • Unused z_unit_vecs initialized but never referenced
  • Comment clarity could be improved in rotation_distance function

Confidence Score: 4/5

  • This PR is safe to merge with minimal risk, pending validation that tests pass.
  • The implementation follows established patterns from the IsaacLab codebase and appears well-structured. The Warp kernels are logically sound with proper input/output separation. Minor style issues (unused variable, comment clarity) don't affect functionality. The main uncertainty is lack of test coverage mentioned in the checklist, and the PR author hasn't verified that changes generate no warnings or that tests pass.
  • Verify inhand_manipulation_warp_env.py works correctly with both observation modes and handles edge cases in quaternion operations.

Important Files Changed

Filename Overview
source/isaaclab_tasks_experimental/isaaclab_tasks_experimental/direct/inhand_manipulation/inhand_manipulation_warp_env.py Adds complete Warp-accelerated in-hand manipulation environment with 20+ GPU kernels for parallel simulation across 8192 environments.
source/isaaclab_tasks_experimental/isaaclab_tasks_experimental/direct/allegro_hand/allegro_hand_warp_env_cfg.py Configuration file defining Allegro Hand environment parameters, solver settings, and reward scales.
source/isaaclab_tasks_experimental/isaaclab_tasks_experimental/direct/allegro_hand/init.py Registers the Isaac-Repose-Cube-Allegro-Direct-Warp-v0 environment with Gymnasium and configures RL agent entry points.

Sequence Diagram

sequenceDiagram
    participant User
    participant Gym
    participant InHandManipulationWarpEnv
    participant DirectRLEnvWarp
    participant WarpKernels
    participant Hand as Articulation (Hand)
    participant Object as Articulation (Object)
    
    User->>Gym: register environment
    Gym->>InHandManipulationWarpEnv: create with AllegroHandWarpEnvCfg
    InHandManipulationWarpEnv->>DirectRLEnvWarp: __init__(cfg)
    InHandManipulationWarpEnv->>InHandManipulationWarpEnv: _setup_scene()
    InHandManipulationWarpEnv->>Hand: create Articulation(robot_cfg)
    InHandManipulationWarpEnv->>Object: create Articulation(object_cfg)
    InHandManipulationWarpEnv->>WarpKernels: initialize_rng_state()
    InHandManipulationWarpEnv->>WarpKernels: initialize_goal_constants()
    InHandManipulationWarpEnv->>WarpKernels: initialize_xyz_unit_vecs()
    
    User->>InHandManipulationWarpEnv: step(actions)
    InHandManipulationWarpEnv->>InHandManipulationWarpEnv: _pre_physics_step(actions)
    InHandManipulationWarpEnv->>InHandManipulationWarpEnv: _apply_action()
    InHandManipulationWarpEnv->>WarpKernels: apply_actions_to_targets()
    InHandManipulationWarpEnv->>Hand: set_joint_position_target()
    InHandManipulationWarpEnv->>DirectRLEnvWarp: simulate physics
    InHandManipulationWarpEnv->>InHandManipulationWarpEnv: _get_dones()
    InHandManipulationWarpEnv->>WarpKernels: compute_intermediate_values()
    InHandManipulationWarpEnv->>WarpKernels: get_dones()
    InHandManipulationWarpEnv->>InHandManipulationWarpEnv: _get_observations()
    InHandManipulationWarpEnv->>WarpKernels: compute_full_observations()
    InHandManipulationWarpEnv->>WarpKernels: sanitize_and_print_once()
    InHandManipulationWarpEnv->>InHandManipulationWarpEnv: _get_rewards()
    InHandManipulationWarpEnv->>WarpKernels: compute_rewards()
    InHandManipulationWarpEnv->>WarpKernels: update_consecutive_successes_from_stats()
    InHandManipulationWarpEnv->>InHandManipulationWarpEnv: _reset_target_pose()
    InHandManipulationWarpEnv->>WarpKernels: reset_target_pose()
    
    alt Reset Required
        InHandManipulationWarpEnv->>InHandManipulationWarpEnv: _reset_idx(mask)
        InHandManipulationWarpEnv->>WarpKernels: reset_object()
        InHandManipulationWarpEnv->>Object: update root_link_pose_w
        InHandManipulationWarpEnv->>WarpKernels: reset_hand()
        InHandManipulationWarpEnv->>Hand: update joint_pos/joint_vel
        InHandManipulationWarpEnv->>WarpKernels: reset_successes()
    end
    
    InHandManipulationWarpEnv-->>User: obs, reward, done, info
Loading

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 files reviewed, 2 comments

Edit Code Review Agent Settings | Greptile

Comment on lines 582 to 588
def rotation_distance(object_rot: wp.quatf, target_rot: wp.quatf) -> wp.float32:
# Orientation alignment for the cube in hand and goal cube
quat_diff = quat_mul(object_rot, quat_conjugate(target_rot))
# Match Torch env convention: uses indices [1:4] for the vector part (see `rotation_distance` in Torch env).
v_norm = wp.sqrt(quat_diff[1] * quat_diff[1] + quat_diff[2] * quat_diff[2] + quat_diff[3] * quat_diff[3])
v_norm = wp.min(v_norm, wp.float32(1.0))
return wp.float32(2.0) * wp.asin(v_norm)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

vector part indexing in comment inconsistent with implementation

The comment says "uses indices [1:4]" but the implementation correctly uses indices 1, 2, 3 (which is [1:4) in Python slicing). The comment should clarify this is the xyz components (indices 1,2,3) not including the w component (index 3 in some conventions).

Suggested change
def rotation_distance(object_rot: wp.quatf, target_rot: wp.quatf) -> wp.float32:
# Orientation alignment for the cube in hand and goal cube
quat_diff = quat_mul(object_rot, quat_conjugate(target_rot))
# Match Torch env convention: uses indices [1:4] for the vector part (see `rotation_distance` in Torch env).
v_norm = wp.sqrt(quat_diff[1] * quat_diff[1] + quat_diff[2] * quat_diff[2] + quat_diff[3] * quat_diff[3])
v_norm = wp.min(v_norm, wp.float32(1.0))
return wp.float32(2.0) * wp.asin(v_norm)
# Orientation alignment for the cube in hand and goal cube
quat_diff = quat_mul(object_rot, quat_conjugate(target_rot))
# Match Torch env convention: uses xyz components (indices 1, 2, 3) for the vector part (see `rotation_distance` in Torch env).
v_norm = wp.sqrt(quat_diff[1] * quat_diff[1] + quat_diff[2] * quat_diff[2] + quat_diff[3] * quat_diff[3])
v_norm = wp.min(v_norm, wp.float32(1.0))
return wp.float32(2.0) * wp.asin(v_norm)

# unit vectors
self.x_unit_vecs = wp.zeros(self.num_envs, dtype=wp.vec3f, device=self.device)
self.y_unit_vecs = wp.zeros(self.num_envs, dtype=wp.vec3f, device=self.device)
self.z_unit_vecs = wp.zeros(self.num_envs, dtype=wp.vec3f, device=self.device)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

z_unit_vecs initialized but never used

The z-axis unit vector is initialized but never referenced in any kernel or method. Consider removing it or adding a comment explaining why it's reserved for future use.

Copy link
Collaborator

@AntoineRichard AntoineRichard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of nits around the warp code. Otherwise it looks good to me.

Comment on lines 555 to 564
@wp.func
def quat_mul(q1: wp.quatf, q2: wp.quatf) -> wp.quatf:
# Hamilton product for quaternions in (x, y, z, w).
x1, y1, z1, w1 = q1[0], q1[1], q1[2], q1[3]
x2, y2, z2, w2 = q2[0], q2[1], q2[2], q2[3]
x = w1 * x2 + x1 * w2 + y1 * z2 - z1 * y2
y = w1 * y2 - x1 * z2 + y1 * w2 + z1 * x2
z = w1 * z2 + x1 * y2 - y1 * x2 + z1 * w2
w = w1 * w2 - x1 * x2 - y1 * y2 - z1 * z2
return wp.quatf(x, y, z, w)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is already native support for that in warp. q1*q2 does it.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed

Comment on lines 551 to 553
def quat_conjugate(q: wp.quatf) -> wp.quatf:
return wp.quatf(-q[0], -q[1], -q[2], q[3])

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is already warp support for that: wp.quat_inverse()

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed

Comment on lines 567 to 573
@wp.func
def quat_from_angle_axis(angle: wp.float32, axis: wp.vec3f) -> wp.quatf:
# axis assumed to be unit-length in this task.
half = angle * wp.float32(0.5)
s = wp.sin(half)
c = wp.cos(half)
return wp.quatf(axis[0] * s, axis[1] * s, axis[2] * s, c)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is already a function for that: quat_from_axis_angle

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed

Comment on lines 635 to 644
wp.launch(
initialize_xyz_unit_vecs,
dim=self.num_envs,
inputs=[
self.x_unit_vecs,
self.y_unit_vecs,
self.z_unit_vecs,
],
device=self.device,
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be replaced by

self.x_unit_vec = wp.vec3f(1.0,0.0,0.0)
self.y_unit_vec = wp.vec3f(0.0,1.0,0.0)
self.z_unit_vec = wp.vec3f(0.0,0.0,1.0)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The torch env do have per env unit vector for some reason. Confirming if that's unnecessary now.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Comment on lines 130 to 131
x_unit_vecs: wp.array(dtype=wp.vec3f),
y_unit_vecs: wp.array(dtype=wp.vec3f),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These do not need to be arrays.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Comment on lines +346 to +347
object_pos: wp.array(dtype=wp.vec3f),
object_rot: wp.array(dtype=wp.quatf),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These could be fed as object_pose and a transformf. The risk is that if the tensor if not contiguous we are launching kernels to split the poses.

Comment on lines +348 to +349
object_linvel: wp.array(dtype=wp.vec3f),
object_angvel: wp.array(dtype=wp.vec3f),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These could be fed as a spatial vector directly. The risk is that if the tensor if not contiguous we are launching kernels to split the velocities.

Comment on lines +461 to +462
object_pos: wp.array(dtype=wp.vec3f),
object_rot: wp.array(dtype=wp.quatf),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here could be a transform directly.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Albeit there is no much gain since there is no transform ops.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea. In this env, it looks like there's no direct operation on transformf. Converting to a transform type would require extracting pos and rot everywhere, which might be less convenient. Do you think it's required?

Comment on lines +463 to +464
target_pos: wp.array(dtype=wp.vec3f),
target_rot: wp.array(dtype=wp.quatf),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could also be a transform

Comment on lines 799 to 801
float(self.cfg.dist_reward_scale),
float(self.cfg.rot_reward_scale),
float(self.cfg.rot_eps),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this float conversion needed?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed

@hujc7
Copy link
Author

hujc7 commented Jan 22, 2026

Performance summary (allegro_torch_300_runallegro_warp_noSampleFix_300_run)

Metric Torch Warp Δ% (torch→warp)
Action processing mean (us, N=9600) 1502.42 36.77 -97.55%
Newton simulation mean (us, N=9600) 17604.23 17201.77 -2.29%
Post-processing mean (us, N=4800) 6708.68 168.72 -97.49%
Total step mean (us, N=4800) 45722.28 35529.03 -22.29%
Throughput (steps/s) 70185 85147 +21.32%
Iteration time (s) 0.93 0.77 -17.20%
Collection time (s) 0.762 0.600 -21.26%
Learning time (s) 0.171 0.170 -0.58%

Δ% is computed as (((\text{warp}-\text{torch})/\text{torch})\times 100%), so negative time Δ% = less time (better).

The potential joint sampling issue is summarized in #4404
It does seem that updating sampling, which previous included in the performance stats, puts initial configuration into a harder case such that Newton simulation takes more time. Removing the sampling fix puts warp and torch into the same condition for comparision.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

isaac-lab Related to Isaac Lab team

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants