
Conversation

@matthewtrepte (Contributor) commented Nov 22, 2025

Description

Credit to Wayne Huang + RT team for enabling these features

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (existing functionality will not work without user modification)
  • Documentation update

Screenshots

Please attach before and after screenshots of the change if applicable.

Checklist

  • I have read and understood the contribution guidelines
  • I have run the pre-commit checks with ./isaaclab.sh --format
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • I have updated the changelog and the corresponding version in the extension's config/extension.toml file
  • I have added my name to the CONTRIBUTORS.md or my name already exists there

github-actions bot added the isaac-sim (Related to Isaac Sim team) and isaac-lab (Related to Isaac Lab team) labels on Nov 22, 2025
greptile-apps bot (Contributor) commented Nov 22, 2025

Greptile Overview

Greptile Summary

This PR adds support for three new rendering annotators (DiffuseAlbedoSD, SimpleShadingSD, and depth) to enable vision-based experiments with different input modalities in the Shadow Hand manipulation task.

Key Changes:

  • Registers two new custom AOV annotators (DiffuseAlbedoSD and SimpleShadingSD) in TiledCamera that output 4-channel data, processed down to 3-channel RGB (see the sketch after this list)
  • Adds 5 new gym environment variants for Shadow Hand vision task (segmentation, RGB, diffuse albedo, simple shading, depth)
  • Refactors FeatureExtractor to accept flexible input types (RGB, depth, segmentation, albedo, or simple shading) with configurable channel count
  • Updates rendering mode kit files to control color rendering via rtx.sdg.force.disableColorRender setting
  • Increases camera resolution from 120x120 to 240x240
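
A minimal sketch of that registration flow, assembled from the diff excerpts quoted later in this review; the standalone placement and the example render-product path are assumptions for illustration, only the registry calls themselves come from the PR:

    import numpy as np
    import omni.replicator.core as rep

    # Both AOVs arrive from the renderer as 4-channel uint8 buffers;
    # TiledCamera keeps only the first 3 channels (RGB) downstream.
    rep.AnnotatorRegistry.register_annotator_from_aov(
        aov="DiffuseAlbedoSD", output_data_type=np.uint8, output_channels=4
    )
    rep.AnnotatorRegistry.register_annotator_from_aov(
        aov="SimpleShadingSD", output_data_type=np.uint8, output_channels=4
    )

    # Attach an annotator to the camera's render products, as in the
    # sequence diagram below; the path here is hypothetical.
    render_product_paths = ["/Render/RenderProduct_0"]
    annotator = rep.AnnotatorRegistry.get_annotator("DiffuseAlbedoSD")
    annotator.attach(render_product_paths)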

Issues:

  • Large sections of CNN training and feature extraction code are commented out in shadow_hand_vision_env.py (lines 244-294), making the environments non-functional for actual vision-based learning
  • Observations now use raw object_pose instead of learned embeddings, which may not align with the vision-based learning intent
  • Buffer initialization pattern creates 4-channel tensors then immediately slices to 3 channels (inefficient)

Confidence Score: 2/5

  • This PR has significant concerns due to commented-out functionality that makes the new environments non-functional
  • Score reflects that while the annotator registration code is correctly implemented, the shadow_hand environment integration has extensive commented-out code that disables the CNN feature extraction, making the new vision-based environments incomplete. The observations now use raw object_pose instead of learned embeddings, which defeats the purpose of vision-based learning. These issues need resolution before merge.
  • Pay close attention to source/isaaclab_tasks/isaaclab_tasks/direct/shadow_hand/shadow_hand_vision_env.py - the commented-out code needs to be either removed or properly integrated

Important Files Changed

File Analysis

| Filename | Score | Overview |
| --- | --- | --- |
| source/isaaclab/isaaclab/sensors/camera/tiled_camera.py | 3/5 | Adds registration and handling for the DiffuseAlbedoSD and SimpleShadingSD annotators, with proper channel processing |
| source/isaaclab_tasks/isaaclab_tasks/direct/shadow_hand/feature_extractor.py | 2/5 | Refactors the feature extractor to support multiple input types with conditional logic, but the CNN training code in the environment is commented out |
| source/isaaclab_tasks/isaaclab_tasks/direct/shadow_hand/shadow_hand_vision_env.py | 1/5 | Adds new environment variants for the different annotators, but has extensive commented-out feature-extractor integration code |

Sequence Diagram

sequenceDiagram
    participant Env as ShadowHandVisionEnv
    participant Camera as TiledCamera
    participant Registry as AnnotatorRegistry
    participant Annotator as Custom Annotator
    participant Extractor as FeatureExtractor
    
    Note over Camera: Initialization
    Camera->>Registry: register_annotator_from_aov("DiffuseAlbedoSD")
    Camera->>Registry: register_annotator_from_aov("SimpleShadingSD")
    Camera->>Registry: get_annotator("DiffuseAlbedoSD"/"SimpleShadingSD")
    Registry-->>Camera: annotator instance
    Camera->>Annotator: attach(render_product_paths)
    
    Note over Env: Step Execution
    Env->>Camera: update(dt)
    Camera->>Annotator: get_data()
    Annotator-->>Camera: raw 4-channel RGBA data
    Note over Camera: Process channels
    Camera->>Camera: Convert to 4-channel uint8
    Camera->>Camera: Keep only first 3 channels (RGB)
    Camera->>Camera: Reshape to tiled format
    Camera-->>Env: data.output["diffuse_albedo"/"simple_shading"]
    
    Env->>Extractor: _preprocess_images(albedo_img/simple_shading_img)
    Extractor->>Extractor: Normalize to [0, 1]
    Extractor-->>Env: preprocessed images
    
    alt write_image_to_file enabled
        Env->>Extractor: _save_images()
        Extractor->>Extractor: Save to disk
    end
    
    Note over Env: Feature extraction disabled<br/>(commented out in current PR)
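For concreteness, a hedged sketch of the Step Execution path in the diagram; get_vision_obs is a hypothetical helper, and the output key and [0, 1] normalization are taken from this review's summary:

    import torch

    # Hypothetical helper mirroring the Step Execution section above.
    def get_vision_obs(tiled_camera, dt: float) -> torch.Tensor:
        tiled_camera.update(dt)
        # uint8 tensor of shape (num_envs, height, width, 3): the camera has
        # already dropped the alpha channel from the raw 4-channel AOV data
        albedo = tiled_camera.data.output["diffuse_albedo"]
        # normalize to [0, 1], as the FeatureExtractor preprocessing does
        return albedo.float() / 255.0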

greptile-apps bot left a comment


7 files reviewed, 4 comments


Comment on lines 244 to 294
# train CNN to regress on keypoint positions
# if (
#     "rgb" in self.cfg.tiled_camera.data_types
#     and "depth" in self.cfg.tiled_camera.data_types
#     and "semantic_segmentation" in self.cfg.tiled_camera.data_types
# ):
#     pose_loss, embeddings = self.feature_extractor.step(
#         self._tiled_camera.data.output["rgb"],
#         self._tiled_camera.data.output["depth"],
#         self._tiled_camera.data.output["semantic_segmentation"][..., :3],
#         object_pose,
#     )
# elif "rgb" in self.cfg.tiled_camera.data_types:
#     pose_loss, embeddings = self.feature_extractor.step(
#         rgb_img=self._tiled_camera.data.output["rgb"], gt_pose=object_pose
#     )
# elif "depth" in self.cfg.tiled_camera.data_types:
#     pose_loss, embeddings = self.feature_extractor.step(
#         depth_img=self._tiled_camera.data.output["depth"], gt_pose=object_pose
#     )
# elif "semantic_segmentation" in self.cfg.tiled_camera.data_types:
#     pose_loss, embeddings = self.feature_extractor.step(
#         segmentation_img=self._tiled_camera.data.output["semantic_segmentation"][..., :3], gt_pose=object_pose
#     )
# elif "diffuse_albedo" in self.cfg.tiled_camera.data_types:
#     pose_loss, embeddings = self.feature_extractor.step(
#         albedo_img=self._tiled_camera.data.output["diffuse_albedo"], gt_pose=object_pose
#     )
# elif "simple_shading" in self.cfg.tiled_camera.data_types:
#     pose_loss, embeddings = self.feature_extractor.step(
#         simple_shading_img=self._tiled_camera.data.output["simple_shading"], gt_pose=object_pose
#     )

# self.embeddings = embeddings.clone().detach()
# # compute keypoints for goal cube
# compute_keypoints(
#     pose=torch.cat((torch.zeros_like(self.goal_pos), self.goal_rot), dim=-1), out=self.goal_keypoints
# )

obs = torch.cat(
    (
        self.embeddings,
        # self.embeddings,
        object_pose,
        self.goal_keypoints.view(-1, 24),
    ),
    dim=-1,
)

# log pose loss from CNN training
if "log" not in self.extras:
    self.extras["log"] = dict()
self.extras["log"]["pose_loss"] = pose_loss
# if "log" not in self.extras:
#     self.extras["log"] = dict()
# self.extras["log"]["pose_loss"] = pose_loss
style: Large block of commented-out code that handles CNN training and feature extraction. Either remove if no longer needed, or re-enable if this functionality is required
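
One way to re-enable the block without the long elif chain is a small dispatch table — a sketch only, assuming the refactored FeatureExtractor.step accepts any subset of these keyword arguments (the keyword names are taken from the commented-out code above):

    # inside the environment's observation computation;
    # maps camera data type -> step() keyword argument
    _KWARG_BY_TYPE = {
        "rgb": "rgb_img",
        "depth": "depth_img",
        "semantic_segmentation": "segmentation_img",
        "diffuse_albedo": "albedo_img",
        "simple_shading": "simple_shading_img",
    }
    kwargs = {}
    for data_type, kwarg in _KWARG_BY_TYPE.items():
        if data_type in self.cfg.tiled_camera.data_types:
            img = self._tiled_camera.data.output[data_type]
            if data_type == "semantic_segmentation":
                img = img[..., :3]  # drop the alpha channel
            kwargs[kwarg] = img
    pose_loss, embeddings = self.feature_extractor.step(gt_pose=object_pose, **kwargs)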

Comment on lines +368 to +372
if "diffuse_albedo" in self.cfg.data_types:
data_dict["diffuse_albedo"] = torch.zeros(
(self._view.count, self.cfg.height, self.cfg.width, 4), device=self.device, dtype=torch.uint8
).contiguous()
data_dict["diffuse_albedo"] = data_dict["diffuse_albedo"][..., :3]
style: Buffer is created with 4 channels, then immediately sliced to 3 channels. Consider initializing with 3 channels directly for clarity

Suggested change
-if "diffuse_albedo" in self.cfg.data_types:
-    data_dict["diffuse_albedo"] = torch.zeros(
-        (self._view.count, self.cfg.height, self.cfg.width, 4), device=self.device, dtype=torch.uint8
-    ).contiguous()
-    data_dict["diffuse_albedo"] = data_dict["diffuse_albedo"][..., :3]
+if "diffuse_albedo" in self.cfg.data_types:
+    data_dict["diffuse_albedo"] = torch.zeros(
+        (self._view.count, self.cfg.height, self.cfg.width, 3), device=self.device, dtype=torch.uint8
+    ).contiguous()

Comment on lines +373 to +377
if "simple_shading" in self.cfg.data_types:
data_dict["simple_shading"] = torch.zeros(
(self._view.count, self.cfg.height, self.cfg.width, 4), device=self.device, dtype=torch.uint8
).contiguous()
data_dict["simple_shading"] = data_dict["simple_shading"][..., :3]
style: Buffer is created with 4 channels, then immediately sliced to 3 channels. Consider initializing with 3 channels directly for clarity

Suggested change
-if "simple_shading" in self.cfg.data_types:
-    data_dict["simple_shading"] = torch.zeros(
-        (self._view.count, self.cfg.height, self.cfg.width, 4), device=self.device, dtype=torch.uint8
-    ).contiguous()
-    data_dict["simple_shading"] = data_dict["simple_shading"][..., :3]
+if "simple_shading" in self.cfg.data_types:
+    data_dict["simple_shading"] = torch.zeros(
+        (self._view.count, self.cfg.height, self.cfg.width, 3), device=self.device, dtype=torch.uint8
+    ).contiguous()

Comment on lines 283 to 290
obs = torch.cat(
    (
        self.embeddings,
        # self.embeddings,
        object_pose,
        self.goal_keypoints.view(-1, 24),
    ),
    dim=-1,
)
style: Observation now directly uses object_pose instead of learned embeddings. Verify that this matches the intended observation space and algorithm expectations

matthewtrepte changed the title from "New Annotators for SimpleShading, DiffuseAlbedo, and Fast Depth" to "New Annotators for SimpleShading, DiffuseAlbedo, and Fast Depth for ShadowHandEnv" on Nov 22, 2025
@@ -1,3 +1,5 @@
rtx.sdg.force.disableColorRender=false
Should this be set to False only if rendering is not "rgb" in the tiledcamera? Wondering if the logic belongs here or to the camera class.
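
If the logic moved to the camera class, it could look roughly like this — a sketch assuming the kit-file key maps to the carb settings path /rtx/sdg/force/disableColorRender in the usual dotted-to-slashed way:

    import carb.settings

    # Keep the color render pass enabled only when an RGB output is requested;
    # treating "rgb" as the only color-dependent data type is an assumption.
    needs_color = "rgb" in self.cfg.data_types
    carb.settings.get_settings().set_bool(
        "/rtx/sdg/force/disableColorRender", not needs_color
    )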

)
self._render_product_paths = [rp.path]

rep.AnnotatorRegistry.register_annotator_from_aov(
Please add comments on what these are and why are they needed.

aov="DiffuseAlbedoSD", output_data_type=np.uint8, output_channels=4
)
rep.AnnotatorRegistry.register_annotator_from_aov(
aov="SimpleShadingSD", output_data_type=np.uint8, output_channels=4
Should output channels be 3 directly? Later on slicing is done to that anyway?
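
If these AOVs can be registered with 3 channels directly — an assumption; the renderer may only emit RGBA here, which would explain the 4-then-slice pattern — the later slicing could be dropped, and the comments requested above could be added at the same time:

    # DiffuseAlbedoSD: base-color (albedo) AOV without lighting;
    # SimpleShadingSD: cheaply shaded render. These descriptions are
    # assumptions for illustration, not taken from the PR.
    rep.AnnotatorRegistry.register_annotator_from_aov(
        aov="DiffuseAlbedoSD", output_data_type=np.uint8, output_channels=3
    )
    rep.AnnotatorRegistry.register_annotator_from_aov(
        aov="SimpleShadingSD", output_data_type=np.uint8, output_channels=3
    )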

Is this only for Isaac Sim 5.1?

Returns:
    tuple[torch.Tensor, torch.Tensor, torch.Tensor]: Preprocessed RGB, depth, and segmentation
    tuple[torch.Tensor | None, torch.Tensor | None, torch.Tensor | None, torch.Tensor | None, torch.Tensor | None]:
We don't do type annotations on docstrings since they are harder to maintain. Please remove instances of that everywhere.
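
For instance, the Returns section could read as follows — a sketch of the requested style, with types left to the function signature:

    Returns:
        Preprocessed RGB, depth, segmentation, albedo, and simple-shading
        images; entries are None for data types that are not enabled.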
