
[Docs] Update docs and fix some bugs with autodoc #487

Merged
merged 4 commits into from
Aug 9, 2024
19 changes: 2 additions & 17 deletions .readthedocs.yaml
@@ -1,32 +1,17 @@
# .readthedocs.yaml
# Read the Docs configuration file
# See https://docs.readthedocs.io/en/stable/config-file/v2.html for details

# Required
version: 2

# Set the OS, Python version and other tools you might need
build:
os: ubuntu-22.04
tools:
python: "3.9"
# You can also specify other tool versions:
# nodejs: "19"
# rust: "1.64"
# golang: "1.19"

# Build documentation in the "docs/" directory with Sphinx
sphinx:
configuration: docs/source/conf.py

# Optionally build your docs in additional formats such as PDF and ePub
# formats:
# - pdf
# - epub

# Optional but recommended, declare the Python requirements required
# to build your documentation
# See https://docs.readthedocs.io/en/stable/guides/reproducible-builds.html
python:
install:
- requirements: docs/requirements.txt
- method: pip
path: .
2 changes: 1 addition & 1 deletion docs/source/conf.py
@@ -1,6 +1,6 @@
import os
import sys
sys.path.insert(0, os.path.abspath("../../mani_skill"))
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "../../mani_skill"))
__version__ = "3.0.0b5"
# Configuration file for the Sphinx documentation builder.
#
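The `conf.py` change above replaces a path resolved against the current working directory with one anchored at the config file itself. A minimal sketch of the difference (the `mani_skill` directory name is taken from the diff; the variable names here are illustrative only):

```python
import os

# os.path.abspath("../../mani_skill") resolves against the *current working
# directory*, so the result changes depending on where Sphinx is invoked from.
cwd_relative = os.path.abspath("../../mani_skill")

# Joining on this file's own directory pins the path to the config file's
# location, matching the behavior of the fixed conf.py line.
here = os.path.dirname(os.path.abspath(__file__))
file_relative = os.path.join(here, "../../mani_skill")
```

This is why the fixed line works no matter which directory the docs build is launched from.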
1 change: 1 addition & 0 deletions docs/source/robots/index.md
@@ -18,4 +18,5 @@
quadrupeds/index
arms/index
humanoids/index
dextrous_hands/index
other/index
```
2 changes: 1 addition & 1 deletion docs/source/robots/other/index.md
@@ -2,7 +2,7 @@

Robots that don't fit into other standard categories (e.g. floating grippers)

## Fetch Robot
## Panda Gripper

```{figure} ../images/floating_panda_gripper.png
```
1 change: 0 additions & 1 deletion docs/source/user_guide/additional_resources/education.md

This file was deleted.

@@ -8,23 +8,27 @@
Currently we just compare ManiSkill and [IsaacLab](https://github.com/isaac-sim/

Raw benchmark results can be read from the .csv files in the [results folder on GitHub](https://github.com/haosulab/ManiSkill/blob/main/docs/source/user_guide/additional_resources/benchmarking_results). There are also plotted figures in that folder. Below we show a selection of some of the figures/results from testing on an RTX 3080. The figures are also sometimes annotated with the GPU memory usage in GB or the number of parallel environments used for that result.

*Note IsaacLab currently does not support RGB+Depth, or multiple cameras per sub-scene so there may not be results for IsaacLab on some figures
*Note: IsaacLab currently does not support RGB+Depth or multiple cameras per sub-scene, so there may not be results for IsaacLab on some figures.

### Cartpole Balance

#### State

CartPoleBalance simulation only performance results showing FPS vs number of environments, annotated by GPU memory usage in GB on top of data points.
:::{figure} benchmarking_results/rtx_3080/fps:num_envs_state.png
:::

#### RGB

CartPoleBalance sim+rendering performance results showing FPS vs number of environments, annotated by GPU memory usage in GB on top of data points.
:::{figure} benchmarking_results/rtx_3080/fps:num_envs_1x256x256_rgb.png
:::

CartPoleBalance sim+rendering performance results showing FPS vs number of cameras, annotated by max number of parallel environments runnable under 16GB of GPU memory.
:::{figure} benchmarking_results/rtx_3080/fps:num_cameras_rgb.png
:::

CartPoleBalance sim+rendering performance results showing FPS vs camera size, annotated by max number of parallel environments runnable under 16GB of GPU memory.
:::{figure} benchmarking_results/rtx_3080/fps:camera_size_rgb.png
:::

39 changes: 0 additions & 39 deletions docs/source/user_guide/algorithms_and_models/baselines.md

This file was deleted.

7 changes: 0 additions & 7 deletions docs/source/user_guide/algorithms_and_models/index.md

This file was deleted.

99 changes: 0 additions & 99 deletions docs/source/user_guide/benchmark/online_leaderboard.md

This file was deleted.

1 change: 1 addition & 0 deletions docs/source/user_guide/index.md
@@ -42,6 +42,7 @@
concepts/index
datasets/index
data_collection/index
reinforcement_learning/index
learning_from_demos/index
```

```{toctree}
@@ -1,15 +1,14 @@
# Learning from Demonstrations / Imitation Learning
# Baselines

We provide a number of different baselines spanning different categories of learning from demonstrations research: Behavior Cloning / Supervised Learning, Offline Reinforcement Learning, and Online Learning from Demonstrations.
We provide a number of different baselines spanning different categories of learning from demonstrations research: Behavior Cloning / Supervised Learning, Offline Reinforcement Learning, and Online Learning from Demonstrations. This page is still a WIP as we finish running experiments and establish clear baselines and benchmarking setups.

As part of these baselines we establish a few standard learning from demonstration benchmarks that cover a wide range of difficulty (easy to solve for verification but not saturated) and diversity in types of demonstrations (human collected, motion planning collected, neural net policy generated)
<!-- As part of these baselines we establish a few standard learning from demonstration benchmarks that cover a wide range of difficulty (easy to solve for verification but not saturated) and diversity in types of demonstrations (human collected, motion planning collected, neural net policy generated) -->

**Behavior Cloning Baselines**
| Baseline | Code | Results |
| ---------------------------------- | ---- | ------- |
| Standard Behavior Cloning (BC) | WIP | WIP |
| Diffusion Policy (DP) | WIP | WIP |
| Action Chunk Transformers (ACT) | WIP | WIP |
| Baseline | Code | Results |
| ------------------------------- | ---- | ------- |
| Standard Behavior Cloning (BC) | WIP | WIP |
| Diffusion Policy (DP) | WIP | WIP |


**Online Learning from Demonstrations Baselines**
9 changes: 9 additions & 0 deletions docs/source/user_guide/learning_from_demos/index.md
@@ -0,0 +1,9 @@
# Learning from Demonstrations / Imitation Learning

ManiSkill supports all kinds of learning from demonstration methods via a unified API and provides multiple ready-to-use, tested baselines for use/comparison. Baseline information can be found in the [baselines page](./baselines.md).

```{toctree}
:titlesonly:

baselines
```
@@ -3,5 +3,4 @@
```{eval-rst}
.. automodule:: mani_skill.envs.sapien_env
:members:
:undoc-members:
```
4 changes: 2 additions & 2 deletions docs/source/user_guide/reinforcement_learning/baselines.md
@@ -1,14 +1,14 @@
# Baselines

We provide a number of different baselines that learn from rewards.
We provide a number of different baselines that learn from rewards via online reinforcement learning.
<!-- For RL baselines that leverage demonstrations see the [learning from demos section](../learning_from_demos/) -->

As part of these baselines we establish standardized [reinforcement learning benchmarks](#standard-benchmark) that cover a wide range of difficulties (easy to solve for verification but not saturated) and diversity in types of robotics tasks, including but not limited to classic control, dextrous manipulation, table-top manipulation, and mobile manipulation.


## Online Reinforcement Learning Baselines

List of already implemented and tested online reinforcement learning baselines
List of already implemented and tested online reinforcement learning baselines. Note that there are also reinforcement learning baselines (offline RL, online imitation learning) that leverage demonstrations; see the [learning from demos page](../learning_from_demos/index.md) for more information.

| Baseline | Code | Results | Paper |
| ------------------------------------------------------------------- | ------------------------------------------------------------------------------ | ------- | ---------------------------------------- |
12 changes: 0 additions & 12 deletions docs/source/user_guide/workflows/index.md

This file was deleted.

15 changes: 9 additions & 6 deletions mani_skill/envs/sapien_env.py
@@ -684,16 +684,19 @@ def _load_lighting(self, options: dict):
# Reset
# -------------------------------------------------------------------------- #
def reset(self, seed=None, options=None):
"""
Reset the ManiSkill environment. If options["env_idx"] is given, will only reset the selected parallel environments. If
"""Reset the ManiSkill environment. If options["env_idx"] is given, will only reset the selected parallel environments. If
options["reconfigure"] is True, will call self._reconfigure() which deletes the entire physx scene and reconstructs everything.
Users building custom tasks generally do not need to override this function.

Returns the first observation and a info dictionary. The info dictionary is of type
```
{
"reconfigure": bool (True if the environment reconfigured. False otherwise)
}
```


.. highlight:: python
.. code-block:: python

{
"reconfigure": bool # (True if the env reconfigured. False otherwise)
}



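The `reset()` contract described in the updated docstring can be sketched with a minimal stub. This is a hypothetical stand-in, not ManiSkill's actual class (which lives in `mani_skill.envs.sapien_env`); it only mirrors the documented `(obs, info)` return shape and the `"reconfigure"` info key:

```python
# Hypothetical stub mirroring the documented reset() contract.
class StubEnv:
    def reset(self, seed=None, options=None):
        options = options or {}
        # Only reconfigure (i.e. rebuild the physx scene) when explicitly asked.
        reconfigured = bool(options.get("reconfigure", False))
        obs = {"state": [0.0, 0.0]}  # placeholder first observation
        info = {"reconfigure": reconfigured}
        return obs, info

obs, info = StubEnv().reset(seed=0, options={"reconfigure": True})
```

With `options={"reconfigure": True}` the returned `info["reconfigure"]` is `True`; with no options it is `False`, matching the docstring's description.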