This work uses Ditto: Building Digital Twins of Articulated Objects from Interaction to create a segmented mesh reconstruction of articulated objects, along with axis parameters for prismatic or rotational joints. Ditto needs two distinct inputs as point clouds from the object, where one input is at rest state and the other after interacting with the mobile part of the object. The scanning is performed using the HoloLens 2, joint axis result from Ditto can be seen in the HoloLens for a double check and correction. Finally, a URDF model of the object is saved.
There are different packages that need to be installed.
- Create a conda environment and install required packages.
conda env create -f conda_env_gpu.yaml -n Ditto
- Install ROS melodic.
- Create a catkin workspace and add the folders catkin_src, odio_urdf, particle_filter and articulation_model_msgs of this repository to the src folder of your catkin workspace.
- Compile the catkin workspace using catkin_make.
The unity project is located in the QRCodeXRSDX folder, the following packages were used in this project:
- hl2ss: follow the instructions in the hl2ss_test folder on this repository to correctly install it in Unity if necessary. This folder is also combined with Ditto.
- Unity-Robotics-Hub: follow the instructions of the Unity-Robotics-Hub to setup the ROS connection and to create the ROS messages.
- QR tracking with HoloLens: the tutorial is here.
-
Follow these instructions from hl2ss to create the necessary calibration folders for the HoloLens camera.
-
Follow the instructions from Ditto to download the pre-trained model and saved it into hl2ss_test/viewer/Data (You have to create the Data folder). Also make sure you Build the ConvONets dependents.
The Unity application should look as follows:

The ground truth used is from the work “Active articulation model estimation through interactive perception”, where we follow the reading of a QR code placed on the
mobile part of the object using the HoloLens.

Follow these steps to visualize the joint axis from the Ground Truth:
- Run the HoloLens application.
- Press the Scan botton placed on the right of the HoloLens application panel.
- Open a terminal on your server, source your terminal to the catkin workspace so it can work properly with ROS:
cd path_to_catkin_workspace
source devel/setup.bash
- Activate the communication between the HoloLens and the server using ROS
roslaunch ros_tcp_endpoint endpoint.launch
- Open another terminal and repeat point 3.
- Move the directory to the catkin workspace.
- Run the c++ executable that process the particle filter with:
./devel/lib/particle_filter/articulation_testing_node
- Open another terminal and repeat point 3. This starts a service that is in charge of moving the joint axis based on the results of the particle filter.
- Run the service script, which is a python script:
cd src/particle_filter/src/
python SubscriberServiceAxis.py
- Start moving the screen while making sure that the HoloLens reads the QR code on the screen.
- With this you have to see the joint axis move through the HoloLens.
- Open a terminal on your server, source your terminal to the catkin workspace so it can work properly with ROS:
cd path_to_catkin_workspace
source devel/setup.bash
- Open another two more terminals and source them too using the point 1.
- Move the directory of the last two terminals to the following directory:
cd ..
cd hl2ss_test/viewer
- Start the HoloLens application.
- Start the script that builds the point clouds and controls the sequence from the server side:
python open3d_integrator_pv_ROS.py
- Start Ditto:
python pcd_listener.py
- Place the QR code on a flat surface and press the Scan button and read the QR code.
- See the object of interest with the HoloLens, so the point cloud includes the object before we create the bounding box.
- Press the "Start Bbox scan". A window on the server will appear.
- Select corners of the object in the server side while holding the Shift key, it should look like this:

- When you have selected all points, press Q to close the window and save the bounding box.
- Press Stop Bbox scan.
- Place the object in the rest state so we can start scanning.
- Press Start First Interaction so you can record the point cloud before interaction, try to look at the object for some seconds.
- Press Stop First Interaction.
- Move the mobile part of the object so the second point cloud is different than the first one.
- Press Start Second Interaction so you can record the point cloud after interaction, try to look at the object for some seconds.
- Press Send to Ditto. The saved point clouds will be sent to Ditto to process.
- Once Ditto process is finished, you can first visualize the results on windows on the server. You can close them pressing the key Q. Once you closed all windows, the joint axis should move to the result from Ditto.
- If you want to manipulate the joint axis, press the botton Move the Axis
- Now you can move the axis with your fingers as follows:

- Now a URDF model will be saved in hl2ss_test/viewer/URDF/my_robot_Ditto.urdf
You can obtain results of URDF models like the following:

[1] Z. Jiang, C.-C. Hsu, and Y. Zhu, “Ditto: Building digital twins of articulated objects from interaction.” arXiv, 2022.
[2] K. Hausman, S. Niekum, S. Osentoski, and G. S. Sukhatme, “Active articulation model estimation through interactive perception,” in 2015 IEEE International Conference on Robotics and Automation (ICRA), 2015, pp. 3305–3312.