nuScenes to KITTI conversion (nutonomy#153)
* Corrected comment
* Clarification about zipped result file
* Added copy method to box class
* Added nuscenes to kitti conversion
* Added typing hints
* is_mini not currently working, attempting train/val
* Removed test, more flexible splits
* Documentation, command line arguments
* Render example images
* Render all images
* Moved kitti to utils
* Command line argument to choose whether to render images to disk
* Added documentation, command-line argument and made a method public
* Cleanup
* add license
* Renaming
* Renaming
* Simpler script for nuScenes to KITTI conversion
* Set default path
* Create output folder
* Comments
* Specify splits when loading result file
* Workaround for mini splits
* Fix bug when creating output folder
* Indent jsons, set score to 0 by default
* Parse score by default
* Conversion from str to float
* Set dummy velocity
* Convert tuple to ndarray
* Delete roundtrip script
* Example code for visualization
* Added separate render method
* Close windows and limit number of images to render
* Make only one dir
* Removed image_sizes.json file and renamed label folder to label_2
* More explicit parameters, single split
* Updated splits elsewhere
* Updated table names
* Remove verify method
* Update table names
* Use proper negation operator
* Updated table names
* Updated documentation
* Use nuScenes category colors
* Skip categories that are not part of the nuScenes detection challenge
* Fix label colors
* Updated comment
* Simplify token to be nuscenes sample_token
* Readd split to token
* Remove tokens.txt
* Remove extension from token
* Fix bug if no scores are available, remove unused argument
* Comment on KITTI parameters
* Addressed review comments
1 parent 3de8a5e · commit da042f3 · 6 changed files with 863 additions and 7 deletions.
@@ -0,0 +1,347 @@
# nuScenes dev-kit.
# Code written by Holger Caesar, 2019.
# Licensed under the Creative Commons [see licence.txt]

"""
This script converts nuScenes data to KITTI format and KITTI results to nuScenes.
It is used for compatibility with software that uses KITTI-style annotations.

We do not encourage this, as:
- KITTI has only front-facing cameras, whereas nuScenes has a 360 degree horizontal fov.
- KITTI has no radar data.
- The nuScenes database format is more modular.
- KITTI fields like occluded and truncated cannot be exactly reproduced from nuScenes data.
- KITTI has different categories.

Limitations:
- We don't specify the KITTI imu_to_velo_kitti projection in this code base.
- We map nuScenes categories to nuScenes detection categories, rather than KITTI categories.
- Attributes are not part of KITTI and therefore set to '' in the nuScenes result format.
- Velocities are not part of KITTI and therefore set to 0 in the nuScenes result format.
- This script uses the `train` and `val` splits of nuScenes, whereas standard KITTI has `training` and `testing` splits.

This script includes three main functions:
- nuscenes_gt_to_kitti(): Converts nuScenes GT annotations to KITTI format.
- render_kitti(): Renders the annotations of the (generated or real) KITTI dataset.
- kitti_res_to_nuscenes(): Converts a KITTI detection result to the nuScenes detection results format.

To launch these scripts, run:
- python export_kitti.py nuscenes_gt_to_kitti --nusc_kitti_dir ~/nusc_kitti
- python export_kitti.py render_kitti --nusc_kitti_dir ~/nusc_kitti
- python export_kitti.py kitti_res_to_nuscenes --nusc_kitti_dir ~/nusc_kitti
To work with the original KITTI dataset, use these parameters:
- --nusc_kitti_dir /data/sets/kitti --split training

See https://www.nuscenes.org/object-detection for more information on the nuScenes result format.
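
A sketch of the submission file written by kitti_res_to_nuscenes() (field values are placeholders, not real output):
{
    "meta": {"use_camera": false, "use_lidar": true, "use_radar": false, "use_map": false, "use_external": false},
    "results": {
        "<sample_token>": [
            {
                "sample_token": "<sample_token>",
                "translation": [x, y, z],
                "size": [width, length, height],
                "rotation": [w, x, y, z],
                "velocity": [vx, vy],
                "detection_name": "car",
                "detection_score": 0.0,
                "attribute_name": ""
            }
        ]
    }
}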
""" | ||
import os
import json
from typing import List, Dict, Any, Optional

from pyquaternion import Quaternion
import numpy as np
import fire
from PIL import Image
import matplotlib.pyplot as plt

from nuscenes.nuscenes import NuScenes
from nuscenes.utils.geometry_utils import transform_matrix
from nuscenes.utils.geometry_utils import BoxVisibility
from nuscenes.utils.data_classes import LidarPointCloud, Box
from nuscenes.utils.splits import create_splits_logs
from nuscenes.utils.kitti import KittiDB
from nuscenes.eval.detection.utils import category_to_detection_name


class KittiConverter:
    def __init__(self,
                 nusc_kitti_dir: str = '~/nusc_kitti',
                 cam_name: str = 'CAM_FRONT',
                 lidar_name: str = 'LIDAR_TOP',
                 image_count: int = 10,
                 nusc_version: str = 'v1.0-mini',
                 split: str = 'mini_train'):
        """
        :param nusc_kitti_dir: Where to write the KITTI-style annotations.
        :param cam_name: Name of the camera to export. Note that only one camera is allowed in KITTI.
        :param lidar_name: Name of the lidar sensor.
        :param image_count: Number of images to convert.
        :param nusc_version: nuScenes version to use.
        :param split: Dataset split to use.
        """
        self.nusc_kitti_dir = os.path.expanduser(nusc_kitti_dir)
        self.cam_name = cam_name
        self.lidar_name = lidar_name
        self.image_count = image_count
        self.nusc_version = nusc_version
        self.split = split

        # Create nusc_kitti_dir.
        if not os.path.isdir(self.nusc_kitti_dir):
            os.makedirs(self.nusc_kitti_dir)

        # Select subset of the data to look at.
        self.nusc = NuScenes(version=nusc_version)

    def nuscenes_gt_to_kitti(self) -> None:
        """
        Converts nuScenes GT annotations to KITTI format.
        """
        kitti_to_nu_lidar = Quaternion(axis=(0, 0, 1), angle=np.pi / 2)
        kitti_to_nu_lidar_inv = kitti_to_nu_lidar.inverse
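        # The two lidar frames differ by a 90 degree yaw: a +90 degree rotation about
        # the z-axis maps KITTI lidar coordinates to nuScenes lidar coordinates.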

        token_idx = 0  # Start tokens from 0.

        # Get assignment of scenes to splits.
        split_logs = create_splits_logs(self.split, self.nusc)

        # Create output folders.
        label_folder = os.path.join(self.nusc_kitti_dir, self.split, 'label_2')
        calib_folder = os.path.join(self.nusc_kitti_dir, self.split, 'calib')
        image_folder = os.path.join(self.nusc_kitti_dir, self.split, 'image_2')
        lidar_folder = os.path.join(self.nusc_kitti_dir, self.split, 'velodyne')
        for folder in [label_folder, calib_folder, image_folder, lidar_folder]:
            if not os.path.isdir(folder):
                os.makedirs(folder)

        # Use only the samples from the current split.
        sample_tokens = self._split_to_samples(split_logs)
        sample_tokens = sample_tokens[:self.image_count]

        tokens = []
        for sample_token in sample_tokens:

            # Get sample data.
            sample = self.nusc.get('sample', sample_token)
            sample_annotation_tokens = sample['anns']
            cam_front_token = sample['data'][self.cam_name]
            lidar_token = sample['data'][self.lidar_name]

            # Retrieve sensor records.
            sd_record_cam = self.nusc.get('sample_data', cam_front_token)
            sd_record_lid = self.nusc.get('sample_data', lidar_token)
            cs_record_cam = self.nusc.get('calibrated_sensor', sd_record_cam['calibrated_sensor_token'])
            cs_record_lid = self.nusc.get('calibrated_sensor', sd_record_lid['calibrated_sensor_token'])

            # Combine transformations and convert to KITTI format.
            # Note: cam uses same conventions in KITTI and nuScenes.
            lid_to_ego = transform_matrix(cs_record_lid['translation'], Quaternion(cs_record_lid['rotation']),
                                          inverse=False)
            ego_to_cam = transform_matrix(cs_record_cam['translation'], Quaternion(cs_record_cam['rotation']),
                                          inverse=True)
            velo_to_cam = np.dot(ego_to_cam, lid_to_ego)

            # First rotate from the KITTI lidar frame into the nuScenes lidar frame, then apply velo_to_cam.
            velo_to_cam_kitti = np.dot(velo_to_cam, kitti_to_nu_lidar.transformation_matrix)
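            # i.e. velo_to_cam_kitti = ego_to_cam @ lid_to_ego @ kitti_to_nu_lidar,
            # which maps points given in the KITTI lidar frame directly into the camera frame.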

            # Currently not used.
            imu_to_velo_kitti = np.zeros((3, 4))  # Dummy values.
            r0_rect = Quaternion(axis=[1, 0, 0], angle=0)  # Dummy values.

            # Projection matrix.
            p_left_kitti = np.zeros((3, 4))
            p_left_kitti[:3, :3] = cs_record_cam['camera_intrinsic']  # Cameras are always rectified.

            # Create KITTI style transforms.
            velo_to_cam_rot = velo_to_cam_kitti[:3, :3]
            velo_to_cam_trans = velo_to_cam_kitti[:3, 3]

            # Check that the rotation has the same format as in KITTI.
            assert (velo_to_cam_rot.round(0) == np.array([[0, -1, 0], [0, 0, -1], [1, 0, 0]])).all()
            assert (velo_to_cam_trans[1:3] < 0).all()

            # Retrieve the token from the lidar.
            # Note that this may be confusing as the filename of the camera will include the timestamp of the lidar,
            # not the camera.
            filename_cam_full = sd_record_cam['filename']
            filename_lid_full = sd_record_lid['filename']
            # token = '%06d' % token_idx  # Alternative to use KITTI names.
            token_idx += 1

            # Convert image (jpg to png).
            src_im_path = os.path.join(self.nusc.dataroot, filename_cam_full)
            dst_im_path = os.path.join(image_folder, sample_token + '.png')
            if not os.path.exists(dst_im_path):
                im = Image.open(src_im_path)
                im.save(dst_im_path, "PNG")

            # Convert lidar.
            # Note that we are only using a single sweep, instead of the commonly used n sweeps.
            src_lid_path = os.path.join(self.nusc.dataroot, filename_lid_full)
            dst_lid_path = os.path.join(lidar_folder, sample_token + '.bin')
            assert not dst_lid_path.endswith('.pcd.bin')
            pcl = LidarPointCloud.from_file(src_lid_path)
            pcl.rotate(kitti_to_nu_lidar_inv.rotation_matrix)  # In KITTI lidar frame.
            with open(dst_lid_path, "wb") as lid_file:  # Binary mode, since tofile() writes raw bytes.
                pcl.points.T.tofile(lid_file)

            # Add to tokens.
            tokens.append(sample_token)

            # Create calibration file.
            kitti_transforms = dict()
            kitti_transforms['P0'] = np.zeros((3, 4))  # Dummy values.
            kitti_transforms['P1'] = np.zeros((3, 4))  # Dummy values.
            kitti_transforms['P2'] = p_left_kitti  # Left camera transform.
            kitti_transforms['P3'] = np.zeros((3, 4))  # Dummy values.
            kitti_transforms['R0_rect'] = r0_rect.rotation_matrix  # Cameras are already rectified.
            kitti_transforms['Tr_velo_to_cam'] = np.hstack((velo_to_cam_rot, velo_to_cam_trans.reshape(3, 1)))
            kitti_transforms['Tr_imu_to_velo'] = imu_to_velo_kitti
            calib_path = os.path.join(calib_folder, sample_token + '.txt')
            with open(calib_path, "w") as calib_file:
                for (key, val) in kitti_transforms.items():
                    val = val.flatten()
                    val_str = '%.12e' % val[0]
                    for v in val[1:]:
                        val_str += ' %.12e' % v
                    calib_file.write('%s: %s\n' % (key, val_str))
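            # Each calib line thus has the form 'KEY: v0 v1 ...', the matrix flattened
            # row-major in scientific notation (12 values for the 3x4 transforms, 9 for R0_rect).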

            # Write label file.
            label_path = os.path.join(label_folder, sample_token + '.txt')
            if os.path.exists(label_path):
                print('Skipping existing file: %s' % label_path)
                continue
            else:
                print('Writing file: %s' % label_path)
            with open(label_path, "w") as label_file:
                for sample_annotation_token in sample_annotation_tokens:
                    sample_annotation = self.nusc.get('sample_annotation', sample_annotation_token)

                    # Get box in LIDAR frame.
                    _, box_lidar_nusc, _ = self.nusc.get_sample_data(lidar_token, box_vis_level=BoxVisibility.NONE,
                                                                     selected_anntokens=[sample_annotation_token])
                    box_lidar_nusc = box_lidar_nusc[0]

                    # Truncated: Set all objects to 0 which means untruncated.
                    truncated = 0.0

                    # Occluded: Hard-coded: Full visibility.
                    occluded = 0

                    # Convert nuScenes category to nuScenes detection challenge category.
                    detection_name = category_to_detection_name(sample_annotation['category_name'])

                    # Skip categories that are not part of the nuScenes detection challenge.
                    if detection_name is None:
                        continue

                    # Convert to KITTI 3d and 2d box and KITTI output format.
                    box_cam_kitti = KittiDB.box_nuscenes_to_kitti(
                        box_lidar_nusc, Quaternion(matrix=velo_to_cam_rot), velo_to_cam_trans, r0_rect)
                    box_cam_kitti.score = 0  # Set dummy score so we can use this file as result.
                    output = KittiDB.box_to_string(name=detection_name, box=box_cam_kitti, truncation=truncated,
                                                   occlusion=occluded)
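                    # For reference, a standard KITTI label row reads:
                    #   type truncated occluded alpha x1 y1 x2 y2 h w l x y z rotation_y [score]
                    # with the 2d box in pixels and dimensions/location in camera coordinates (meters).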

                    # Write to disk.
                    label_file.write(output + '\n')

    def render_kitti(self) -> None:
        """
        Renders the annotations in the KITTI dataset from a lidar and a camera view.
        """
        # Load the KITTI dataset.
        kitti = KittiDB(root=self.nusc_kitti_dir, splits=(self.split,))

        # Create output folder.
        render_dir = os.path.join(self.nusc_kitti_dir, 'render')
        if not os.path.isdir(render_dir):
            os.mkdir(render_dir)

        # Render each image.
        for token in kitti.tokens[:self.image_count]:
            for sensor in ['lidar', 'camera']:
                out_path = os.path.join(render_dir, '%s_%s.png' % (token, sensor))
                print('Rendering file to disk: %s' % out_path)
                kitti.render_sample_data(token, sensor_modality=sensor, out_path=out_path)
                plt.close()  # Close the windows to avoid a warning of too many open windows.

    def kitti_res_to_nuscenes(self, meta: Optional[Dict[str, bool]] = None) -> None:
        """
        Converts a KITTI detection result to the nuScenes detection results format.
        :param meta: Meta data describing the method used to generate the result. See nuscenes.org/object-detection.
        """
        # Dummy meta data, please adjust accordingly.
        if meta is None:
            meta = {
                'use_camera': False,
                'use_lidar': True,
                'use_radar': False,
                'use_map': False,
                'use_external': False,
            }

        # Init.
        results = {}

        # Load the KITTI dataset.
        kitti = KittiDB(root=self.nusc_kitti_dir, splits=(self.split,))

        # Get assignment of scenes to splits.
        split_logs = create_splits_logs(self.split, self.nusc)

        # Use only the samples from the current split.
        sample_tokens = self._split_to_samples(split_logs)
        sample_tokens = sample_tokens[:self.image_count]

        for sample_token in sample_tokens:
            # Get the KITTI boxes we just generated in LIDAR frame.
            kitti_token = '%s_%s' % (self.split, sample_token)
            boxes = kitti.get_boxes(token=kitti_token)

            # Convert KITTI boxes to nuScenes detection challenge result format.
            sample_results = [self._box_to_sample_result(sample_token, box) for box in boxes]

            # Store all results for this image.
            results[sample_token] = sample_results

        # Store submission file to disk.
        submission = {
            'meta': meta,
            'results': results
        }
        submission_path = os.path.join(self.nusc_kitti_dir, 'submission.json')
        print('Writing submission to: %s' % submission_path)
        with open(submission_path, 'w') as f:
            json.dump(submission, f, indent=2)

    def _box_to_sample_result(self, sample_token: str, box: Box, attribute_name: str = '') -> Dict[str, Any]:
        # Prepare data.
        translation = box.center
        size = box.wlh
        rotation = box.orientation.q
        velocity = box.velocity
        detection_name = box.name
        detection_score = box.score

        # Create result dict.
        sample_result = dict()
        sample_result['sample_token'] = sample_token
        sample_result['translation'] = translation.tolist()
        sample_result['size'] = size.tolist()
        sample_result['rotation'] = rotation.tolist()
        sample_result['velocity'] = velocity.tolist()[:2]  # Only need vx, vy.
        sample_result['detection_name'] = detection_name
        sample_result['detection_score'] = detection_score
        sample_result['attribute_name'] = attribute_name

        return sample_result

    def _split_to_samples(self, split_logs: List[str]) -> List[str]:
        """
        Convenience function to get the samples in a particular split.
        :param split_logs: A list of the log names in this split.
        :return: The list of samples.
        """
        samples = []
        for sample in self.nusc.sample:
            scene = self.nusc.get('scene', sample['scene_token'])
            log = self.nusc.get('log', scene['log_token'])
            logfile = log['logfile']
            if logfile in split_logs:
                samples.append(sample['token'])
        return samples


if __name__ == '__main__':
    fire.Fire(KittiConverter)
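
# Programmatic usage sketch (equivalent to the fire CLI calls in the module docstring;
# assumes the nuScenes mini split is installed at the default dataroot expected by NuScenes()):
#   converter = KittiConverter(nusc_kitti_dir='~/nusc_kitti', split='mini_train')
#   converter.nuscenes_gt_to_kitti()
#   converter.render_kitti()
#   converter.kitti_res_to_nuscenes()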