
CLAIM: Camera-LiDAR Alignment with Intensity and Monodepth

2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2025)

Zhuo Zhang, Yonghui Liu, Meijie Zhang, Feiyang Tan and Yikang Ding

Mach Drive

Paper PDF

Abstract

In this work, we unleash the potential of the powerful monodepth model in camera-LiDAR calibration and propose CLAIM, a novel method of aligning data from the camera and LiDAR. Given the initial guess and pairs of images and LiDAR point clouds, CLAIM utilizes a coarse-to-fine searching method to find the optimal transformation minimizing a patched Pearson correlation-based structure loss and a mutual information-based texture loss. These two losses serve as good metrics for camera-LiDAR alignment and, unlike most methods, require no complicated steps of data processing, feature extraction, or feature matching, rendering our method simple and adaptive to most scenes.

Contents

  • Quick Start
  • Test Your Dataset
  • Try with User Interface (Recommended)
  • Citations
  • Acknowledgement

โš ๏ธ Note: The extrinsics mentioned in this page all refer to . we can use to transform a point in LiDAR coordinate to camera coordinate.


🚀 Quick Start

👌 0. Prerequisites

๐Ÿ๏ธ 1. Clone & Create Environment

git clone https://github.com/Tompson11/claim.git
cd claim
conda create -n claim python=3.10 -y
conda activate claim

📦 2. Install dependencies

# install torch & torchvision (the torch build must match your CUDA version; here we take CUDA 11.8 as an example)
pip install torch==2.0.0 torchvision==0.15.1 --index-url https://download.pytorch.org/whl/cu118

# install requirements
pip install -r backend/requirements.txt

# install lidar_image_align
pip install -e backend --no-build-isolation

โฌ‡๏ธ 3. Download DepthAnything-V2 checkpoint

wget -O backend/model/Depth_Anything_V2/ckpt/depth_anything_v2_vitl.pth https://huggingface.co/depth-anything/Depth-Anything-V2-Large/resolve/main/depth_anything_v2_vitl.pth\?download\=true

๐Ÿƒโ€โ™‚๏ธ 4. Try Waymo & KITTI examples

python backend/api/calibrate_offline.py --config <CONFIG_FILE> --result_name results --vis_proj

We implement four calibration modes. Replace <CONFIG_FILE> with the corresponding config file from the table below:

| Mode | Comment | Waymo Config | KITTI Config |
| --- | --- | --- | --- |
| Default | We add 10$^{\circ}$ and $0.2m$ to the ground truth as the initial guess, and use the whole method in our paper to calibrate | backend/config/default_waymo.json | backend/config/default_kitti.json |
| Default (4 frames) | Same as Default, but uses 4 frames for a single calibration, i.e. CLAIM-4F* in our paper | backend/config/default_watmo_4frames.json | backend/config/default_kitti_4frames.json |
| Finetune Both | We add 1$^{\circ}$ and $0.1m$ to the ground truth as the initial guess, and use the random search to calibrate | backend/config/finetune_both_waymo.json | backend/config/finetune_both_kitti.json |
| Finetune Rotation | We add 1$^{\circ}$ to the ground truth as the initial guess, and use the grid search to calibrate rotation | backend/config/finetune_rotation_waymo.json | backend/config/finetune_rotation_kitti.json |
| Finetune Translation | We add $0.1m$ to the ground truth as the initial guess, and use the random search to calibrate translation | backend/config/finetune_translation_waymo.json | backend/config/finetune_translation_kitti.json |
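If you want to reproduce this kind of evaluation on your own data with known ground truth, the sketch below shows one way to perturb a ground-truth extrinsic by a fixed rotation/translation magnitude to build an initial guess. It is only an illustration (it assumes SciPy is available and is not code from this repository):

import numpy as np
from scipy.spatial.transform import Rotation as R

def perturb_extrinsic(quat_wxyz, trans, rot_err_deg=10.0, trans_err_m=0.2, seed=0):
    """Rotate a ground-truth extrinsic by rot_err_deg degrees about a random axis
    and shift its translation by trans_err_m meters in a random direction."""
    rng = np.random.default_rng(seed)
    axis = rng.normal(size=3)
    axis /= np.linalg.norm(axis)
    direction = rng.normal(size=3)
    direction /= np.linalg.norm(direction)
    qw, qx, qy, qz = quat_wxyz
    r_gt = R.from_quat([qx, qy, qz, qw])  # SciPy expects [x, y, z, w] order
    r_init = R.from_rotvec(np.deg2rad(rot_err_deg) * axis) * r_gt
    x, y, z, w = r_init.as_quat()
    return [w, x, y, z], (np.asarray(trans) + trans_err_m * direction).tolist()

quat_init, trans_init = perturb_extrinsic([1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0])
print(quat_init, trans_init)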

The final results will be saved at example_dataset/Waymo/results or example_dataset/KITTI/results with the following directory structure:

results
├── analyzed_results.json
├── analyzed_results.png
├── results.json
├── proj
│   ├── 00000.jpg
│   ├── 00010.jpg
│   ├── 00021.jpg
│   ...

where each file contains the following:

  • results.json : results of each calibration, recording the frame_id, the initial guess and the final result of each calibration.
  • analyzed_results.json : statistics of results.json, including the mean, median, mode and quantile values. Usually, the mean or median value is recommended as the final calibration result.
  • analyzed_results.png : visualization of results.json and analyzed_results.json. It shows the histogram of each extrinsic component and the corresponding statistics.
  • proj/XXXXX.jpg : projection image of each calibration result. It shows the LiDAR depth projection (top) and LiDAR intensity projection (bottom) for both the initial guess and the final calibration result.
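If you want to consume these outputs programmatically, the snippet below is a small sketch (not part of the repository) that loads them; it avoids assuming exact field names, so inspect the printed keys and adapt it to the actual schema of your results.json and analyzed_results.json:

import json
from pathlib import Path

result_dir = Path("example_dataset/Waymo/results")  # adjust to your result directory

# Per-calibration results: frame ids, initial guesses and final extrinsics.
results = json.loads((result_dir / "results.json").read_text())

# Aggregated statistics; the mean or median is usually a good final extrinsic.
stats = json.loads((result_dir / "analyzed_results.json").read_text())

print("results.json entries:", len(results) if isinstance(results, (list, dict)) else results)
print("analyzed_results.json keys:", list(stats) if isinstance(stats, dict) else stats)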

😎 Test Your Dataset

💿 1. Prepare dataset

Organize your dataset like the provided example dataset in example_dataset/KITTI and example_dataset/Waymo. Your dataset should follow the directory structure:

YOUR_DATASET/
├── img/
│   ├── 00000.jpg
│   └── 00001.jpg
│   ...
├── pcd/
│   ├── 00000.pcd
│   └── 00001.pcd
│   ...
  • Place your images under img. The image format should be either ".jpg" or ".png".
  • Place your point clouds under pcd. The point cloud format should be either ".pcd" or ".ply".
  • The image and the point cloud with the same name are considered to be a time-synchronized pair and will be used for calibration.
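Before running, you can sanity-check the layout with a small optional helper like the one below (not part of the repository); it verifies that every image has a point cloud with the same name and a supported extension:

from pathlib import Path

dataset = Path("YOUR_DATASET")  # adjust to your dataset root
img_exts, pcd_exts = {".jpg", ".png"}, {".pcd", ".ply"}

imgs = {p.stem for p in (dataset / "img").iterdir() if p.suffix.lower() in img_exts}
pcds = {p.stem for p in (dataset / "pcd").iterdir() if p.suffix.lower() in pcd_exts}

# Frames that cannot form a time-synchronized (image, point cloud) pair.
print("images without a point cloud:", sorted(imgs - pcds))
print("point clouds without an image:", sorted(pcds - imgs))
print("usable pairs:", len(imgs & pcds))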

🗒️ 2. Prepare configuration file

Create your own config file. You can refer to the example config that matches your calibration situation.

Here, we introduce each parameter in detail:

| Parameter | Meaning |
| --- | --- |
| base_dir | root path of the dataset |
| frame_nums_per_batch | number of frames used for a single calibration |
| overlap_nums_between_batch | number of overlapping frames between two calibrations. e.g. if the dataset has 5 frames, frame_nums_per_batch=3 and overlap_nums_between_batch=1, the first calibration uses frames 0, 1, 2 and the second uses frames 2, 3, 4 |
| data_params | parameters for data pre-processing |
| ├── mono_depth_model | model used to estimate monocular depth; currently only "depth_anything_v2" is available |
| ├── half_resolution | whether to resize the original image to half of its resolution. "true" is recommended if your image is larger than 1080p, to save GPU memory and accelerate the calibration |
| ├── points_down_sample_step | step used to downsample the point cloud. A large step is necessary if the point cloud is very large, e.g. set points_down_sample_step to 2 if there are 2e6 points |
| ├── intensity_equalization | whether to perform equalization on the point cloud intensity. "true" is recommended |
| ├── gray_image_equalization | whether to perform histogram equalization on the grayscale image. "true" is recommended |
| └── shuffle | whether to shuffle the dataset. This matters if you use multiple frames for a single calibration |
| pipeline_params | parameters for the calibration pipeline |
| ├── mode | calibration mode. 0: default; 1: finetune both; 2: finetune rotation; 3: finetune translation |
| ├── patch_size | patch size used to divide the image for the structure loss. A value such that the image width divided by patch_size falls within [20, 40] is usually fine |
| ├── init_rot_range | rotation search range for the initial grid search (unit: degree). Set it to the estimated rotation error of the initial guess. (Only used when mode=0.) |
| ├── init_rot_resolution | rotation search resolution for the initial grid search (unit: degree). Choose a feasible value for the given init_rot_range, e.g. init_rot_range=10 with init_rot_resolution=1, or init_rot_range=5 with init_rot_resolution=0.5. (Only used when mode=0.) |
| ├── coarse_trans_range | translation search range for the coarse random search (unit: meter). Set it to the estimated translation error of the initial guess. (Only used when mode=0.) |
| ├── coarse_iters | number of iterations for the coarse random search. (Only used when mode=0.) |
| ├── search_mode | search mode for finetuning. 0: random search; 1: grid search. (Used when mode=2,3.) |
| ├── fine_trans_range | translation search range for the fine random search (unit: meter). (Used when mode=0,1,3.) |
| ├── fine_rot_range | rotation search range for the fine random search (unit: degree). (Used when mode=0,1.) |
| ├── fine_iters | number of iterations for the fine random search. (Only used when search_mode=0.) |
| ├── fine_trans_resolution | translation search resolution for the finetuning grid search (unit: meter). (Only used when mode=3 & search_mode=1.) |
| └── fine_rot_resolution | rotation search resolution for the finetuning grid search (unit: degree). (Only used when mode=2 & search_mode=1.) |
| intrinsics | camera intrinsic parameters |
| ├── K | intrinsics of the camera. The format must be [fx, fy, cx, cy] |
| └── D | distortion coefficients of the camera. The format must be [k1, k2, p1, p2, k3] for pinhole or [k1, k2, k3, k4] for fisheye |
| extrinsics | initial guess of the extrinsics |
| ├── translation | translation of the initial guess (unit: meter) |
| └── rotation | rotation of the initial guess. It must be in quaternion format: [qw, qx, qy, qz] |
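To make the structure concrete, here is a sketch of a config assembled from Python. The nesting and field names simply mirror the table above and are not guaranteed to match the actual schema, so compare the output against the example configs in backend/config/ before using it:

import json

# Illustrative config mirroring the parameters described above.
# Field names and nesting are assumptions; verify against backend/config/*.json.
config = {
    "base_dir": "YOUR_DATASET",
    "frame_nums_per_batch": 1,
    "overlap_nums_between_batch": 0,
    "data_params": {
        "mono_depth_model": "depth_anything_v2",
        "half_resolution": False,
        "points_down_sample_step": 1,
        "intensity_equalization": True,
        "gray_image_equalization": True,
        "shuffle": False,
    },
    "pipeline_params": {
        "mode": 0,                  # default mode
        "patch_size": 64,
        "init_rot_range": 10,       # degrees
        "init_rot_resolution": 1,   # degrees
        "coarse_trans_range": 0.2,  # meters
        "coarse_iters": 1000,
        "fine_trans_range": 0.1,    # meters
        "fine_rot_range": 1,        # degrees
        "fine_iters": 1000,
    },
    "intrinsics": {
        "K": [1000.0, 1000.0, 960.0, 540.0],  # placeholder [fx, fy, cx, cy]
        "D": [0.0, 0.0, 0.0, 0.0, 0.0],       # pinhole [k1, k2, p1, p2, k3]
    },
    "extrinsics": {
        "translation": [0.0, 0.0, 0.0],       # meters, initial guess
        "rotation": [1.0, 0.0, 0.0, 0.0],     # [qw, qx, qy, qz], initial guess
    },
}

with open("my_config.json", "w") as f:
    json.dump(config, f, indent=2)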

๐Ÿƒโ€โ™‚๏ธ 3. Run

python backend/api/calibrate_offline.py \
--config <CONFIG_FILE> \
--seed <SEED> \
--result_name <RESULT_NAME> \
--vis_proj

The meanings of the parameters are:

  • config: Path to the config file.
  • seed: Random seed. You can set a non-negative seed ($0$ ~ $2^{32}-1$) for reproducible results, or a negative seed for diverse results. The default seed is used if the parameter is omitted.
  • result_name: Name of the result directory. The final result will be saved at <YOUR_DATASET>/<RESULT_NAME>. The default value is "result".
  • vis_proj: Whether to save the visualization results.

You can find the results at <YOUR_DATASET>/<RESULT_NAME>; the content of each file is the same as described for the Waymo & KITTI examples above.


🪟 Try with User Interface (Recommended)

📦 1. Install dependencies

# install requirements
pip install -r frontend/claim_frontend/requirements.txt

👾 2. Run Django server

cd frontend/claim_frontend
python3 manage.py runserver 0.0.0.0:8080

📭 3. Set up port forwarding (if necessary)

If you deploy CLAIM on a headless server (no display), you need a local computer to show the user interface. Set up port forwarding by running the following command on your local computer, where <HOSTNAME_OF_SERVER> is the hostname of your server.

ssh -N -L 8080:127.0.0.1:8080 <HOSTNAME_OF_SERVER>

🖥️ 4. Open the User Interface

Open a browser on your local computer and go to http://127.0.0.1:8080/index/; you will see the user interface page.

🎮 5. Explore the User Interface

The common usage is:

  • step1 : Upload images and point clouds. Their numbers must be equal, and corresponding pairs must have the identical index. The supported image formats are "jpg" and "png"; the supported point cloud formats are "ply" and "pcd".
  • step2 : Set configuration parameters. Fill in the extrinsics (i.e. the initial guess), intrinsics and pipeline parameters according to your situation. The meaning of these parameters is described in the parameter table above.
  • step3 : Click the Submit and Calibrate! button and wait for the calibration. When the calibration completes, a notice pops up at the top-right of the window and the calibrated extrinsics replace the initial guess. If you want to retrieve your initial guess or see previous calibration results, click the History Results button.
  • step4 : View the projection results. Click the Project button to project LiDAR points onto the image with the current extrinsics (see the sketch after this list for what this projection computes). Switch the depth/intensity button to change the color attribute. You can also see the colored point cloud in the black window.
  • step5 : Export the results. Click the Export button to download the current extrinsics as a JSON file.
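For reference, the projection in step 4 boils down to transforming each LiDAR point into the camera frame and applying the pinhole intrinsics [fx, fy, cx, cy]. The sketch below illustrates this for the undistorted case (distortion handling omitted); it is not the repository's implementation:

import numpy as np

def project_points(points_lidar, R, t, fx, fy, cx, cy):
    """Project Nx3 LiDAR points to pixels with extrinsics (R, t) and pinhole
    intrinsics [fx, fy, cx, cy]; returns pixel coordinates and a validity mask."""
    p_cam = points_lidar @ R.T + t       # LiDAR frame -> camera frame
    valid = p_cam[:, 2] > 1e-6           # keep points in front of the camera
    x, y, z = p_cam[:, 0], p_cam[:, 1], p_cam[:, 2]
    u = fx * x / z + cx
    v = fy * y / z + cy
    return np.stack([u, v], axis=1), valid

# Example with identity extrinsics and a generic 1920x1080 camera.
pts = np.array([[0.5, 0.2, 5.0], [-1.0, 0.1, 10.0]])
pixels, mask = project_points(pts, np.eye(3), np.zeros(3), 1000.0, 1000.0, 960.0, 540.0)
print(pixels[mask])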

Some tips:

  • Try Examples : Click the Try Examples! button and select the provided KITTI/Waymo frames. The extrinsics and intrinsics will be filled with the ground truth, and you can add some perturbation to the extrinsics to test the method.
  • Tune Extrinsics : You can also use the user interface as a platform for manual calibration by repeatedly tuning the extrinsics and viewing the projection results. You can zoom in/out with the mouse wheel when the cursor hovers over the projection picture, and you can drag the point cloud for better visualization.

📝 Citations

If you find CLAIM useful in your research or projects, please cite our work:

@INPROCEEDINGS{11247484,
  author={Zhang, Zhuo and Liu, Yonghui and Zhang, Meijie and Tan, Feiyang and Ding, Yikang},
  booktitle={2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)}, 
  title={CLAIM: Camera-LiDAR Alignment with Intensity and Monodepth}, 
  year={2025},
  volume={},
  number={},
  pages={17921-17926},
  doi={10.1109/IROS60139.2025.11247484}}

🙏 Acknowledgement
