Open Neural Network Exchange (ONNX) compatible implementation of LightGlue: Local Feature Matching at Light Speed. The ONNX model format allows for interoperability across different platforms with support for multiple execution providers, and removes Python-specific dependencies such as PyTorch.
- 4 July 2023: Add inference time comparisons.
- 1 July 2023: Add support for extractor max_num_keypoints.
- 30 June 2023: Add support for DISK extractor.
- 28 June 2023: Add end-to-end SuperPoint+LightGlue export & inference pipeline.
Prior to exporting the ONNX models, please install the requirements of the original LightGlue repository. (Flash Attention does not need to be installed.)
To convert the DISK or SuperPoint and LightGlue models to ONNX, run export.py. We provide two types of ONNX exports: individual standalone models, and a combined end-to-end pipeline (recommended for convenience) with the --end2end flag.
python export.py \
--img_size 512 \
--extractor_type superpoint \
--extractor_path weights/superpoint.onnx \
--lightglue_path weights/superpoint_lightglue.onnx \
--dynamic
- Exporting individually can be useful when intermediate outputs can be cached or precomputed. On the other hand, the end-to-end pipeline can be more convenient.
- Although dynamic axes have been specified, it is recommended to export your own ONNX model with the appropriate input image sizes of your use case.
If you would like to try out inference right away, you can download ONNX models that have already been exported here.
With ONNX models in hand, you can perform inference in Python using ONNX Runtime (see requirements-onnx.txt).
The LightGlue inference pipeline has been encapsulated into a runner class:
from onnx_runner import LightGlueRunner, load_image, rgb_to_grayscale
image0, scales0 = load_image("assets/sacre_coeur1.jpg", resize=512)
image1, scales1 = load_image("assets/sacre_coeur2.jpg", resize=512)
image0 = rgb_to_grayscale(image0) # only needed for SuperPoint
image1 = rgb_to_grayscale(image1) # only needed for SuperPoint
# Create ONNXRuntime runner
runner = LightGlueRunner(
    extractor_path="weights/superpoint.onnx",
    lightglue_path="weights/superpoint_lightglue.onnx",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)
# Run inference
m_kpts0, m_kpts1 = runner.run(image0, image1, scales0, scales1)
Note that the output keypoints have already been rescaled back to the original image sizes.
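The rescaling mentioned above can be sketched as follows. This is a minimal illustration rather than the code in onnx_runner: it assumes that scales holds the per-axis (x, y) ratio of resized size to original size and that keypoints use pixel-center coordinates, hence the 0.5 offsets. The function name rescale_keypoints is hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def rescale_keypoints(kpts: List[Point], scales: Point) -> List[Point]:
    """Map keypoints from the resized image back to the original image.

    Assumption: scales = (resized_w / original_w, resized_h / original_h),
    and coordinates refer to pixel centers.
    """
    sx, sy = scales
    return [((x + 0.5) / sx - 0.5, (y + 0.5) / sy - 0.5) for x, y in kpts]


# A 1024x768 image resized to 512x384 has scales (0.5, 0.5).
original_kpts = rescale_keypoints([(0.0, 0.0), (255.5, 191.5)], (0.5, 0.5))
print(original_kpts)  # [(0.5, 0.5), (511.5, 383.5)]
```

Because runner.run already applies this step, a function like this is only needed when working with the raw model outputs.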
Alternatively, you can also run infer.py.
python infer.py \
--img_paths assets/DSC_0410.JPG assets/DSC_0411.JPG \
--img_size 512 \
--lightglue_path weights/superpoint_lightglue.onnx \
--extractor_type superpoint \
--extractor_path weights/superpoint.onnx \
--viz
In general, for smaller numbers of keypoints the ONNX version performs similarly to the PyTorch implementation. However, as the number of keypoints increases, the PyTorch CUDA implementation is faster, whereas ONNX is faster overall for CPU inference. See EVALUATION.md for technical details.
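To reproduce such a comparison on your own hardware, a timing loop along the following lines is one option. It is a generic sketch, not the benchmark used in EVALUATION.md; the helper name time_inference and the warm-up and repeat counts are assumptions. Warm-up runs are excluded because the first calls usually include one-off initialization costs.

```python
import statistics
import time
from typing import Callable


def time_inference(fn: Callable[[], object], warmup: int = 3, repeats: int = 10) -> float:
    """Return the median wall-clock time of fn() in milliseconds."""
    # Warm-up calls are run but not measured.
    for _ in range(warmup):
        fn()
    timings_ms = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings_ms.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings_ms)


# Stand-in workload. In practice, pass something like
# lambda: runner.run(image0, image1, scales0, scales1).
elapsed_ms = time_inference(lambda: sum(range(10_000)))
print(f"median: {elapsed_ms:.3f} ms")
```

When timing PyTorch on CUDA, keep in mind that kernels are launched asynchronously, so the measured function should synchronize the device before returning.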
As the ONNX Runtime has limited support for features like dynamic control flow, certain configurations of the models cannot be exported to ONNX easily. These caveats are outlined below.
- Only batch size 1 is currently supported. This limitation stems from the fact that different images in the same batch can have varying numbers of keypoints, leading to non-uniform (a.k.a. ragged) tensors.
- Since dynamic control flow has limited support in ONNX tracing, by extension, early stopping and adaptive point pruning (the depth_confidence and width_confidence parameters) are also difficult to export to ONNX.
- Flash Attention is turned off.
- Mixed precision is turned off.
- Note that the end-to-end version, despite its name, still requires the postprocessing (filtering valid matches) function outside the ONNX model since the scales variables need to be passed.
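The match-filtering part of that postprocessing step might look like the sketch below. The output layout is an assumption made for illustration: matches0[i] is taken to be the index in kpts1 matched to kpts0[i], or -1 when the keypoint is unmatched. The function name filter_valid_matches is hypothetical and plain Python lists stand in for the model's output arrays.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def filter_valid_matches(
    kpts0: Sequence[Point],
    kpts1: Sequence[Point],
    matches0: Sequence[int],
) -> Tuple[List[Point], List[Point]]:
    """Keep only the keypoint pairs that were actually matched.

    Assumption: matches0[i] indexes into kpts1, with -1 meaning "no match".
    """
    m_kpts0: List[Point] = []
    m_kpts1: List[Point] = []
    for i, j in enumerate(matches0):
        if j > -1:
            m_kpts0.append(kpts0[i])
            m_kpts1.append(kpts1[j])
    return m_kpts0, m_kpts1


kpts0 = [(10.0, 20.0), (30.0, 40.0), (50.0, 60.0)]
kpts1 = [(11.0, 21.0), (52.0, 61.0)]
matches0 = [0, -1, 1]  # the second keypoint of image 0 has no match
print(filter_valid_matches(kpts0, kpts1, matches0))
# ([(10.0, 20.0), (50.0, 60.0)], [(11.0, 21.0), (52.0, 61.0)])
```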
Additionally, the outputs of the ONNX models differ slightly from the original PyTorch models (by a small error on the order of 1e-6 to 1e-5 for the scores/descriptors). Although the cause is still unclear, this could be due to differing implementations or modified dtypes.
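One way to check whether your own export stays within this range is to compare the two sets of outputs against an absolute tolerance. The sketch below uses plain Python lists and synthetic numbers chosen purely for illustration; the helper names max_abs_diff and outputs_match are hypothetical, and with NumPy arrays numpy.allclose would serve the same purpose.

```python
from typing import Sequence


def max_abs_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Largest element-wise absolute difference between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


def outputs_match(a: Sequence[float], b: Sequence[float], atol: float = 1e-5) -> bool:
    """True if every element differs by at most atol."""
    return max_abs_diff(a, b) <= atol


# Synthetic values for illustration only, not measured model outputs.
pytorch_scores = [0.91234, 0.10457, 0.55001]
onnx_scores = [0.912345, 0.104568, 0.550013]
print(outputs_match(pytorch_scores, onnx_scores, atol=1e-5))  # True
print(outputs_match(pytorch_scores, onnx_scores, atol=1e-6))  # False
```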
- Support for TensorRT: Appears to be currently blocked by unsupported Einstein summation operations (torch.einsum()) in TensorRT (thanks to Shidqiet's investigation).
- Support for batch size > 1: Blocked by the fact that different images can have varying numbers of keypoints. Perhaps max-length padding?
- Support for dynamic control flow: Investigating script-mode ONNX export instead of trace-mode.
- Mixed-precision Support
- Quantization Support
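The max-length padding idea floated above for batch sizes greater than 1 could be sketched as follows. This only illustrates the general technique and is not something the repository implements: the function name pad_keypoints, the padding value, and the boolean mask convention are all assumptions.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def pad_keypoints(
    batch: Sequence[Sequence[Point]],
    pad_value: Point = (0.0, 0.0),
) -> Tuple[List[List[Point]], List[List[bool]]]:
    """Pad every keypoint list to the longest one in the batch.

    Returns the padded lists and a mask that is True for real keypoints
    and False for padding.
    """
    max_len = max((len(kpts) for kpts in batch), default=0)
    padded: List[List[Point]] = []
    mask: List[List[bool]] = []
    for kpts in batch:
        n_pad = max_len - len(kpts)
        padded.append(list(kpts) + [pad_value] * n_pad)
        mask.append([True] * len(kpts) + [False] * n_pad)
    return padded, mask


batch = [[(1.0, 2.0), (3.0, 4.0), (5.0, 6.0)], [(7.0, 8.0)]]
padded, mask = pad_keypoints(batch)
print(padded[1])  # [(7.0, 8.0), (0.0, 0.0), (0.0, 0.0)]
print(mask[1])    # [True, False, False]
```

Padding alone yields uniform tensors. The matcher would also have to use the mask so that padded entries are ignored in attention and never returned as matches.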
If you use any ideas from the papers or code in this repo, please consider citing the authors of LightGlue, SuperPoint, and DISK. Lastly, if the ONNX versions helped you in any way, please also consider starring this repository.
@inproceedings{lindenberger23lightglue,
author = {Philipp Lindenberger and
Paul-Edouard Sarlin and
Marc Pollefeys},
title = {{LightGlue}: Local Feature Matching at Light Speed},
booktitle = {ArXiv PrePrint},
year = {2023}
}
@article{DBLP:journals/corr/abs-1712-07629,
author = {Daniel DeTone and
Tomasz Malisiewicz and
Andrew Rabinovich},
title = {SuperPoint: Self-Supervised Interest Point Detection and Description},
journal = {CoRR},
volume = {abs/1712.07629},
year = {2017},
url = {http://arxiv.org/abs/1712.07629},
eprinttype = {arXiv},
eprint = {1712.07629},
timestamp = {Mon, 13 Aug 2018 16:47:29 +0200},
biburl = {https://dblp.org/rec/journals/corr/abs-1712-07629.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
@article{DBLP:journals/corr/abs-2006-13566,
author = {Michal J. Tyszkiewicz and
Pascal Fua and
Eduard Trulls},
title = {{DISK:} Learning local features with policy gradient},
journal = {CoRR},
volume = {abs/2006.13566},
year = {2020},
url = {https://arxiv.org/abs/2006.13566},
eprinttype = {arXiv},
eprint = {2006.13566},
timestamp = {Wed, 01 Jul 2020 15:21:23 +0200},
biburl = {https://dblp.org/rec/journals/corr/abs-2006-13566.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}