This dataset contains 221 aligned visible (Vis) and infrared (IR) image pairs covering rich scenes such as roads, vehicles, and pedestrians. The images are highly representative scenes selected from the FLIR video. We preprocess the background thermal noise in the original IR images, accurately align the Vis and IR image pairs, and crop the exactly registered regions to form this dataset.
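Because the images come as aligned pairs, a loader usually needs to match each Vis image with its IR counterpart. The following is a minimal sketch of one way to do that, not part of the dataset itself. It assumes the two modalities sit in separate folders and that paired images share the same file name apart from the extension. The folder names `vis` and `ir`, the accepted extensions, and the helper names are hypothetical, so adjust them to your local copy.

```python
from pathlib import Path

# Hypothetical folder names; change them to match your local copy of the dataset.
VIS_DIR = Path("vis")
IR_DIR = Path("ir")

# Assumed set of image extensions; extend it if your copy uses another format.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def index_by_stem(folder):
    """Map each image's stem (file name without extension) to its path."""
    return {
        p.stem: p
        for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    }


def pair_images(vis_dir, ir_dir):
    """Return (vis_path, ir_path) tuples sorted by stem.

    Raises ValueError if any image lacks a partner in the other folder,
    since every image in an aligned dataset is expected to have one.
    """
    vis = index_by_stem(vis_dir)
    ir = index_by_stem(ir_dir)
    unmatched = sorted(set(vis) ^ set(ir))
    if unmatched:
        raise ValueError(f"Images without a partner: {unmatched}")
    return [(vis[stem], ir[stem]) for stem in sorted(vis)]
```

Under these assumptions, `len(pair_images(VIS_DIR, IR_DIR))` should equal 221 on a complete copy of the dataset. A different count suggests missing files or a different folder layout.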
Some typical examples are shown below (left column: Vis images; right column: IR images):
If this dataset is helpful to you, please cite it as:
@inproceedings{xu2020aaai,
title={FusionDN: A Unified Densely Connected Network for Image Fusion},
author={Xu, Han and Ma, Jiayi and Le, Zhuliang and Jiang, Junjun and Guo, Xiaojie},
booktitle={Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence},
year={2020}
}
or
@article{xu2020u2fusion,
title={U2Fusion: A Unified Unsupervised Image Fusion Network},
author={Xu, Han and Ma, Jiayi and Jiang, Junjun and Guo, Xiaojie and Ling, Haibin},
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
year={2020},
publisher={IEEE}
}