Collection of Remote Sensing Vision-Language models and papers
To add your work to this repo, submit a request or contact me at zilun.zhang@zju.edu.cn.
-
EarthVQA: Towards Queryable Earth via Relational Reasoning-Based Remote Sensing Visual Question Answering (2023.12) [pdf]
- Junjue Wang, Zhuo Zheng, Zihang Chen, Ailong Ma, and Yanfei Zhong
-
A Prior Instruction Representation Framework for Remote Sensing Image-text Retrieval (2023.10) [pdf]
- Jiancheng Pan, Qing Ma, Cong Bai
-
A Fine-Grained Semantic Alignment Method Specific to Aggregate Multi-Scale Information for Cross-Modal Remote Sensing Image Retrieval (2023.10) [pdf]
- Fuzhong Zheng, Xu Wang, Luyao Wang, Xiong Zhang, Hongze Zhu, Long Wang and Haisu Zhang
-
Multilanguage Transformer for Improved Text to Remote Sensing Image Retrieval (2023.10) [pdf]
- Mohamad M. Al Rahhal, Yakoub Bazi, Norah A. Alsharif, Laila Bashmal, Naif Alajlan, Farid Melgani
-
A Fusion Encoder with Multi-Task Guidance for Cross-Modal Text–Image Retrieval in Remote Sensing (2023.09) [pdf]
- Xiong Zhang, Weipeng Li, Xu Wang, Luyao Wang, Fuzhong Zheng, Long Wang and Haisu Zhang
-
Parameter-Efficient Transfer Learning for Remote Sensing Image-Text Retrieval (2023.09) [pdf]
- Yuan Yuan, Yang Zhan, Zhitong Xiong
-
Hypersphere-based remote sensing cross-modal text–image retrieval via curriculum learning (2023.09) [pdf]
- Weihang Zhang, Jihao Li, Shuoke Li, Jialiang Chen, Wenkai Zhang, Xin Gao, Xian Sun
-
RS5M: A Large Scale Vision-Language Dataset for Remote Sensing Vision-Language Foundation Model (2023.06) [pdf]
- Zilun Zhang, Tiancheng Zhao, Yulong Guo, Jianwei Yin
-
RemoteCLIP: A Vision Language Foundation Model for Remote Sensing (2023.06) [pdf]
- Fan Liu, Delong Chen, Zhangqingyun Guan, Xiaocong Zhou, Jiale Zhu, Jun Zhou
-
Reducing Semantic Confusion: Scene-aware Aggregation Network for Remote Sensing Cross-modal Retrieval (2023.06) [pdf]
- Jiancheng Pan, Qing Ma, Cong Bai
-
Vision-Language Models in Remote Sensing: Current Progress and Future Trends (2023.05) [pdf]
- Congcong Wen, Yuan Hu, Xiang Li, Zhenghang Yuan, Xiao Xiang Zhu
-
MCRN: A Multi-source Cross-modal Retrieval Network for remote sensing (2022.12) [pdf]
- Zhiqiang Yuan, Wenkai Zhang, Changyuan Tian, Yongqiang Mao, Ruixue Zhou, Hongqi Wang, Kun Fu, Xian Sun
-
RSVG: Exploring Data and Models for Visual Grounding on Remote Sensing Data (2022.10) [pdf]
- Yang Zhan, Zhitong Xiong, Yuan Yuan
-
Learning to Evaluate Performance of Multi-modal Semantic Localization (2022.09) [pdf]
- Zhiqiang Yuan, Wenkai Zhang, Chongyang Li, Zhaoying Pan, Yongqiang Mao, Jialiang Chen, Shouke Li, Hongqi Wang, Xian Sun
-
Knowledge-Aware Cross-Modal Text-Image Retrieval for Remote Sensing Images (2022.09) [pdf]
- Li Mi, Siran Li, Christel Chappuis, Devis Tuia
-
CLIP-RS: A Cross-modal Remote Sensing Image Retrieval Based on CLIP, a Northern Virginia Case Study (2022.05) [pdf]
- Larissa Djoufack Basso
-
Exploring a Fine-Grained Multiscale Method for Cross-Modal Remote Sensing Image Retrieval (2022.04) [pdf]
- Zhiqiang Yuan, Wenkai Zhang, Kun Fu, Xuan Li, Chubo Deng, Hongqi Wang, Xian Sun
-
Remote Sensing Cross-Modal Text-Image Retrieval Based on Global and Local Information (2022.04) [pdf]
- Zhiqiang Yuan, Wenkai Zhang, Changyuan Tian, Xuee Rong, Zhengyuan Zhang, Hongqi Wang, Kun Fu, Xian Sun
-
Fine tuning CLIP with Remote Sensing (Satellite) images and captions (2021.10) [pdf]
- Arto, Dev Vidhani, Goutham, Mayank Bhaskar, Sujit Pal