Research on a temporal-aware, multimodal retrieval-augmented system for Geospatial Question Answering

Ethan830/GeoLens

GeoLens: A Temporal-aware, Multimodal Retrieval-Augmented System for Geospatial Question Answering

Authors: Mingyang Li, Henry Ning, Ethan Yang

Emory University Department of Computer Science, Atlanta, Georgia, USA

Abstract

Geospatial Question Answering (QA) and tourism recommendation matter because they support location-based decision making. Recent Retrieval-Augmented Generation (RAG) approaches have achieved remarkable results in geographic reasoning. However, current methods still struggle to handle multimodal signals while balancing explicit spatial and temporal constraints. This paper presents GeoLens, a multimodal RAG framework that uses hybrid retrieval and multi-objective optimization to address this challenge. With a lightweight LLM backbone, GeoLens outperforms recent Large Language Models (LLMs), beating the best LLM baseline by 52.3% in Precision@1 on TourismQA-Miami, and is competitive with previous specialized systems, exceeding GeoLLM by 49.5% in Recall@10 on TourismQA-NYC. To the best of our knowledge, this is the first work to integrate multimodal signals into geospatial RAG. The framework is suited to real-world geospatial recommendation applications at scalable cost.
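For readers unfamiliar with the two metrics named in the abstract, the following is a minimal sketch of the conventional definitions of Precision@k and Recall@k for a ranked list of recommendations. It is not taken from the GeoLens code: the function names, the point-of-interest identifiers, and the example ranking are all hypothetical, and the abstract does not say whether the authors compute the metrics in exactly this way.

```python
from typing import Sequence, Set


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k ranked items that are relevant (conventional definition)."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / k


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of all relevant items that appear in the top-k ranked items."""
    if k <= 0:
        raise ValueError("k must be positive")
    if not relevant:
        return 0.0  # assumption: define recall as 0 when there is nothing to find
    hits = len(set(ranked[:k]) & relevant)
    return hits / len(relevant)


# Hypothetical ranking returned for one question, and its gold-standard answers.
ranked_pois = ["poi_a", "poi_b", "poi_c", "poi_d"]
gold_pois = {"poi_b", "poi_d", "poi_z"}

p_at_1 = precision_at_k(ranked_pois, gold_pois, k=1)   # top result is not in the gold set
p_at_2 = precision_at_k(ranked_pois, gold_pois, k=2)   # one of the top two is in the gold set
r_at_10 = recall_at_k(ranked_pois, gold_pois, k=10)    # two of three gold items retrieved
```

Precision@1 only rewards placing a correct answer first, while Recall@10 rewards covering as many correct answers as possible in the top ten, so the two figures in the abstract measure different aspects of ranking quality.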
