This module provides end-to-end processing of urban Points of Interest (POIs) for use in an LLM-assisted landmark exploration and riddle-generation system. It is divided into two independent submodules with clearly defined responsibilities: a landmark preprocessor that extracts POIs from OpenStreetMap, and a metadata generator that enriches them using external sources and LLMs. The module relies on the following technologies:
- OpenStreetMap (OSM) and Overpass API: Used for extracting geospatial Points of Interest (POI) data.
- MongoDB: Used for storing processed landmark data.
- Java Spring Boot: Used for receiving and processing requests from the frontend.
- Flask: Used for handling requests in Python microservices.
- Nominatim: Used for reverse geocoding.
- Wikipedia API: Used for retrieving textual information about landmarks.
- GPT-4 or local models: Used for processing and summarizing data obtained from external sources.
- Python Libraries:
  - requests: Used for HTTP requests.
  - pymongo: Used for interacting with MongoDB.
  - flask: Used for building Python microservices.
This module uses environment variables to configure its connection to external services. The following variables should be defined in a `.env` file at the root of the project:

- `MONGO_URL`: The URL for connecting to the MongoDB instance. Default is `mongodb://localhost:27017`.
- `MONGO_DB`: The name of the MongoDB database to use. Default is `scavengerhunt`.
- `OPENAI_API_KEY`: The API key for accessing OpenAI services. This should be set to your actual OpenAI API key.
To set up the environment, create a .env file in the root directory of the project and add the above variables with your specific values. The application will automatically load these configurations at runtime.
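The snippet below is a minimal sketch of how these variables might be loaded at startup, assuming the `python-dotenv` and `pymongo` packages; the module's actual bootstrap code may differ.

```python
import os

from dotenv import load_dotenv
from pymongo import MongoClient

# Load variables from the .env file in the project root.
load_dotenv()

# Fall back to the documented defaults when a variable is not set.
MONGO_URL = os.getenv("MONGO_URL", "mongodb://localhost:27017")
MONGO_DB = os.getenv("MONGO_DB", "scavengerhunt")
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")  # no default; must be provided

client = MongoClient(MONGO_URL)
db = client[MONGO_DB]

if OPENAI_API_KEY is None:
    raise RuntimeError("OPENAI_API_KEY is not set; LLM-based enrichment will fail.")
```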
The module does not directly expose any API endpoints. However, it interacts with several external APIs and services:
- Overpass API: Used to query OpenStreetMap for geospatial data.
- Wikipedia API: Accessed to retrieve textual information about landmarks.
- OpenAI API: Utilized for processing and summarizing data using LLMs.
Common issues and checks:

- Environment Variables Not Loaded: Ensure that the `.env` file is correctly placed in the root directory and contains all necessary variables. Use `load_dotenv()` to load these variables at runtime.
- API Key Issues: Verify that the `OPENAI_API_KEY` is correctly set in the environment variables. If the key is missing or incorrect, the module will not be able to access OpenAI services.
- Database Connection Errors: Check the `MONGO_URL` and `MONGO_DB` environment variables to ensure they point to the correct MongoDB instance and database. Ensure MongoDB is running and accessible.
- HTTP Request Failures: If requests to external APIs fail, check network connectivity and ensure the API endpoints are correct and accessible.
Purpose: Extracts and filters geospatial POI data from OpenStreetMap (OSM) using Overpass API. It computes location centroids and stores clean landmark entries in MongoDB.
Key Features:
- Query-based extraction of candidate landmarks from OSM (e.g., buildings, galleries, parks)
- Filters out non-relevant categories (e.g., parking lots)
- Computes centroid coordinates from polygon geometries
- Normalizes and structures the result into:

  ```json
  {
    "name": "Boole Library",
    "latitude": 51.89,
    "longitude": -8.49,
    "tags": { ... }
  }
  ```

- Writes directly to the `landmarks` collection in MongoDB for use by the game backend (see the sketch after this list)
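The sketch below illustrates the general shape of this extraction step, using the `requests` and `pymongo` libraries from the technology list; the specific Overpass query, tag filters, and helper names (`fetch_pois`, `centroid_of`) are illustrative assumptions rather than the module's exact implementation.

```python
import requests
from pymongo import MongoClient

OVERPASS_URL = "https://overpass-api.de/api/interpreter"

def fetch_pois(south, west, north, east):
    """Query Overpass for candidate POIs (e.g., tourist sites, parks, notable buildings)."""
    query = f"""
    [out:json][timeout:60];
    (
      way["tourism"]({south},{west},{north},{east});
      way["leisure"="park"]({south},{west},{north},{east});
      way["building"~"university|church|museum"]({south},{west},{north},{east});
    );
    out geom;
    """
    response = requests.post(OVERPASS_URL, data={"data": query})
    response.raise_for_status()
    return response.json()["elements"]

def centroid_of(element):
    """Approximate a way's centroid by averaging its polygon vertices."""
    points = element["geometry"]
    return (sum(p["lat"] for p in points) / len(points),
            sum(p["lon"] for p in points) / len(points))

def extract_landmarks(elements):
    """Drop unnamed features and non-relevant categories (e.g., parking), then normalize."""
    landmarks = []
    for el in elements:
        tags = el.get("tags", {})
        if not tags.get("name") or tags.get("amenity") == "parking" or not el.get("geometry"):
            continue
        lat, lon = centroid_of(el)
        landmarks.append({"name": tags["name"], "latitude": lat, "longitude": lon, "tags": tags})
    return landmarks

if __name__ == "__main__":
    db = MongoClient("mongodb://localhost:27017")["scavengerhunt"]
    pois = fetch_pois(51.88, -8.51, 51.91, -8.46)  # rough bounding box around Cork
    if (landmarks := extract_landmarks(pois)):
        db["landmarks"].insert_many(landmarks)
```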
Purpose: Enriches raw landmarks with structured semantic metadata by retrieving supplementary content from external sources and summarizing it using LLMs.
Key Features:
- Attempts to locate open textual information for each landmark:
  - Wikipedia summaries
  - Official institutional descriptions
  - Cultural and architectural context
- Uses LLMs (e.g., GPT-4 or local models) to process and summarize the retrieved data
- Produces structured metadata fields for each landmark:

  ```json
  {
    "landmarkId": "abc123",
    "metadata": {
      "history": "...",
      "architecture": "...",
      "functions": "...",
      "keywords": ["library", "UCC", "modernism"]
    }
  }
  ```

- Stores results into a new `landmark_metadata` collection or as an embedded `meta` field in the `landmarks` collection
Purpose: Automatically initializes city-specific landmark data for any given GPS coordinate at runtime, ensuring data completeness and backend compatibility.
Workflow:
- A Java Spring Boot controller receives lat/lng via `/api/game/init-game`
- It delegates to `GameDataRepository.initLandmarkDataFromPosition(lat, lng)`, which sends a POST request to the Flask endpoint `/fetch-landmark`
- The Flask service performs reverse geocoding via Nominatim to resolve the city
- It checks if landmarks for that city already exist in MongoDB. If the number of entries is above a defined threshold (e.g., 20), it skips fetching
- Otherwise, it triggers a full Overpass query + landmark extraction + MongoDB insert
- Only after completion does it return the resolved city name back to the Java backend, allowing the game controller to proceed safely (a sketch of the Flask side follows this list)
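A condensed sketch of the Flask side of this workflow is shown below; it uses the `flask`, `requests`, and `pymongo` libraries already listed, but the Nominatim call details, the assumption that landmark documents carry a `city` field, and the `run_landmark_preprocessor` placeholder are illustrative rather than the module's actual code.

```python
import requests
from flask import Flask, jsonify, request
from pymongo import MongoClient

app = Flask(__name__)
db = MongoClient("mongodb://localhost:27017")["scavengerhunt"]

LANDMARK_THRESHOLD = 20  # skip fetching if the city already has at least this many entries

def reverse_geocode_city(lat, lng):
    """Resolve a city name from coordinates via Nominatim reverse geocoding."""
    resp = requests.get(
        "https://nominatim.openstreetmap.org/reverse",
        params={"lat": lat, "lon": lng, "format": "json"},
        headers={"User-Agent": "scavengerhunt-landmark-init"},
    )
    resp.raise_for_status()
    address = resp.json().get("address", {})
    return address.get("city") or address.get("town") or address.get("county")

def run_landmark_preprocessor(city):
    """Placeholder for LandmarkPreprocessor: Overpass query + extraction + MongoDB insert."""

@app.route("/fetch-landmark", methods=["POST"])
def fetch_landmark():
    body = request.get_json()
    city = reverse_geocode_city(body["lat"], body["lng"])

    # Only trigger a full Overpass extraction if the city is not yet populated.
    if db["landmarks"].count_documents({"city": city}) < LANDMARK_THRESHOLD:
        run_landmark_preprocessor(city)

    # The city name is returned only after all MongoDB writes have completed,
    # so the Java backend can proceed safely.
    return jsonify({"city": city})
```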
Advantages:
- Ensures consistent runtime support for any playable city
- Avoids redundant fetches by checking existing entry counts
- Eliminates race conditions or partial data loads through strict blocking behavior
A synchronous cross-service pipeline has been implemented:
- Java backend receives coordinates from the front-end client
- Coordinates are sent to the Python microservice (`/fetch-landmark`)
- Python service:
  - Reverse geocodes the location into a city name
  - Checks if landmark data is already populated
  - If not, uses `LandmarkPreprocessor` to pull fresh data from OSM
  - Saves cleaned data to MongoDB
- City name is returned only after MongoDB writes complete, ensuring consistency
This enables per-location initialization of any city without pre-loading the database manually.
- For each landmark in the `landmarks` collection, the module attempts to fetch a Wikipedia page via `wikipedia.page()` using `auto_suggest=True`
- Handles disambiguation errors and page absence with fallback search via `wikipedia.search()` and a re-ranking loop (sketched below)
- All retrieved content is passed through a GPT-based semantic verification step, shown after the retrieval sketch:
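A minimal sketch of that lookup-and-fallback behaviour, assuming the `wikipedia` Python package, is shown below; the helper name `find_wikipedia_page` and the simple ranked retry stand in for the module's actual re-ranking loop.

```python
import wikipedia

def find_wikipedia_page(landmark_name, city):
    """Try a direct page lookup first, then fall back to a search over ranked candidates."""
    try:
        return wikipedia.page(landmark_name, auto_suggest=True)
    except (wikipedia.exceptions.DisambiguationError, wikipedia.exceptions.PageError):
        # Fall back to a plain search and try candidates in ranked order.
        for title in wikipedia.search(f"{landmark_name} {city}", results=5):
            try:
                return wikipedia.page(title, auto_suggest=False)
            except (wikipedia.exceptions.DisambiguationError, wikipedia.exceptions.PageError):
                continue
    return None
```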
```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def _aiInspection(landmark_name, city, wiki_text):
    """Ask GPT to confirm that the retrieved Wikipedia text is about the target landmark."""
    prompt = f"""
You are verifying if a Wikipedia article is about a specific landmark.
Target Landmark: "{landmark_name}"
City: "{city}"
Text:
\"\"\"
{wiki_text}
\"\"\"
If this page is clearly about the target landmark, respond with only: `true`. Otherwise, respond with `false`.
"""
    # Submit to GPT (gpt-4, temperature=0.2).
    # Illustrative call via the OpenAI SDK; the module's actual client wrapper may differ.
    response = client.chat.completions.create(
        model="gpt-4",
        temperature=0.2,
        messages=[{"role": "user", "content": prompt}],
    )
    gpt_response = response.choices[0].message.content
    return gpt_response.strip().lower().startswith("true")
```

- Entries failing verification are excluded from downstream processing
- Extracts the top 5 `.jpg`, `.jpeg`, or `.png` image URLs from the Wikipedia page
- These image links are embedded into a `gpt-4-turbo` prompt using OpenAI's vision input format (see the sketch after this list)
- Visual data assists GPT in architectural or contextual reasoning
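The sketch below shows one way those image URLs could be packaged into a vision-capable request, assuming the `wikipedia` page object's `images` attribute and the OpenAI Python SDK; the prompt wording and the helper names `select_images` and `describe_with_vision` are illustrative.

```python
from openai import OpenAI

client = OpenAI()

def select_images(page, limit=5):
    """Keep the first few .jpg/.jpeg/.png URLs from a Wikipedia page object."""
    return [url for url in page.images
            if url.lower().endswith((".jpg", ".jpeg", ".png"))][:limit]

def describe_with_vision(landmark_name, image_urls, wiki_summary):
    """Send text plus image URLs to gpt-4-turbo using the vision input format."""
    content = [
        {"type": "text",
         "text": f"Summarize the architecture and context of {landmark_name}.\n\n{wiki_summary}"},
    ]
    for url in image_urls:
        content.append({"type": "image_url", "image_url": {"url": url}})

    response = client.chat.completions.create(
        model="gpt-4-turbo",
        messages=[{"role": "user", "content": content}],
    )
    return response.choices[0].message.content
```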
- GPT receives a prompt asking for structured metadata including:
  - `history`: 5-10 keywords
  - `architecture`: 5-10 keywords
  - `functions`: 5-10 keywords
- The result is expected in strict JSON format
- If GPT indicates uncertainty, the result is discarded and recorded with a fallback marker (one possible parsing approach is sketched below)
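The sketch below shows one way the strict-JSON expectation and the fallback behaviour could be enforced; the field names match the metadata schema above, while the marker value, the `uncertain` flag, and the helper name are assumptions.

```python
import json

REQUIRED_FIELDS = ("history", "architecture", "functions")
FALLBACK_MARKER = {"status": "uncertain"}  # hypothetical marker recorded on failure

def parse_metadata(gpt_output):
    """Parse GPT output as strict JSON; fall back to a marker if it is malformed or incomplete."""
    try:
        data = json.loads(gpt_output)
    except json.JSONDecodeError:
        return FALLBACK_MARKER

    # Discard results that signal uncertainty or are missing required keyword lists.
    if data.get("uncertain") or not all(data.get(field) for field in REQUIRED_FIELDS):
        return FALLBACK_MARKER

    return {field: data[field] for field in REQUIRED_FIELDS}
```

Recording a marker instead of silently dropping the entry makes it easy to find and re-process uncertain landmarks later.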
All processed metadata is written to the `landmark_metadata` collection in MongoDB:

```json
{
  "landmarkId": "<_id>",
  "name": "Boole Library",
  "city": "Cork",
  "meta": {
    "url": "...",
    "images": [...],
    "wikipedia": "...",
    "description": {
      "history": [...],
      "architecture": [...],
      "functions": [...]
    }
  }
}
```
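As a rough illustration of the write step, assuming `pymongo` and that documents are keyed by `landmarkId`, an idempotent upsert might look like this:

```python
from pymongo import MongoClient

db = MongoClient("mongodb://localhost:27017")["scavengerhunt"]

def save_landmark_metadata(doc):
    """Upsert a metadata document keyed by landmarkId so reruns do not create duplicates."""
    db["landmark_metadata"].update_one(
        {"landmarkId": doc["landmarkId"]},
        {"$set": doc},
        upsert=True,
    )
```

Upserting on `landmarkId` keeps repeated runs idempotent, which matches the skip-existing behaviour described later in the integration notes.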
This module implements a lightweight RAG pipeline with the following stages:
- Retrieve: External information (Wikipedia page + image URLs)
- Verify: GPT-based semantic alignment check
- Generate: Vision-enabled GPT-based summarization
Benefits:
- Higher-quality semantic metadata
- Traceability and debuggability of source data
- A reusable structure suitable for adaptive riddle generation
Next Steps:
- Integrate `landmark_metadata` into the Java-side `PuzzleManager` as the primary input for riddle generation
- Use `description` fields to support difficulty estimation or thematic puzzle selection
- Add caching or retry logic for GPT requests to reduce cost and prevent data loss (a possible retry wrapper is sketched after this list)
- Extend support for multilingual metadata output
- Introduce a scheduler to backfill metadata for any newly inserted landmarks on demand
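One possible shape for that retry logic is sketched below; the backoff parameters and the `call_gpt` wrapper are illustrative assumptions, not existing module code.

```python
import time

from openai import OpenAI, OpenAIError

client = OpenAI()

def call_gpt(messages, model="gpt-4", retries=3, base_delay=2.0):
    """Call the chat completions API with simple exponential backoff to avoid losing work."""
    for attempt in range(retries):
        try:
            response = client.chat.completions.create(model=model, messages=messages)
            return response.choices[0].message.content
        except OpenAIError:
            if attempt == retries - 1:
                raise  # give up after the final attempt
            time.sleep(base_delay * (2 ** attempt))  # wait 2s, 4s, 8s, ...
```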
- Duplicate Landmark Prevention: The system currently performs multiple scans of the same target, causing database duplicates. Implement duplicate detection logic based on landmark name and city before inserting new entries to avoid redundant data storage.
- Geographic Region Matching: Address city name mismatches where reverse geocoding returns district names (e.g., "天河区", Tianhe District) that do not match the broader city names stored in the database (e.g., "广州市", Guangzhou). Implement a hierarchical location mapping system to normalize district-level results to their parent city for consistent database queries (a sketch of both fixes follows this list).
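A sketch of how both improvements could be implemented is shown below; the unique index, the `DISTRICT_TO_CITY` mapping, and its contents are illustrative assumptions rather than a finished design.

```python
from pymongo import ASCENDING, MongoClient
from pymongo.errors import DuplicateKeyError

db = MongoClient("mongodb://localhost:27017")["scavengerhunt"]

# Duplicate Landmark Prevention: enforce uniqueness on (name, city) at the database level.
db["landmarks"].create_index([("name", ASCENDING), ("city", ASCENDING)], unique=True)

def insert_landmark(doc):
    """Insert a landmark, silently skipping entries that already exist for this name/city."""
    try:
        db["landmarks"].insert_one(doc)
    except DuplicateKeyError:
        pass  # this target has already been scanned for this city

# Geographic Region Matching: normalize district-level geocoding results to their parent city.
DISTRICT_TO_CITY = {
    "天河区": "广州市",  # Tianhe District -> Guangzhou
}

def normalize_city(name):
    """Map a district name returned by reverse geocoding to the city stored in the database."""
    return DISTRICT_TO_CITY.get(name, name)
```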
Summary:
Completed end-to-end integration of the Landmark Metadata generation pipeline with the Java backend, ensuring on-demand enrichment of landmarks before gameplay.
Details:
- Java → Python Metadata Flow
  - Implemented batch invocation of the Flask `/generate-landmark-meta` endpoint in `LandmarkManager.ensureLandmarkMeta()`.
  - Java sends a list of landmark IDs filtered by the current search radius.
  - Python checks the `landmark_metadata` collection for existing entries and only generates metadata for missing IDs.
- Python Endpoint Enhancements
  - Fixed collection name mismatch, standardizing all reads/writes to `landmark_metadata`.
  - Updated `loadLandmarksFromDB()` to perform `ObjectId` conversion for precise `_id` matching in the `landmarks` collection.
  - Added filtering for empty or invalid IDs to prevent 500 errors (the ID handling is sketched below).
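A minimal sketch of that ID handling on the Python side is shown below, assuming `pymongo` and `bson` and that `landmarkId` is stored as the string form of the source `_id`; the function name and return shape are illustrative.

```python
from bson import ObjectId
from bson.errors import InvalidId
from pymongo import MongoClient

db = MongoClient("mongodb://localhost:27017")["scavengerhunt"]

def load_missing_landmarks(landmark_ids):
    """Convert incoming IDs to ObjectId, drop invalid ones, and return landmarks lacking metadata."""
    object_ids = []
    for raw_id in landmark_ids:
        if not raw_id:
            continue  # drop empty IDs up front
        try:
            object_ids.append(ObjectId(raw_id))
        except (InvalidId, TypeError):
            continue  # drop malformed IDs instead of returning a 500

    # IDs that already have entries in landmark_metadata are skipped to save API calls.
    existing = {
        doc["landmarkId"]
        for doc in db["landmark_metadata"].find(
            {"landmarkId": {"$in": [str(oid) for oid in object_ids]}}, {"landmarkId": 1}
        )
    }
    missing = [oid for oid in object_ids if str(oid) not in existing]

    # Precise _id matching in the landmarks collection requires ObjectId values.
    return list(db["landmarks"].find({"_id": {"$in": missing}}))
```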
- End-to-End Validation
  - Tested with two Cork city landmarks: `6895327b04e4917e0d875698` and `6895327b04e4917e0d875697`.
  - Successfully triggered Wikipedia + GPT processing and wrote results to the `landmark_metadata` collection.
  - Created a Bash script to batch-delete meta records for given `landmarkId`s to facilitate repeatable testing.
- Current Impact
  - All new landmarks are enriched with metadata automatically before the first puzzle is generated.
  - Landmarks with existing metadata are skipped, reducing API calls and costs.
  - Structured metadata is now ready to be consumed by `PuzzleAgent` for context-aware riddle generation.
