This repository contains code and scripts to visualize text detected on maps from the David Rumsey Map Collection. To get there, we take the following steps:
- Convert georeference data from the David Rumsey Map Collection into Georeference Annotations.
- Convert the text that mapKurator detected with OCR into OCR Web Annotations, and add the IIIF Image ID from the Georeference Annotations to this data.
- Use Allmaps to turn the pixel coordinates of the text bounding boxes into GeoJSON.
- Turn this GeoJSON into PMTiles using tippecanoe.
- Visualize this data in a web application built with SvelteKit and MapLibre GL JS.
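
As a rough illustration of the pixel-to-GeoJSON step, the sketch below turns one text bounding box into a GeoJSON Polygon feature. It is not the code used in this repository: `textBoxToFeature`, the `ToGeo` type, and `toyToGeo` are hypothetical names, and `toyToGeo` is a made-up linear transformation standing in for the georeferenced transformation that Allmaps provides.

```typescript
// A pixel coordinate [x, y] or a geographic coordinate [longitude, latitude].
type Point = [number, number];

// Hypothetical stand-in for the pixel-to-geo transformation provided by Allmaps.
type ToGeo = (pixel: Point) => Point;

interface TextFeature {
  type: "Feature";
  properties: { text: string };
  geometry: { type: "Polygon"; coordinates: Point[][] };
}

// Convert a ring of pixel coordinates around a detected text into a GeoJSON feature.
function textBoxToFeature(text: string, pixelRing: Point[], toGeo: ToGeo): TextFeature {
  const ring: Point[] = pixelRing.map((pixel) => toGeo(pixel));
  const first = ring[0];
  const last = ring[ring.length - 1];
  if (!first || !last) {
    throw new Error("A bounding box needs at least one point");
  }
  // GeoJSON linear rings must be closed: the last position repeats the first.
  if (first[0] !== last[0] || first[1] !== last[1]) {
    ring.push([first[0], first[1]]);
  }
  return {
    type: "Feature",
    properties: { text },
    geometry: { type: "Polygon", coordinates: [ring] }
  };
}

// Toy transformation for illustration only: an invented scale and offset.
const toyToGeo: ToGeo = ([x, y]) => [-122.5 + x * 0.001, 37.8 - y * 0.001];

const feature = textBoxToFeature(
  "Market St",
  [[0, 0], [0, 20], [100, 20], [100, 0]],
  toyToGeo
);
console.log(JSON.stringify(feature));
```

Features of this shape, written one per line or collected in a FeatureCollection, are the kind of input tippecanoe can tile in the next step.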
This repository contains the following directories:
- `etl`: ETL scripts to transform and export the required data.
- `app`: Web application to visualize the data.
- `data`: Input and output data.
To run the scripts or app locally, first install the required dependencies:
pnpm install --recursive
Then, run the ETL scripts to produce the required data (or download them from Zenodo) and build the web application.
Software:
- Node.js v23.1.0 or higher
- pnpm v10.10.0 or higher
- Tippecanoe
Required input data:
- `./data/input/rumsey_57k_english.zip` (warning: this is a 51 GB file!)
- `./data/input/maps.ndjson`, produced by https://github.com/allmaps/rumsey-scripts.
See the `etl` directory for more details.
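
For orientation, an `.ndjson` file such as `maps.ndjson` holds one JSON value per line. The sketch below shows one way such text could be parsed; it is an illustration only, not the ETL code from this repository. `parseNdjson` and `ExampleRecord` are hypothetical names, and the `id` field is invented, so the real records may look different.

```typescript
// Parse newline-delimited JSON: one JSON value per non-empty line.
function parseNdjson<T = unknown>(text: string): T[] {
  return text
    .split(/\r?\n/)
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as T);
}

// Hypothetical record shape for illustration; the actual fields may differ.
interface ExampleRecord {
  id: string;
}

const sample = '{"id":"a"}\n{"id":"b"}\n';
const records = parseNdjson<ExampleRecord>(sample);
console.log(records.length);
```

This version reads the whole text into memory, which is fine for small files; for large inputs, a line-by-line streaming reader would likely be a better fit.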