Batch Geocoder

Python 3.11+ | License: MIT

Convert a CSV of addresses to a GeoJSON FeatureCollection — free with Nominatim, or fast with Google Maps.




Features

  • Two backends — Nominatim (free, no API key) and Google Maps (fast, paid).
  • Pluggable — implement GeocoderBackend to add any geocoding provider.
  • Null-safe output — failed rows are included with null geometry so no data is lost.
  • Confidence scores — Nominatim returns an importance score per result.
  • Extra columns passthrough — carry any CSV columns through to GeoJSON properties.
  • Rate-limit aware — configurable delay between requests; respects Nominatim's 1 req/s policy.

Installation

cd tools/python/batch-geocoder

python -m venv .venv
# Windows: .venv\Scripts\activate
# macOS / Linux: source .venv/bin/activate

pip install -e .
geo-geocode --help

Add the repo root to PYTHONPATH:

# Windows: set PYTHONPATH=.
# macOS / Linux: export PYTHONPATH=.

Usage

CLI

# Nominatim (free, default)
geo-geocode \
  --input      data/addresses.csv \
  --output     output/addresses.geojson \
  --address-col full_address \
  --user-agent "my-project/1.0" \
  --extra-cols name,city,zip

# Google Maps
geo-geocode \
  --input      data/addresses.csv \
  --output     output/addresses.geojson \
  --address-col address \
  --backend    google \
  --google-api-key YOUR_GOOGLE_API_KEY \
  --rate-limit 0.05
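
The CLI examples above expect a CSV with one address column plus any pass-through columns. As a rough sketch of what such a file could look like, the snippet below writes a two-row sample using the column names from the first example (`full_address`, `name`, `city`, `zip`). The rows themselves are illustrative assumptions, not data shipped with the tool.

```python
import csv
import io

# Column names taken from the first CLI example; the row values are made up.
FIELDNAMES = ["full_address", "name", "city", "zip"]

ROWS = [
    {
        "full_address": "1600 Pennsylvania Ave NW, Washington DC",
        "name": "White House",
        "city": "Washington",
        "zip": "20500",
    },
    # A deliberately unresolvable address, to exercise the failure path.
    {"full_address": "zzz bad address", "name": "Unknown", "city": "", "zip": ""},
]


def write_sample_csv(handle) -> None:
    """Write the header and sample rows to an open text handle."""
    writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(ROWS)


buffer = io.StringIO()
write_sample_csv(buffer)
print(buffer.getvalue())
```

To write the sample to disk instead, open the target file with `newline=""` (as the `csv` module documentation recommends) and pass the handle to `write_sample_csv`.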

Python API

from pathlib import Path
from src.batch_geocoder.geocoder import BatchGeocoder, NominatimBackend

tool = BatchGeocoder(
    input_path=Path("data/customers.csv"),
    output_path=Path("output/customers.geojson"),
    address_col="full_address",
    backend=NominatimBackend(
        user_agent="my-company/1.0",
        rate_limit_seconds=1.1,
    ),
    extra_cols=["name", "city", "zip"],
)
tool.run()

# Inspect results
for result in tool.results:
    print(f"{result.address} → success={result.success}, confidence={result.confidence}")

Configuration Reference

| Parameter | Type | Default | Description | Example |
|---|---|---|---|---|
| --input / input_path | Path | | Path to the input CSV file | data/customers.csv |
| --output / output_path | Path | | Path for the output GeoJSON file | output/customers.geojson |
| --address-col / address_col | str | "address" | Column containing address strings | "full_address" |
| --backend | "nominatim" or "google" | "nominatim" | Geocoding provider | "nominatim" |
| --user-agent | str | "geoscripthub-geocoder/1.0" | App identifier for Nominatim | "my-company/1.0" |
| --google-api-key | str | env var | Google Maps API key | Set GOOGLE_MAPS_API_KEY env var |
| --rate-limit | float | 1.1 | Seconds between requests | >= 1.0 for Nominatim; ~0.05 for Google |
| --extra-cols | str (CSV) | "" | Extra columns to carry into GeoJSON | name,city,zip |
| --verbose | bool | False | Debug logging | Pass -v to see per-address results |

Backends

Nominatim (default, free)

  • Powered by OpenStreetMap data.
  • No API key required.
  • Must respect the Nominatim Usage Policy: keep rate_limit_seconds >= 1.0 and set a descriptive user_agent.
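
To illustrate what a fixed delay between requests amounts to, here is a minimal sketch of a minimum-interval limiter. `MinIntervalLimiter` is a hypothetical helper written for this example; it is not taken from the package, whose internal rate-limiting may be implemented differently.

```python
import time


class MinIntervalLimiter:
    """Hypothetical helper: keeps consecutive calls at least `min_interval` seconds apart."""

    def __init__(self, min_interval: float = 1.1) -> None:
        self.min_interval = min_interval
        self._last_call: float | None = None

    def wait(self) -> float:
        """Sleep if the previous call was too recent; return the seconds slept."""
        slept = 0.0
        if self._last_call is not None:
            elapsed = time.monotonic() - self._last_call
            remaining = self.min_interval - elapsed
            if remaining > 0:
                time.sleep(remaining)
                slept = remaining
        # Record the time after any sleep so the next interval starts here.
        self._last_call = time.monotonic()
        return slept
```

Calling `limiter.wait()` immediately before each request is enough to hold a sequential loop at or below one request per `min_interval` seconds.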

Google Maps Geocoding API

  • Requires a Google Cloud project with the Geocoding API enabled.
  • Generate an API key at console.cloud.google.com → APIs & Services → Credentials.
  • Important: Store the key in an environment variable (GOOGLE_MAPS_API_KEY) — never commit it to version control.
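
One way to follow the advice above is to read the key from the environment at startup and fail early when it is missing. The sketch below uses only the standard library; `load_google_api_key` is a hypothetical helper name, not a function provided by the package.

```python
import os


def load_google_api_key(env_var: str = "GOOGLE_MAPS_API_KEY") -> str:
    """Return the API key from the environment, or raise if it is unset or blank."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable before using the Google backend."
        )
    return key
```

The returned string can then be passed wherever the key is expected, which keeps it out of source files and shell history.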

Custom Backend

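from src.batch_geocoder.geocoder import GeocoderBackend, GeocodeResult

class MyCustomBackend(GeocoderBackend):
    """Adapter for a geocoding provider of your choice."""

    def geocode_one(self, address: str) -> GeocodeResult:
        # Geocode a single address with your provider and
        # return the outcome as a GeocodeResult.
        ...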

Output Format

{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "geometry": { "type": "Point", "coordinates": [-77.036, 38.897] },
      "properties": {
        "address": "1600 Pennsylvania Ave NW, Washington DC",
        "display_name": "White House, Washington, DC, USA",
        "confidence": 0.901,
        "geocode_success": true,
        "name": "White House"
      }
    },
    {
      "type": "Feature",
      "geometry": null,
      "properties": {
        "address": "zzz bad address",
        "geocode_success": false
      }
    }
  ]
}
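
Because failed rows are kept with a null geometry, downstream code usually needs to separate them from located features. A minimal sketch using only the standard library follows; the inline sample mirrors the structure shown above, and `split_features` is a hypothetical helper name.

```python
import json

# Trimmed copy of the output structure shown above.
SAMPLE = """
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "geometry": { "type": "Point", "coordinates": [-77.036, 38.897] },
      "properties": {
        "address": "1600 Pennsylvania Ave NW, Washington DC",
        "geocode_success": true
      }
    },
    {
      "type": "Feature",
      "geometry": null,
      "properties": { "address": "zzz bad address", "geocode_success": false }
    }
  ]
}
"""


def split_features(collection: dict) -> tuple[list[dict], list[dict]]:
    """Return (located, failed) features, using a null geometry as the failure marker."""
    located: list[dict] = []
    failed: list[dict] = []
    for feature in collection.get("features", []):
        if feature.get("geometry") is None:
            failed.append(feature)
        else:
            located.append(feature)
    return located, failed


located, failed = split_features(json.loads(SAMPLE))
for feature in failed:
    print("needs review:", feature["properties"]["address"])
```

For a real run, replace `json.loads(SAMPLE)` with `json.load` on the file written to the `--output` path. Note that GeoJSON coordinates are ordered longitude first, then latitude.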

Running Tests

# Windows: set PYTHONPATH=../../../..
# macOS / Linux: export PYTHONPATH=../../../..

pytest tests/ -v

Contributing

See root CONTRIBUTING.md.


License

MIT — see LICENSE.