A scalable Python pipeline to visualize AIS vessel tracks from large Parquet datasets (e.g., 11GB) as tiled, high-quality maps.
- Scalable Processing: Built with Dask and Datashader to handle datasets larger than memory.
- Seamless Tiling: Calculates a global maximum across all tiles to ensure consistent color scaling and eliminate edge artifacts.
- Smart Transparency: Automatically adapts any colormap (e.g., Crameri Oslo) to be gradually transparent in low-density areas.
- Flexible Pyramid: Generates any range of zoom levels (e.g., 0-14), allowing for smooth zooming from global view to meter-scale details. For example, processing Zoom 14 on 200 nodes takes ~1.25 hours and produces ~56 GB of compressed data. This represents a total image size of approximately 16.7 million x 16.7 million pixels (280 Terapixels).
- Robust Format: Uses Zarr for intermediate storage to handle multi-dimensional categorical data efficiently.
- Dual Formats: Exports both PNG (for display) and Cloud Optimized GeoTIFF (COG) (for analysis).
- Anti-Aliasing: Renders tracks as smooth lines (LineString) with anti-aliasing.
- Configurable: All settings (bbox, zoom, palette) are defined in config.toml.
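
The pixel figures quoted for Zoom 14 can be checked with a few lines of arithmetic. This is a minimal sketch that assumes the standard XYZ pyramid (2^zoom tiles per side) and the tile_size of 1024 shown in the example config; `pyramid_level_size` is a hypothetical helper, not part of the project.

```python
def pyramid_level_size(zoom, tile_size=1024):
    """Return (tiles per side, pixels per side, total pixels) for one zoom level."""
    tiles_per_side = 2 ** zoom          # an XYZ pyramid doubles per zoom level
    side_px = tiles_per_side * tile_size
    return tiles_per_side, side_px, side_px ** 2


tiles_per_side, side_px, total_px = pyramid_level_size(14)
print(f"{tiles_per_side:,} tiles per side")
print(f"{side_px:,} px per side")            # 16,777,216 px per side
print(f"{total_px / 1e12:.0f} terapixels")   # 281 terapixels
```

With a smaller tile size the per-side pixel count shrinks proportionally, so the quoted 16.7 million figure only holds for 1024-pixel tiles.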
High-resolution, anti-aliased renderings showing vessel track density with sharp contrast at all zoom levels.
Custom transparent colormaps used for visualization. Any matplotlib or crameri colormap can be used.
Crameri colormaps source: Crameri, F. (2018). Scientific colour-maps. Zenodo. doi:10.5281/zenodo.1243862
Example colormaps shown: Crameri Oslo (L=20%) and Brown / Gold.
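
One way to make a colormap fade out in low-density areas is to attach an alpha channel that ramps up over the lowest part of the color range. The sketch below is an illustration under stated assumptions, not the project's implementation: `add_alpha_ramp` and `ramp_fraction` are hypothetical names, the ramp is linear, and the colormap is represented as a plain list of RGB tuples rather than a matplotlib object.

```python
def add_alpha_ramp(rgb_colors, ramp_fraction=0.25):
    """Return RGBA colors whose alpha rises linearly from 0 to 1.

    rgb_colors: list of (r, g, b) tuples ordered from low to high density.
    ramp_fraction: share of the colormap over which alpha fades in
                   (hypothetical parameter for this sketch).
    """
    ramp_len = max(1, int(len(rgb_colors) * ramp_fraction))
    rgba = []
    for i, (r, g, b) in enumerate(rgb_colors):
        alpha = min(1.0, i / ramp_len)  # fully opaque once past the ramp
        rgba.append((r, g, b, alpha))
    return rgba


# Usage: an 8-step grey ramp that becomes opaque halfway through.
greys = [(i / 7, i / 7, i / 7) for i in range(8)]
for color in add_alpha_ramp(greys, ramp_fraction=0.5):
    print(color)
```

The lowest color is fully transparent, so pixels with no traffic let the basemap show through.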
Some Marine Cadastre datasets (e.g., 2024) may have broken zip files. Fix them using:

zip -FF AISVesselTracks2024.zip --out AISVesselTracks2024-fixed.zip

This project uses uv for dependency management.

# Install dependencies
uv sync

Convert the raw AIS data (WKB Parquet or GPKG) into a spatially partitioned GeoParquet file. This significantly speeds up rendering.

From Parquet:

uv run preprocess.py --input-file /path/to/raw.parquet --output-file /path/to/processed.parquet

From GPKG:

uv run preprocess.py --input-file /path/to/data.gpkg --output-file /path/to/processed.parquet

Note: GPKG conversion via Python can be slow. For faster results, use ogr2ogr first:

ogr2ogr -f Parquet -t_srs EPSG:3857 raw.parquet input.gpkg
uv run preprocess.py --input-file raw.parquet --output-file processed.parquet

Edit config.toml to customize the visualization:

[data]
input_file = "/path/to/processed.parquet"
[visualization]
zoom = 5
tile_size = 1024
# line_width = 1 # Anti-aliased lines (values are coverage 0-1)
line_width = 0 # Aliased lines (values are integer counts)
bbox = [-125.0, 24.0, -66.0, 49.0] # US Bounds
[style]
colormap = "oslo"

Note on GeoTIFF Values:
- If line_width = 1 (default for aesthetics), the output GeoTIFFs contain anti-aliased coverage values (typically 0.0 to 1.0 per pixel).
- If line_width = 0, the output GeoTIFFs contain raw integer counts of vessel tracks passing through each pixel. Use this for analysis.
For a detailed analysis of aliasing, saturation, and mass conservation, see Line Width Analysis.
For Visualization (Default):
Use line_width = 1. This produces smooth, anti-aliased lines that represent the spatial coverage of the vessel track. It creates visually pleasing maps where diagonal lines appear continuous and smooth.
For Analysis (Fair Counts):
Use line_width = 0. This uses Bresenham's algorithm to select exactly one pixel per step along the major axis. This produces Integer Counts, which is the "fairest" way to count events (vessel transits) without introducing fractional artifacts.
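
To illustrate why Bresenham rasterization yields integer counts, here is a textbook version of the algorithm together with a small transit counter. It is a sketch for intuition only and is not Datashader's internal code; `bresenham` and `count_transits` are hypothetical helper names.

```python
from collections import Counter


def bresenham(x0, y0, x1, y1):
    """Return the integer pixels on the line from (x0, y0) to (x1, y1)."""
    dx, dy = abs(x1 - x0), abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx - dy
    pixels = []
    while True:
        pixels.append((x0, y0))
        if x0 == x1 and y0 == y1:
            break
        e2 = 2 * err
        if e2 > -dy:   # step along x
            err -= dy
            x0 += sx
        if e2 < dx:    # step along y
            err += dx
            y0 += sy
    return pixels


def count_transits(segments):
    """Count how many segments touch each pixel (each adds exactly 1)."""
    counts = Counter()
    for segment in segments:
        counts.update(bresenham(*segment))
    return counts


# A line with dx=5, dy=2 touches max(dx, dy) + 1 = 6 pixels.
print(bresenham(0, 0, 5, 2))
# Two tracks crossing at pixel (1, 0) give a count of 2 there.
print(count_transits([(0, 0, 3, 0), (1, 0, 1, 2)])[(1, 0)])
```

Because every segment contributes exactly 1 to each pixel it touches, the accumulated grid stays integer-valued, unlike anti-aliased rendering, which spreads fractional coverage over neighboring pixels.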
Generate raw count data (Zarr) for the highest zoom level (e.g., Zoom 7). Zarr is used to support multi-dimensional categorical data and parallel writes.
# Use input file from config.toml
uv run ais-shader render
# Override input file via CLI
uv run ais-shader render --input-file /path/to/other_dataset.parquet
# Use a shared Dask scheduler (recommended for large datasets)
uv run ais-shader render --scheduler tcp://127.0.0.1:8786
# Resume an interrupted run
uv run ais-shader render --resume-dir rendered/run_YYYYMMDD_HHMMSS
# Regional Rendering (e.g., NYC Port at Zoom 10)
uv run ais-shader render --bbox -74.05 40.65 -74.00 40.70 --zoom 10

This will output .zarr files to rendered/run_YYYYMMDD_HHMMSS/zarr/.
Process the raw Zarr files to generate seamless, transparent PNGs and lower zoom levels (pyramid).
# Run post-processing on a specific run directory
uv run ais-shader post_process --run-dir rendered/run_YYYYMMDD_HHMMSS --base-zoom 7
# Optional: Clean up intermediate Zarr files to save space
uv run ais-shader post_process --run-dir rendered/run_YYYYMMDD_HHMMSS --base-zoom 7 --clean-intermediate

This script will:
- Calculate the Global Max density across all tiles to ensure consistent coloring (no seams).
- Render PNGs using a custom "Electric Blue" colormap with transparency for low counts.
- Generate Pyramid levels (Zoom 0-6) by aggregating the base zoom data.
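
The first and last steps above can be sketched on plain nested lists. This illustration makes two assumptions that the text does not confirm: values are scaled linearly by the global maximum, and a lower zoom level is built by summing each 2x2 block of pixels. The helper names are hypothetical.

```python
def global_max(tiles):
    """Largest pixel value across all tiles (dict of (x, y) -> 2D list)."""
    return max(max(max(row) for row in tile) for tile in tiles.values())


def normalize(tile, gmax):
    """Scale a tile by the shared maximum so all tiles use one color scale."""
    return [[value / gmax for value in row] for row in tile]


def downsample_sum(grid):
    """Halve the resolution by summing each 2x2 block (assumed aggregation)."""
    height, width = len(grid), len(grid[0])
    return [
        [
            grid[r][c] + grid[r][c + 1] + grid[r + 1][c] + grid[r + 1][c + 1]
            for c in range(0, width, 2)
        ]
        for r in range(0, height, 2)
    ]


# Two neighboring tiles with different local maxima (4 and 8).
tiles = {(0, 0): [[0, 2], [4, 0]], (1, 0): [[8, 0], [0, 1]]}
gmax = global_max(tiles)
scaled = {key: normalize(tile, gmax) for key, tile in tiles.items()}
print(gmax, scaled[(0, 0)][1][0], scaled[(1, 0)][0][0])
```

Scaling each tile by its own maximum would map both peaks to the same color and produce a visible seam at the tile boundary; a shared maximum keeps a value of 4 at half the intensity of 8 everywhere.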
To view the generated PNG tiles in a browser or QGIS as an XYZ layer:
- Start a simple HTTP server in the rendered/run_.../png directory:
  cd rendered/run_YYYYMMDD_HHMMSS/png
  python -m http.server 8000
- Open QGIS and add a new XYZ Tiles connection:
  - URL: http://localhost:8000/{z}/{x}/{y}.png
  - Name: AIS Tracks Local
To view the raw data or high-resolution exports:
- Open QGIS.
- Drag and drop the .tif files from rendered/run_.../tiff/ directly into the map canvas.
- Since they are COGs, QGIS will handle them efficiently. You can style them using "Singleband pseudocolor".
- Styles: We provide pre-configured QGIS Layer Style files (.qml) in docs/styles/:
  - ais_blue.qml: The default "Electric Blue" style.
  - ais_dark.qml: A high-contrast dark theme (Crameri Oslo inspired).
  - ais_light.qml: A clean light theme (Crameri Batlow inspired).

  To use them: Right-click the layer -> Properties -> Symbology -> Style -> Load Style...
- Architecture & Design: Details on the technology stack, partitioning strategy, and known issues.
- Pipeline Schematic:

- Data Loading: Reads the preprocessed GeoParquet file using dask-geopandas.
- Tiling: Calculates the list of Web Mercator tiles for the configured BBox and Zoom.
- Processing: For each tile:
  - Filters the dataset using spatial indexing (.cx).
  - Computes the subset to a local GeoDataFrame.
  - Renders the tracks using Datashader (cvs.line).
  - Applies the colormap and transparency.
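
The tiling step can be sketched with the standard XYZ ("slippy map") formulas for Web Mercator. This is an illustration only: `lonlat_to_tile` and `tiles_for_bbox` are hypothetical helpers, and the project's own tile counts may differ depending on its tiling scheme and tile size.

```python
import math


def lonlat_to_tile(lon, lat, zoom):
    """Return the XYZ tile (x, y) containing a lon/lat point at a zoom level."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the eastern or southern edge stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tiles_for_bbox(bbox, zoom):
    """List (zoom, x, y) tiles covering bbox = [west, south, east, north]."""
    west, south, east, north = bbox
    x_min, y_min = lonlat_to_tile(west, north, zoom)  # top-left tile
    x_max, y_max = lonlat_to_tile(east, south, zoom)  # bottom-right tile
    return [
        (zoom, x, y)
        for x in range(x_min, x_max + 1)
        for y in range(y_min, y_max + 1)
    ]


# Usage with the US bounds from the example config.
us_bbox = [-125.0, 24.0, -66.0, 49.0]
print(len(tiles_for_bbox(us_bbox, 5)), "tiles at zoom 5")
```

Tile y indices grow southward, which is why the northern latitude gives the minimum row.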
The pipeline generates the following directory structure:

rendered/
  run_YYYYMMDD_HHMMSS/
    metadata.json        # Run configuration and details
    zarr/                # Intermediate Zarr files (compressed)
      tile_7_*.zarr      # Base zoom tiles
      tile_6_*.zarr      # Aggregated tiles
      ...
    png/                 # Visualized PNG tiles
      7/                 # Zoom 7
      6/                 # Zoom 6
      ...
    tiff/                # Cloud Optimized GeoTIFFs (if --cogs used)
      tile_7_*.tif

Estimates based on Marine Cadastre AIS data (https://hub.marinecadastre.gov/), using zlib compression (level 5) and int32 data types:
- Zoom 7: ~93 MB (143 tiles)
- Zoom 10: ~5.7 GB (12,525 tiles)
- src/: Source code modules.
  - data_loader.py: Data loading logic.
  - renderer.py: Rendering and export logic.
- visualize_tracks.py: Main entry point.
- preprocess.py: Data preprocessing script.
- config.toml: Configuration file.

This project builds upon research and visualization techniques developed at TU Delft.
- "The North Sea is ready for its close-up" (2021). TU Delft Stories.
- Solange van der Werff (PhD Candidate, TU Delft). Research on "Merging Multiple Perspectives to Extend Views on Nautical Systems", including high-resolution AIS visualization and safety monitoring.





