This repository contains several analyses of how data is distributed in the OSM dataset.
To be able to parallelize the analysis, let's first extract all the blocks from the PBF file. Extracting the full planet file takes around 4 minutes:
spark-submit \
--class com.simplexportal.simplexspatial.analysis.Driver \
--master "local[*]" \
target/scala-2.11/simplexspatial-data-distribution-analysis-assembly-0.1.jar \
extract \
-i file:///home/angelcc/Downloads/osm/planet/planet-200309.osm.pbf \
-o file:///home/angelcc/Downloads/osm/planet/blobs
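For context on what a "block" is here, the following is a minimal Python sketch of how an `.osm.pbf` file can be split into its blocks. It is not the project's Scala implementation of `extract`, and it assumes that "blocks" refers to the fileblocks of the OSM PBF format: a 4-byte big-endian length, a serialized `BlobHeader` (field 1 `type`, field 3 `datasize`), then a `Blob` of `datasize` bytes. The function names (`read_varint`, `parse_blob_header`, `iter_blobs`) are hypothetical and chosen only for illustration.

```python
import struct
from typing import BinaryIO, Iterator, Tuple


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode one protobuf varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def parse_blob_header(buf: bytes) -> Tuple[str, int]:
    """Return (type, datasize) from a serialized BlobHeader message."""
    blob_type, datasize = "", 0
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field, wire_type = key >> 3, key & 0x07
        if wire_type == 0:  # varint field
            value, pos = read_varint(buf, pos)
            if field == 3:  # datasize
                datasize = value
        elif wire_type == 2:  # length-delimited field
            length, pos = read_varint(buf, pos)
            if field == 1:  # type, e.g. "OSMHeader" or "OSMData"
                blob_type = buf[pos:pos + length].decode("utf-8")
            pos += length  # also skips indexdata (field 2)
        else:
            raise ValueError(f"unexpected wire type {wire_type}")
    return blob_type, datasize


def iter_blobs(stream: BinaryIO) -> Iterator[Tuple[str, bytes]]:
    """Yield (type, raw Blob bytes) for each block in an .osm.pbf stream."""
    while True:
        prefix = stream.read(4)
        if len(prefix) < 4:  # end of file
            return
        (header_len,) = struct.unpack(">I", prefix)
        blob_type, datasize = parse_blob_header(stream.read(header_len))
        yield blob_type, stream.read(datasize)
```

Because each block carries its own length, the blocks can be written out as independent files and later processed in parallel without scanning the original file sequentially. The yielded bytes are still serialized (and usually compressed) `Blob` messages; decoding them is outside the scope of this sketch.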