| 1 | +# 🌍 Geospatial Urban Analysis Project |
| 2 | + |
| 3 | +## 📌 Overview |
| 4 | + |
| 5 | +This project focuses on **geospatial data analysis** for urban environments, particularly analyzing **pedestrian zones, transportation networks, census data, and geographic boundaries**. The dataset includes **shapefiles, GeoJSON, Parquet, and raster files**, allowing advanced **spatial processing and visualization**. |
| 6 | + |
| 7 | +The project uses **PostgreSQL with PostGIS**, **Docker**, and **GeoPandas**, enabling **efficient spatial queries, ETL pipelines, and geospatial machine learning models**. |
| 8 | + |
| 9 | +### ✨ **Key Features** |
| 10 | + |
| 11 | +- 🏙 **Urban Infrastructure Analysis**: Analyzes bike paths, subway entrances, and school locations. |
| 12 | +- 📊 **Geospatial Data Processing**: Supports various spatial formats (Shapefile, GeoJSON, Parquet, Raster). |
| 13 | +- 🔄 **ETL Pipelines**: Extract, transform, and load urban data into **PostGIS**. |
| 14 | +- 🤖 **Geospatial Machine Learning**: Clustering models to optimize urban planning decisions. |
| 15 | +- 🗺 **Interactive Mapping**: Generates visualizations using **Folium and Matplotlib**. |
| 16 | + |
| 17 | +--- |
| 18 | + |
| 19 | +## 🛠 **Requirements** |
| 20 | + |
| 21 | +Before running the project, ensure you have the following dependencies installed: |
| 22 | + |
| 23 | +### 💻 **System Requirements** |
| 24 | + |
| 25 | +- 🐳 **Docker** (for PostgreSQL with PostGIS) |
| 26 | +- 🐍 **Python 3.8+** |
| 27 | + |
| 28 | +### 📦 **Python Dependencies** |
| 29 | + |
| 30 | +All required Python libraries are listed in `requirements.txt`. Install them using: |
| 31 | + |
| 32 | +```sh |
| 33 | +pip install -r requirements.txt |
| 34 | +``` |
| 35 | + |
| 36 | +Main dependencies: |
| 37 | + |
| 38 | +- 🌍 **GeoPandas**: Geospatial data processing. |
| 39 | +- 🗄 **PostgreSQL & PostGIS**: Geospatial database support. |
| 40 | +- 📈 **Matplotlib & Folium**: Data visualization. |
| 41 | +- 🤖 **Scikit-learn**: Clustering and machine learning models. |
| 42 | + |
| 43 | +--- |
| 44 | + |
| 45 | +## 🚀 **Setup & Installation** |
| 46 | + |
| 47 | +### 📂 **1. Clone the Repository** |
| 48 | + |
| 49 | +```sh |
| 50 | +git clone git@github.com:nanlabs/backend-reference.git |
| 51 | +cd backend-reference/examples/geospatial-python-urban-analysis-with-postgis |
| 52 | +``` |
| 53 | + |
| 54 | +### 🏗 **2. Set Up a Virtual Environment** |
| 55 | + |
| 56 | +Create and activate a Python virtual environment: |
| 57 | + |
| 58 | +```sh |
| 59 | +python -m venv env |
| 60 | +source env/bin/activate # On macOS/Linux |
| 61 | +env\Scripts\activate # On Windows |
| 62 | +``` |
| 63 | + |
| 64 | +Once activated, install dependencies: |
| 65 | + |
| 66 | +```sh |
| 67 | +pip install -r requirements.txt |
| 68 | +``` |
| 69 | + |
| 70 | +### 🐳 **3. Set Up Docker with PostgreSQL and PostGIS** |
| 71 | + |
| 72 | +Ensure that **Docker** is installed and running. Then, start the database with: |
| 73 | + |
| 74 | +```sh |
| 75 | +docker-compose up -d |
| 76 | +``` |
| 77 | + |
| 78 | +This will: |
| 79 | + |
| 80 | +- 🛢 Start a **PostgreSQL database** with **PostGIS** extensions enabled. |
| 81 | +- 📌 Create the necessary **database schema** for storing geospatial data. |
| 82 | + |
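Once the container is up, Python code needs a connection string to reach the database. A minimal sketch, assuming the credentials come from the `POSTGRES_USER`, `POSTGRES_PASSWORD`, and `POSTGRES_DB` variables that the official PostgreSQL image reads, plus hypothetical `POSTGRES_HOST` and `POSTGRES_PORT` variables. The defaults below are placeholders, not values taken from this project's `docker-compose.yml`:

```python
import os
from urllib.parse import quote


def build_postgis_url(env=None):
    """Build a PostgreSQL connection URL from environment-style settings."""
    env = os.environ if env is None else env
    # Placeholder defaults (assumptions); adjust them to match docker-compose.yml.
    user = env.get("POSTGRES_USER", "postgres")
    password = env.get("POSTGRES_PASSWORD", "postgres")
    host = env.get("POSTGRES_HOST", "localhost")
    port = env.get("POSTGRES_PORT", "5432")
    database = env.get("POSTGRES_DB", "postgres")
    # Percent-encode credentials so characters like "@" or "/" stay valid.
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{database}"
    )


if __name__ == "__main__":
    print(build_postgis_url())
```

The resulting URL can be handed to a client such as SQLAlchemy's `create_engine`, whose engine GeoPandas accepts in `read_postgis` and `GeoDataFrame.to_postgis`.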
| 83 | +## 📖 **Working with Notebooks** |
| 84 | + |
| 85 | +To start the analysis and visualization: |
| 86 | + |
| 87 | +```sh |
| 88 | +jupyter notebook |
| 89 | +``` |
| 90 | + |
| 91 | +Then, open one of the notebooks in the `notebooks/` directory. |
| 92 | + |
| 93 | +The notebooks cover: |
| 94 | + |
| 95 | +- 🌍 **Geospatial Data Exploration**: Loading and visualizing spatial datasets. |
| 96 | +- 🚇 **Urban Accessibility Analysis**: Assessing accessibility of public transport. |
| 97 | +- 🤖 **Clustering and Machine Learning**: Applying spatial clustering algorithms. |
| 98 | + |
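To make the accessibility idea concrete, one common measure is the straight-line distance from a location to the nearest transit entrance. A minimal sketch using only the standard library; the function names and coordinates are illustrative and not taken from the notebooks, which may well use GeoPandas or PostGIS distance operations instead:

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    # Clamp to 1.0 to guard against floating-point overshoot.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def nearest_stop(origin, stops):
    """Return (name, distance_km) of the stop closest to origin.

    origin is a (lat, lon) pair; stops maps a stop name to a (lat, lon) pair.
    """
    if not stops:
        raise ValueError("stops must not be empty")
    name = min(stops, key=lambda n: haversine_km(*origin, *stops[n]))
    return name, haversine_km(*origin, *stops[name])


if __name__ == "__main__":
    # Made-up coordinates, for illustration only.
    stops = {"entrance_a": (40.7500, -73.9900), "entrance_b": (40.7600, -73.9700)}
    print(nearest_stop((40.7510, -73.9890), stops))
```

Straight-line distance ignores the street network, so it is best read as a lower bound on walking distance.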
| 99 | + |
| 100 | +## 📌 **Pipelines Overview** |
| 101 | +The project includes **several geospatial data processing pipelines**, located in `src/pipelines/`: |
| 102 | + |
| 103 | +- 🚌 **`bus_stop_analysis.py`**: Analyzes bus stops and their spatial distribution. |
| 104 | +- 📍 **`optimal_stop_pipeline.py`**: Computes the best locations for public transportation stops. |
| 105 | +- 🗺 **`shapefile_to_raster.py`**: Converts vector-based shapefiles into raster format for GIS applications. |
| 106 | + |
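The vector-to-raster step can be illustrated without any GIS library: a cell is marked when its center falls inside the polygon. A minimal sketch, assuming a single polygon without holes and a north-up grid; the function names are hypothetical, and `shapefile_to_raster.py` itself may rely on a dedicated library instead:

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon.

    polygon is a list of (x, y) vertices; the ring is closed implicitly.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray from (x, y) towards +x cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize(polygon, x_min, y_max, cell_size, n_cols, n_rows):
    """Burn a polygon into a grid of 0/1 values, row 0 at the top (north)."""
    grid = []
    for row in range(n_rows):
        cy = y_max - (row + 0.5) * cell_size  # y of the cell centers in this row
        grid.append([
            1 if point_in_polygon(x_min + (col + 0.5) * cell_size, cy, polygon) else 0
            for col in range(n_cols)
        ])
    return grid


if __name__ == "__main__":
    square = [(1, 1), (3, 1), (3, 3), (1, 3)]
    for row in rasterize(square, x_min=0, y_max=4, cell_size=1, n_cols=4, n_rows=4):
        print(row)
```

Testing cell centers keeps the sketch short; production rasterizers typically offer other rules as well, such as marking every cell the polygon touches.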
| 107 | +### ⚙️ **Running Pipelines** |
| 108 | +To execute a pipeline, use the following command: |
| 109 | + |
| 110 | +```sh |
| 111 | +PYTHONPATH=. python -m src.pipelines.bus_stop_analysis |
| 112 | +``` |
| 113 | + |
| 114 | +Replace `bus_stop_analysis` with the pipeline you want to run. |
| 115 | + |
| 116 | +Each pipeline prepares its geospatial data for **urban planning analysis and visualization**. |
| 117 | + |
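Because every pipeline is invoked as a module under `src.pipelines`, a small dispatcher can select one by name. This is a sketch only: `pipeline_module` and `run_pipeline` are hypothetical helpers, not part of the repository, and it assumes each pipeline module does its work when executed as `__main__`:

```python
import re
import runpy

PIPELINE_PACKAGE = "src.pipelines"  # package named in the command above


def pipeline_module(name):
    """Map a pipeline name to its dotted module path, rejecting odd input."""
    if not re.fullmatch(r"[a-z_][a-z0-9_]*", name):
        raise ValueError(f"invalid pipeline name: {name!r}")
    return f"{PIPELINE_PACKAGE}.{name}"


def run_pipeline(name):
    """Run a pipeline much as `python -m src.pipelines.<name>` would."""
    # run_name="__main__" triggers the module's `if __name__ == "__main__"` block.
    return runpy.run_module(pipeline_module(name), run_name="__main__", alter_sys=True)
```

Called from the project root, `run_pipeline("optimal_stop_pipeline")` would then execute that module in the current interpreter.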
| 118 | +--- |
| 119 | + |
| 120 | +## 🏗 **Project Structure** |
| 121 | + |
| 122 | +```text |
| 123 | +. |
| 124 | +├── Dockerfile # 🐳 Docker configuration for Python environment |
| 125 | +├── docker-compose.yml # 🛢 PostgreSQL + PostGIS setup |
| 126 | +├── requirements.txt # 📦 Python dependencies |
| 127 | +├── config.py # ⚙️ Configuration settings |
| 128 | +├── data/ # 🌍 Raw geospatial datasets |
| 129 | +├── notebooks/ # 📖 Jupyter Notebooks for geospatial analysis |
| 130 | +├── scripts/ # 🔄 Data processing scripts |
| 131 | +└── src/ # 🏗 Source code |
| 132 | +    ├── database/ # 🗄 Database connection and queries |
| 133 | +    ├── etl/ # 🔄 ETL pipeline for spatial data |
| 134 | +    ├── ml/ # 🤖 Machine learning models for clustering |
| 135 | +    ├── pipelines/ # 📌 Spatial data processing workflows |
| 136 | +    └── visualization/ # 🗺 Map and data visualization modules |
| 137 | +``` |
| 138 | + |
| 139 | +--- |