A Python package that automates the exploratory spatial data analysis (ESDA) process by summarising the results into an HTML report.
- Introduction
- Key features
- Installation
- Dependencies
- Usage
- Examples
- Contributing
- License
- References
- Credits
Exploratory spatial data analysis (ESDA) is a term for the various functions used to gain a surface-level understanding of a spatial dataset. Currently the ESDA process is repetitive, as each of these functions needs to be calculated individually. This makes the process time-consuming and leaves a large margin for human error. Additionally, results are seldom presented side by side, which makes them hard to compare and hard to share with people who may not have the technical skills to run the analysis themselves.
autoesda addresses this by letting the user run a single line of code to generate an information-rich HTML report that can easily be shared with others.
- HTML output report
- Extent map
- Dataset overview (coordinate system, number of rows/columns, which rows/columns have been included/excluded in the report)
- Descriptive statistics (count, mean, standard deviation, minimum/maximum, 25th/50th/75th percentiles)
- Sample of dataset
- Boxplot
- Histogram
- Moran's I simulation (Moran's I, number of features, p-value, z-score, number of permutations)
- Local Indicators of Spatial Association (local scatterplot, LISA cluster map)
- Choropleth maps (quantiles, equal intervals, natural breaks, and percentiles classification schemes)
- Correlation (correlation matrix/heatmap, pairwise plot)
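To make the Moran's I entry in the list above more concrete, here is a minimal standard-library sketch of a global Moran's I with a permutation test. It is not autoesda's implementation: the function names `morans_i` and `permutation_test`, the dense list-of-lists weights matrix, and the six-cell toy dataset are all assumptions made for illustration.

```python
import random
import statistics


def morans_i(values, weights):
    """Global Moran's I for a list of values and a dense n x n weights matrix."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]                 # deviations from the mean
    s0 = sum(sum(row) for row in weights)          # sum of all spatial weights
    cross = sum(weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    denom = sum(d * d for d in z)
    return (n / s0) * cross / denom


def permutation_test(values, weights, permutations=999, seed=0):
    """Compare the observed Moran's I with values computed on shuffled data."""
    rng = random.Random(seed)
    observed = morans_i(values, weights)
    shuffled = list(values)
    simulated = []
    for _ in range(permutations):
        rng.shuffle(shuffled)                      # break any spatial structure
        simulated.append(morans_i(shuffled, weights))
    # Pseudo p-value: share of simulations at least as extreme as the observation.
    extreme = sum(1 for s in simulated if s >= observed)
    extreme = min(extreme, permutations - extreme)
    p_value = (extreme + 1) / (permutations + 1)
    spread = statistics.pstdev(simulated)
    z_score = (observed - statistics.mean(simulated)) / spread if spread else float("nan")
    return {
        "morans_i": observed,
        "features": len(values),
        "p_value": p_value,
        "z_score": z_score,
        "permutations": permutations,
    }


# Toy example (hypothetical data): six cells in a row, each neighbouring the next.
values = [1, 2, 3, 4, 5, 6]
weights = [[1 if abs(i - j) == 1 else 0 for j in range(6)] for i in range(6)]
result = permutation_test(values, weights)
print(result)
```

For this smoothly increasing toy series, Moran's I works out to approximately 0.6, which indicates positive spatial autocorrelation; an alternating series such as `[1, 0, 1, 0, 1, 0]` on the same weights gives a negative value.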
autoesda is available on PyPI. To install it, run this command in your terminal:
pip install autoesda
geopandas is a primary dependency of autoesda, and there are known challenges associated with installing geopandas using pip. The recommended approach is therefore to use autoesda in a conda environment.
For advanced users, you can follow this documentation which will guide you through the geopandas installation by downloading the unofficial binary files of some of the geopandas dependencies.
autoesda is also available on conda-forge. If you have Anaconda or Miniconda installed on your computer, you can use this command in your Anaconda/Miniconda prompt:
conda install autoesda
To start, import both geopandas and autoesda.
import geopandas as gpd
import autoesda
Once both libraries have been successfully imported, you can load your dataset as a GeoDataFrame using geopandas. To read more about compatible file types, see the geopandas documentation. In this example, a shapefile is imported.
gdf = gpd.read_file(r'example-file-path\example-shapefile.shp')
Once your data is stored in a GeoDataFrame, you can generate the report.
autoesda.generate_report(gdf)
The report will be saved to your current working directory.
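The steps above can be combined into one small helper. This is only a sketch: `report_from_file` is a hypothetical name that is not part of autoesda, and the early path check and the empty-dataset check are assumptions about what is useful to validate before generating a report. Only `gpd.read_file` and `autoesda.generate_report` come from the usage shown above.

```python
from pathlib import Path


def report_from_file(path):
    """Read a vector file into a GeoDataFrame and pass it to autoesda.

    Hypothetical convenience wrapper, not part of the autoesda API.
    """
    path = Path(path)
    # Fail early with a clear message before any file reading is attempted.
    if not path.is_file():
        raise FileNotFoundError(f"No spatial dataset found at: {path}")

    # Imported here so the path check above runs even if the libraries
    # are not installed in the current environment.
    import geopandas as gpd
    import autoesda

    gdf = gpd.read_file(path)
    if gdf.empty:
        raise ValueError(f"The dataset at {path} contains no features.")

    # The report is written by autoesda itself, as described above.
    autoesda.generate_report(gdf)
    return gdf
```

Calling `report_from_file(r'example-file-path\example-shapefile.shp')` would then read the shapefile and generate the report in one step, and it returns the GeoDataFrame so it can be inspected afterwards.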
Vector Reports | Raster Reports |
---|---|
Old COJ Demographic Data | Global Terrestrial Precipitation: Band 1, Band 2, Band 3, Band 4, Stacked |
Airbnb Chicago 2015 | EU NOx Concentration: Band 1, Band 2, Band 3, Band 4, Stacked |
Grid 100 | South African Population: Band 1, Band 2, Band 3, Band 4 |
South African 2011 Census | |
Natural Earth Country Boundaries | |
Malaria in Colombia | |
USA Election Results | |
Click here to report bugs
Click here to request a new feature
If you would like to assist with fixing bugs, further development, or writing documentation, you are most welcome to do so. Use the issues page to guide what you can assist with.
To make a contribution, you will need to:
- Fork the autoesda repository on GitHub.
- Clone your fork locally.
- Commit your changes and push them to a branch on your fork.
- Once you are satisfied that your work is suitable, submit a pull request through the GitHub website.
This software is available under the BSD-3-Clause license.
For more information, see the LICENSE file which contains details on the history of this software, terms & conditions for usage, and a disclaimer of all warranties.
When citing this library, please reference the following:
de Kock, N., Rautenbach, V., and Fabris-Rotelli, I.: TOWARDS AN OPEN SOURCE PYTHON LIBRARY FOR AUTOMATED EXPLORATORY SPATIAL DATA ANALYSIS, Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci., XLIII-B4-2022, 91–98, https://doi.org/10.5194/isprs-archives-XLIII-B4-2022-91-2022, 2022.
This package was created with Cookiecutter and the giswqs/pypackage project template.