Commit 3e091ff

Author: Atma Mani
refined spatially enabled dataframe topic
1 parent 88f5cd1 commit 3e091ff

1 file changed

+75
-33
lines changed

guide/05-working-with-the-spatial-dataframe/introduction-to-the-spatial-dataframe.ipynb

Lines changed: 75 additions & 33 deletions
Original file line number | Diff line number | Diff line change
@@ -4,11 +4,22 @@
44
"cell_type": "markdown",
55
"metadata": {},
66
"source": [
7-
"# Introduction to the Spatial Enabled DataFrame\n",
7+
"# Introduction to the Spatially Enabled DataFrame\n",
88
"\n",
9-
"The [`Spatially Enabled Dataframe`](https://esri.github.io/arcgis-python-api/apidoc/html/arcgis.features.toc.html#spatialdataframe) (SEDF) creates a simple, intutive object that can easily manipulate geometric and attribute data. The Spatially Enabled DataFrame is a custom namespace that is inserted into the popular [Pandas](https://pandas.pydata.org/) [DataFrame](http://pandas.pydata.org/pandas-docs/stable/dsintro.html#dataframe) structure with spatial abilities, allowing you to use intutive, pandorable operations on both the attribute and spatial columns. Thus the SEDF is based on data structures inherently suited to data analysis, with natural operations for the filtering and inspecting of subsets of values which are fundamental to statistical and geographic manipulations.\n",
9+
"The [`Spatially Enabled DataFrame`](https://esri.github.io/arcgis-python-api/apidoc/html/arcgis.features.toc.html#spatialdataframe) (SEDF) creates a simple, intuitive object that can easily manipulate geometric and attribute data.\n",
1010
"\n",
11-
"The dataframe reads from many **sources**, including shapefiles, [Pandas](https://pandas.pydata.org/) [DataFrames](http://pandas.pydata.org/pandas-docs/stable/dsintro.html#dataframe), feature classes, GeoJSON, and Feature Layers.\n",
11+
"<blockquote>\n",
12+
"  New at version 1.5, the Spatially Enabled DataFrame is an evolution of the <code>SpatialDataFrame</code> object that you may be familiar with. While the <code>SDF</code> object is still available for use, the team has stopped active development of it and is promoting the use of this new Spatially Enabled DataFrame pattern. The SEDF provides better memory management and the ability to handle larger datasets, and it is the pattern that Pandas advocates as the path forward.</blockquote>\n",
13+
"\n",
14+
"The Spatially Enabled DataFrame inserts a custom namespace called `spatial` into the popular [Pandas](https://pandas.pydata.org/) [DataFrame](http://pandas.pydata.org/pandas-docs/stable/dsintro.html#dataframe) structure to give it spatial abilities. This allows you to use intuitive, pandorable operations on both the attribute and spatial columns. Thus, the SEDF is based on data structures inherently suited to data analysis, with natural operations for the filtering and inspecting of subsets of values which are fundamental to statistical and geographic manipulations.\n",
15+
"\n",
16+
"The dataframe reads from many **sources**, including shapefiles, [Pandas](https://pandas.pydata.org/) [DataFrames](http://pandas.pydata.org/pandas-docs/stable/dsintro.html#dataframe), feature classes, GeoJSON, and Feature Layers."
17+
]
18+
},
19+
{
20+
"cell_type": "markdown",
21+
"metadata": {},
22+
"source": [
1223
"\n",
1324
"This document outlines some fundamentals of using the `Spatially Enabled DataFrame` object for working with GIS data.\n",
1425
"\n",
@@ -43,22 +54,25 @@
4354
"## Accessing GIS data\n",
4455
"GIS users need to work with both published layers on remote servers (web layers) and local data, but the ability to manipulate these datasets without permanently copying the data is lacking. The `Spatially Enabled DataFrame` solves this problem because it is an in-memory object that can read, write and manipulate geospatial data.\n",
4556
"\n",
46-
"The SEDF integrates with Esri's [`ArcPy site-package`](http://pro.arcgis.com/en/pro-app/arcpy/get-started/what-is-arcpy-.htm) as well as the open source [`pyshp`](https://github.com/GeospatialPython/pyshp/), [`shapely`](https://github.com/Toblerity/Shapely) and [`fiona`](https://github.com/Toblerity/Fiona) packages. This means the ArcGIS API for Python SEDF can use either of these geometry engines to provide you options for easily working with geospatial data regardless of your platform. The SEDF transforms data into the formats you desire so you can use Python functionality to analyze and visualize geographic information.\n",
47-
"\n",
48-
"Data can be read and scripted to automate workflows and just as easily visualized on maps in [`Jupyter notebooks`](../using-the-jupyter-notebook-environment/). The SEDF can export data as feature classes or publish them directly to servers for sharing according to your needs.\n",
49-
"\n",
50-
"Let's explore some of the different options available with the versatile `Spatial Enabled DataFrame` namespaces:\n",
57+
"The SEDF integrates with Esri's [`ArcPy` site-package](http://pro.arcgis.com/en/pro-app/arcpy/get-started/what-is-arcpy-.htm) as well as the open source [`pyshp`](https://github.com/GeospatialPython/pyshp/), [`shapely`](https://github.com/Toblerity/Shapely) and [`fiona`](https://github.com/Toblerity/Fiona) packages. This means the ArcGIS API for Python SEDF can use any of these geometry engines to provide you options for easily working with geospatial data regardless of your platform. The SEDF transforms data into the formats you desire so you can use Python functionality to analyze and visualize geographic information.\n",
5158
"\n",
59+
"Data can be read and scripted to automate workflows and just as easily visualized on maps in [`Jupyter notebooks`](../using-the-jupyter-notebook-environment/). The SEDF can export data as feature classes or publish them directly to servers for sharing according to your needs. Let's explore some of the different options available with the versatile `Spatially Enabled DataFrame` namespaces:"
60+
]
61+
},
62+
{
63+
"cell_type": "markdown",
64+
"metadata": {},
65+
"source": [
5266
"### Reading Web Layers\n",
5367
"\n",
54-
"[`Feature layers`](https://doc.arcgis.com/en/arcgis-online/share-maps/hosted-web-layers.htm) hosted on [**ArcGIS Online**](https://www.arcgis.com) or [**ArcGIS Enterprise**](http://enterprise.arcgis.com/en/) can be easily read into a Spatial DataFrame using the [`from_layer`](https://esri.github.io/arcgis-python-api/apidoc/html/arcgis.features.toc.html#arcgis.features.SpatialDataFrame.from_layer) method. Once you read it into a SEDF object, you can create reports, manipulate the data, or convert it to a form that is comfortable and makes sense for its intended purpose.\n",
68+
"[`Feature layers`](https://doc.arcgis.com/en/arcgis-online/share-maps/hosted-web-layers.htm) hosted on [**ArcGIS Online**](https://www.arcgis.com) or [**ArcGIS Enterprise**](http://enterprise.arcgis.com/en/) can be easily read into a Spatially Enabled DataFrame using the [`from_layer`](https://esri.github.io/arcgis-python-api/apidoc/html/arcgis.features.toc.html?highlight=from_layer#arcgis.features.GeoAccessor.from_layer) method. Once you read it into a SEDF object, you can create reports, manipulate the data, or convert it to a form that is comfortable and makes sense for its intended purpose.\n",
5569
"\n",
5670
"**Example: Retrieving an ArcGIS Online [`item`](https://developers.arcgis.com/rest/users-groups-and-items/publish-item.htm) and using the [`layers`](https://esri.github.io/arcgis-python-api/apidoc/html/arcgis.gis.toc.html#layer) property to inspect the first 5 records of the layer**"
5771
]
5872
},
5973
{
6074
"cell_type": "code",
61-
"execution_count": 2,
75+
"execution_count": 6,
6276
"metadata": {},
6377
"outputs": [
6478
{
@@ -263,7 +277,7 @@
263277
"[5 rows x 51 columns]"
264278
]
265279
},
266-
"execution_count": 2,
280+
"execution_count": 6,
267281
"metadata": {},
268282
"output_type": "execute_result"
269283
}
@@ -273,18 +287,53 @@
273287
"gis = GIS()\n",
274288
"item = gis.content.get(\"85d0ca4ea1ca4b9abf0c51b9bd34de2e\")\n",
275289
"flayer = item.layers[0]\n",
290+
"\n",
291+
"# create a Spatially Enabled DataFrame object\n",
276292
"sdf = pd.DataFrame.spatial.from_layer(flayer)\n",
277293
"sdf.head()"
278294
]
279295
},
296+
{
297+
"cell_type": "markdown",
298+
"metadata": {},
299+
"source": [
300+
"When you inspect the `type` of the object, you get back a standard pandas `DataFrame` object. However, this object now has an additional `SHAPE` column that allows you to perform geometric operations. In other words, this `DataFrame` is now geo-aware."
301+
]
302+
},
303+
{
304+
"cell_type": "code",
305+
"execution_count": 7,
306+
"metadata": {},
307+
"outputs": [
308+
{
309+
"data": {
310+
"text/plain": [
311+
"pandas.core.frame.DataFrame"
312+
]
313+
},
314+
"execution_count": 7,
315+
"metadata": {},
316+
"output_type": "execute_result"
317+
}
318+
],
319+
"source": [
320+
"type(sdf)"
321+
]
322+
},
323+
{
324+
"cell_type": "markdown",
325+
"metadata": {},
326+
"source": [
327+
"Further, the `DataFrame` has a new `spatial` property that provides a list of geoprocessing operations that can be performed on the object. The rest of the guides in this section go into details of how to use these functionalities. So, sit tight."
328+
]
329+
},
280330
{
281331
"cell_type": "markdown",
282332
"metadata": {},
283333
"source": [
284334
"### Reading Feature Layer Data\n",
285335
"\n",
286-
"As seen above, the SEDF can consume a `Feature Layer` service accessible on the ArcGIS Online platform. Let's take a step-by-step approach to break down the notebook cell above and then extract a subset of records from the feature layer.\n",
287-
"\n",
336+
"As seen above, the SEDF can consume a `Feature Layer` served from either ArcGIS Online or ArcGIS Enterprise orgs. Let's take a step-by-step approach to break down the notebook cell above and then extract a subset of records from the feature layer.\n",
288337
"\n",
289338
"#### Example: Examining Feature Layer content"
290339
]
@@ -298,7 +347,7 @@
298347
},
299348
{
300349
"cell_type": "code",
301-
"execution_count": 3,
350+
"execution_count": 9,
302351
"metadata": {},
303352
"outputs": [
304353
{
@@ -316,7 +365,7 @@
316365
" </a>\n",
317366
" <br/>This layer presents the locations of cities within the United States with populations of approximately 10,000 or greater, all state capitals, and the national capital.<img src='https://www.arcgis.com/home/js/jsapi/esri/css/images/item_type_icons/featureshosted16.png' style=\"vertical-align:middle;\">Feature Layer Collection by esri_dm\n",
318367
" <br/>Last Modified: December 21, 2017\n",
319-
" <br/>3 comments, 291,531 views\n",
368+
" <br/>3 comments, 331,873 views\n",
320369
" </div>\n",
321370
" </div>\n",
322371
" "
@@ -325,7 +374,7 @@
325374
"<Item title:\"USA Major Cities\" type:Feature Layer Collection owner:esri_dm>"
326375
]
327376
},
328-
"execution_count": 3,
377+
"execution_count": 9,
329378
"metadata": {},
330379
"output_type": "execute_result"
331380
}
@@ -338,7 +387,7 @@
338387
},
339388
{
340389
"cell_type": "code",
341-
"execution_count": 4,
390+
"execution_count": 10,
342391
"metadata": {},
343392
"outputs": [
344393
{
@@ -543,7 +592,7 @@
543592
"[5 rows x 51 columns]"
544593
]
545594
},
546-
"execution_count": 4,
595+
"execution_count": 10,
547596
"metadata": {},
548597
"output_type": "execute_result"
549598
}
@@ -570,15 +619,14 @@
570619
"cell_type": "markdown",
571620
"metadata": {},
572621
"source": [
573-
"You can also use sql queries to return a subset of records by leveraging the ArcGIS API for Python's [`Feature Layer`](https://esri.github.io/arcgis-python-api/apidoc/html/arcgis.features.toc.html#featurelayer) object itself. Instantiate a Pandas `DataFrame` directly from the [`FeatureLayer.query()`](https://esri.github.io/arcgis-python-api/apidoc/html/arcgis.features.toc.html#arcgis.features.FeatureLayer.query) method and use the data frame's [`head()`](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.core.groupby.GroupBy.head.html#pandas.core.groupby.GroupBy.head) method to return the first 5 records and a subset of columns from the DataFrame:\n",
574-
"\n",
575-
"#### Example: Feature Layer Query Results to a Spatial DataFrame"
622+
"You can also use SQL queries to return a subset of records by leveraging the ArcGIS API for Python's [`Feature Layer`](https://esri.github.io/arcgis-python-api/apidoc/html/arcgis.features.toc.html#featurelayer) object itself. When you run a [`query()`](https://esri.github.io/arcgis-python-api/apidoc/html/arcgis.features.toc.html#arcgis.features.FeatureLayer.query) on a `FeatureLayer`, you get back a `FeatureSet` object. Calling the `sdf` property of the `FeatureSet` returns a Spatially Enabled DataFrame object. We then use the data frame's [`head()`](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.core.groupby.GroupBy.head.html#pandas.core.groupby.GroupBy.head) method to return the first 5 records and a subset of columns from the DataFrame:"
576623
]
577624
},
578625
{
579626
"cell_type": "markdown",
580627
"metadata": {},
581628
"source": [
629+
"#### Example: Feature Layer Query Results to a Spatially Enabled DataFrame\n",
582630
"We'll use the `AGE_45_54` column to query the dataframe and return a new `DataFrame` with a subset of records. We can use the built-in [`zip()`](https://docs.python.org/3/library/functions.html#zip) function to print the data frame attribute field names, and then use data frame syntax to view specific attribute fields in the output:"
583631
]
584632
},
@@ -716,12 +764,12 @@
716764
"\n",
717765
"The SEDF can also access local geospatial data. Depending upon what Python modules you have installed, you'll have access to a wide range of functionality: \n",
718766
"\n",
719-
"* If the **`ArcPy`** module is installed, meaning you have installed [`ArcGIS Pro`](http://pro.arcgis.com/en/pro-app/) and have installed the ArcGIS API for Python in that same environment, the `SpatialDataFrame` has methods to read a subset of the ArcGIS Desktop [supported geographic formats](http://desktop.arcgis.com/en/arcmap/10.3/manage-data/datatypes/about-geographic-data-formats.htm#ESRI_SECTION1_4835793C55C0439593A46FD5BC9E64B9), most notably:\n",
767+
"* If the **`ArcPy`** module is installed, meaning you have installed [`ArcGIS Pro`](http://pro.arcgis.com/en/pro-app/) and have installed the ArcGIS API for Python in that same environment, the `DataFrame` then has methods to read a subset of the ArcGIS Desktop [supported geographic formats](http://desktop.arcgis.com/en/arcmap/10.3/manage-data/datatypes/about-geographic-data-formats.htm#ESRI_SECTION1_4835793C55C0439593A46FD5BC9E64B9), most notably:\n",
720768
" * [`feature classes`](http://desktop.arcgis.com/en/arcmap/latest/manage-data/feature-classes/a-quick-tour-of-feature-classes.htm)\n",
721769
" * [`shapefiles`](http://desktop.arcgis.com/en/arcmap/latest/manage-data/shapefiles/what-is-a-shapefile.htm), \n",
722770
" * [`ArcGIS Server Web Services`](https://enterprise.arcgis.com/en/server/latest/publish-services/windows/what-types-of-services-can-you-publish.htm) and [`ArcGIS Online Hosted Feature Layers`](https://doc.arcgis.com/en/arcgis-online/share-maps/publish-features.htm) \n",
723771
" * [`OGC Services`](http://www.opengeospatial.org/standards) \n",
724-
"* If the **ArcPy** module is not installed, the SEDF [`from_featureclass`](https://esri.github.io/arcgis-python-api/apidoc/html/arcgis.features.toc.html#arcgis.features.SpatialDataFrame.from_featureclass) method only supports consuming an Esri [`shapefile`](http://desktop.arcgis.com/en/arcmap/latest/manage-data/shapefiles/what-is-a-shapefile.htm)\n",
772+
"* If the **ArcPy** module is not installed, the SEDF [`from_featureclass`](https://esri.github.io/arcgis-python-api/apidoc/html/arcgis.features.toc.html?arcgis.features.GeoAccessor.from_featureclass#arcgis.features.GeoAccessor.from_featureclass) method only supports consuming an Esri [`shapefile`](http://desktop.arcgis.com/en/arcmap/latest/manage-data/shapefiles/what-is-a-shapefile.htm)\n",
725773
"> Please note that you must install the `pyshp` package to read shapefiles in environments that don't have access to `ArcPy`.\n",
726774
" \n",
727775
"### Example: Reading a Shapefile\n",
@@ -882,9 +930,7 @@
882930
}
883931
],
884932
"source": [
885-
"sdf = (pd.DataFrame\n",
886-
" .spatial\n",
887-
" .from_featureclass(\"path\\to\\your\\data\\census_example\\cities.shp\"))\n",
933+
"sdf = pd.DataFrame.spatial.from_featureclass(r\"path\\to\\your\\data\\census_example\\cities.shp\")\n",
888934
"sdf.tail()"
889935
]
890936
},
@@ -931,9 +977,7 @@
931977
}
932978
],
933979
"source": [
934-
"(sdf\n",
935-
" .spatial\n",
936-
" .to_featureclass(location=r\"c:\\output_examples\\census.shp\"))"
980+
"sdf.spatial.to_featureclass(location=r\"c:\\output_examples\\census.shp\")"
937981
]
938982
},
939983
{
@@ -987,9 +1031,7 @@
9871031
],
9881032
"source": [
9891033
"columns = ['NAME', 'ST', 'CAPITAL', 'STFIPS', 'POP2000', 'POP2007', 'SHAPE']\n",
990-
"(sdf[columns].head()\n",
991-
" .spatial\n",
992-
" .to_featureclass(location=r\"/path/to/your/data/directory/sdf_head_output.shp\"))"
1034+
"sdf[columns].head().spatial.to_featureclass(location=r\"/path/to/your/data/directory/sdf_head_output.shp\")"
9931035
]
9941036
}
9951037
],
@@ -1009,7 +1051,7 @@
10091051
"name": "python",
10101052
"nbconvert_exporter": "python",
10111053
"pygments_lexer": "ipython3",
1012-
"version": "3.6.5"
1054+
"version": "3.6.6"
10131055
}
10141056
},
10151057
"nbformat": 4,

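Note on the `spatial` namespace described in the revised introduction: the idea of inserting a custom namespace into a pandas `DataFrame` can be sketched with pandas' public accessor-registration API. The following is only an illustration under stated assumptions, not the ArcGIS implementation: the accessor name `toy_spatial`, the class `ToySpatialAccessor`, its `bounds` property, and the sample points are all hypothetical, and the `SHAPE` column holds plain `(x, y)` tuples rather than ArcGIS geometry objects.

```python
import pandas as pd


# Hypothetical accessor name; the SEDF described in this commit uses "spatial".
@pd.api.extensions.register_dataframe_accessor("toy_spatial")
class ToySpatialAccessor:
    """Illustrative stand-in for a spatial namespace on a DataFrame."""

    def __init__(self, pandas_obj):
        # Refuse to attach to frames that have no geometry column.
        if "SHAPE" not in pandas_obj.columns:
            raise AttributeError("expected a 'SHAPE' column")
        self._df = pandas_obj

    @property
    def bounds(self):
        # Compute (xmin, ymin, xmax, ymax) over the (x, y) tuples in SHAPE.
        xs = [pt[0] for pt in self._df["SHAPE"]]
        ys = [pt[1] for pt in self._df["SHAPE"]]
        return (min(xs), min(ys), max(xs), max(ys))


# Made-up sample data: attribute column plus a geometry-like column.
cities = pd.DataFrame({
    "NAME": ["a", "b", "c"],
    "SHAPE": [(0.0, 1.0), (2.0, 5.0), (-3.0, 4.0)],
})

# The object is still an ordinary DataFrame ...
print(type(cities))
# ... but it now exposes extra behaviour under the registered namespace.
print(cities.toy_spatial.bounds)
```

As with `type(sdf)` in the notebook, the frame remains a standard `pandas.core.frame.DataFrame`; the extra behaviour lives entirely under the registered namespace.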