forked from martinfleis/sds
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathexercise.qmd
77 lines (55 loc) · 3.17 KB
/
exercise.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
---
title: "Learning GeoPandas"
format: html
aliases:
- ../chapter_03/exercise.html
---
This section is about playing with `geopandas` by yourself. You will zoom at Prague and
the distribution of land price, provided by the Institute of Planning and Development.
## Data Preparation
You will load two datasets. One with the price data and the other with the boundaries of
municipal districts. Below are the links to both. Your first task is to figure out how
to load them as two `GeoDataFrames`, one called `price` and the other called `districts`.
- [The link to the price dataset](https://martinfleischmann.net/sds/geographic_data/data/SED_CenovaMapa_p_shp.zip)
- [The link to the districts dataset](https://martinfleischmann.net/sds/geographic_data/data/MAP_MESTSKECASTI_P_shp.zip)
::: {.callout-tip collapse="true"}
You have two options: either download and read from disk or pass the URL of the actual file to `geopandas.read_file`.
:::
Once you have your `price` dataset loaded, there is one cleaning step you need to do. The column with the price called
`"CENA"` is encoded as strings, not numbers. You need to replace the `"N"` string with `None` and convert the column
to `float` numbers.
```py
price["CENA"] = price["CENA"].replace("N", None).astype('float')
```
The rest is up to you!
## Map Making
Create a map of price distribution with the overlay of the district boundaries. Start with
static or interactive, and try replicating the other once you're done if the time
permits. A few requirements:
- Plot the price.
- Plot boundaries on top. Try different colours to get a nice combination of colours.
- Show a legend.
- Use CartoDB Voyager or CartoDB Dark Matter basemap.
- Can you figure out how to change the colormap?
- Can you change the transparency of polygons?
::: {.callout-tip collapse="true"}
- Colormap is controlled via the `cmap=` keyword.
- To plot a boundary, it may be easier to work with LineStrings than Polygons.
- Check the additional reading materials linked in the [previous section](hands_on.qmd) to figure out how to add basemap to static plots.
- Transparency is different in static and interactive maps. Check the [documentation](https://geopandas.org/en/stable/docs/reference/api/geopandas.GeoDataFrame.explore.html#geopandas.GeoDataFrame.explore)!
:::
## Measuring
Practice using geometric methods and properties `GeoDataFrame` offers.
- Create a convex hull around each polygon in `price`.
- Calculate the area of these convex hulls.
- Find the convex hulls with an area between 40th a 60th percentile in the GeoDataFrame. Create a new object (e.g. `average`) only with them.
- Create a multi-layer map of Prague where the smallest areas are coloured in one colour, and the rest appear in another. Try making it legible enough.
## Joining
Join the two `GeoDataFrame` using `.sjoin()` or `.overlay()` methods to get data per district.
- Is the mean price higher in Praha 3 or Praha 6?
- Which district is the cheapest?
- What is the difference between the cheapest and the most expensive one?
::: {.callout-tip collapse="true"}
You may need to use `.groupby()` after joining.
For those of you who don't speak Czech, the district names are encoded in the `"NAZEV_1"` column.
:::