Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add CSC raster lesson from notebooks material #121

Open
wants to merge 1 commit into
base: csc_2022-03
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 3 additions & 2 deletions source/lessons/Raster/overview.rst
Original file line number Diff line number Diff line change
@@ -1,6 +1,9 @@
Overview
========

For the Python GIS course at CSC in March 2022, most of below lessons have been combined to a 'usecase' lesson about finding out how the trees of Seurasaari were doing in summer 2021, which can be found `here <../../notebooks/Raster/Raster_CSC/Seurasaari_trees.ipynb>`__


In this tutorial, you will learn how to conduct many basic raster processing operations using ``rasterio`` module.

1. :doc:`Downloading data automatically with Python <download-data>`
Expand All @@ -12,5 +15,3 @@ In this tutorial, you will learn how to conduct many basic raster processing ope
7. `Zonal statistics <../../notebooks/Raster/zonal-statistics.ipynb>`__
8. `Reading cloud optimized GeoTIFFs <../../notebooks/Raster/read-cogs.ipynb>`__



1,999 changes: 1,999 additions & 0 deletions source/notebooks/Raster/Raster_CSC/Seurasaari_trees.ipynb

Large diffs are not rendered by default.

Binary file not shown.

Large diffs are not rendered by default.

29 changes: 29 additions & 0 deletions source/notebooks/Raster/Raster_CSC/cyu1_solution.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,29 @@
# Test your understanding:

# Something seems off about the histogram, can you find out what and fix it? (Hint: check the shape of stack)

# check that we have the data in correct format for show_hist (bands, height, width)

print("first fix")

print(falsecolor_stack.shape)
stack_for_hist = np.transpose(falsecolor_stack,axes=[2,0,1])
print(stack_for_hist.shape)

# and plot the histogram

show_hist(stack_for_hist, bins=50, lw=0, stacked=False, alpha=0.9,histtype="stepfilled")

# The basic output of the histogram function is not very nice and even wrong in one place, can you do better? (Hint: show_hist also takes matplotlib axes (`ax=`) and title `title=` as optional parameters)

# add labels (we know that nir is first, red, is second and green is third band in our stack) using label
# add descriptive title
# adjust x-axis label by using matplotlibs axis to reflect what is actually shown in our case

print("second fix")

fig, ax = plt.subplots()
show_hist(stack_for_hist, bins=50, lw=0, stacked=False, alpha=0.9,histtype="stepfilled", label = ["nir","red","green"], title = "Histogram Sentinel-2 false color stack", ax = ax)
ax.set_xlabel("Reflectance")

# in this case the histogram could also be plotted using only matplotlib.pyplot
18 changes: 18 additions & 0 deletions source/notebooks/Raster/Raster_CSC/cyu2_solution.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
#Test your understanding:

#* In above plot we do not have Coordinates at the plot, why?

print(type(ndvi))

# -> array, does not have any coordinates attached

# How could we get them?

# -> plot the newly saved file

with rasterio.open("./data/S2_NDVI_Seurasaari.tif") as src:
show(src, title="Read from saved NDVI geotiff")

# We could also pass the transform with the array to 'show'

show(ndvi, transform=seurasaariS2.transform, title="NDVI array with transform")
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
ISO-8859-1
Binary file not shown.
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
PROJCS["EUREF_FIN_TM35FIN",GEOGCS["GCS_ETRS_1989",DATUM["D_ETRS_1989",SPHEROID["GRS_1980",6378137.0,298.257222101]],PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]],PROJECTION["Transverse_Mercator"],PARAMETER["False_Easting",500000.0],PARAMETER["False_Northing",0.0],PARAMETER["Central_Meridian",27.0],PARAMETER["Scale_Factor",0.9996],PARAMETER["Latitude_Of_Origin",0.0],UNIT["Meter",1.0]]
Binary file not shown.
Binary file not shown.
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
ISO-8859-1
Binary file not shown.
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
PROJCS["WGS_1984_UTM_Zone_35N",GEOGCS["GCS_WGS_1984",DATUM["D_WGS_1984",SPHEROID["WGS_1984",6378137.0,298.257223563]],PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]],PROJECTION["Transverse_Mercator"],PARAMETER["False_Easting",500000.0],PARAMETER["False_Northing",0.0],PARAMETER["Central_Meridian",27.0],PARAMETER["Scale_Factor",0.9996],PARAMETER["Latitude_Of_Origin",0.0],UNIT["Meter",1.0]]
Binary file not shown.
Binary file not shown.
Binary file added source/notebooks/Raster/Raster_CSC/img/ndvi.jpg
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
90 changes: 90 additions & 0 deletions source/notebooks/Raster/Raster_CSC/prepare_data.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,90 @@
#!/usr/bin/env python3

"""
Script for the preparation of raster lesson Sentinel 2 data
-- not part of the course, data prepared with this is provided: https://a3s.fi/gis-courses/pythongis_2022/S2B_RGBNIR_20210926_Helsinki.tif ---
"""


import os
import rasterio
from rasterio.mask import mask
import fiona
from fiona.transform import transform_geom
import numpy as np
import glob


def find_bands(S2SAFEfile, pixelsize):
"""
find bands from Sentinel-2 SAFE "file" with given pixelsize
"""

bandlocation = [S2SAFEfile,'*','*','IMG_DATA']

bandpaths = {}
# we want r,g,b and nir at 10 m spatial resolution
for band in [2,3,4,8]:
pathbuildinglist = bandlocation + ['R' + str(pixelsize) + 'm','*' + str(band)+ '_' + str(pixelsize) +'m.jp2']
bandpathpattern = os.path.join(*pathbuildinglist)
bandpath = glob.glob(bandpathpattern)[0]

bandpaths[band] = bandpath

#type: dict[bandnumber] - bandpath
return bandpaths


def create_multiband_tif(bandpaths):
"""
create multiband geotif file from given bandpathdict
"""

outfilename = "./data/S2B_RGBNIR_20210926.tif"
with rasterio.open(bandpaths[4]) as src:
meta = src.meta
crs = src.crs
out_meta = meta.copy()
out_meta.update({"driver": "GTiff","count":4})
with rasterio.open(outfilename, "w", **out_meta) as dest:
for i,band in enumerate(bandpaths.keys()):
with rasterio.open(bandpaths[band]) as src:
banddata = src.read(1)
dest.write(banddata, i+1)

return outfilename, crs


def read_shapefile():
"""
reads in shapefile of Finland municipalities and returns geometry of Helsinki polygon
"""

with fiona.open("../L2/data/finland_municipalities.shp", "r") as shapefile:
polygons = [feature["geometry"] for feature in shapefile] if feature['properties']['name'] == 'Helsinki'])
#polycrs = str(shapefile.crs['init']).upper()
return polygons#, polycrs


def clip_area(mytif, polygons, outname):
"""
clip area of polygons from input raster file
"""

with rasterio.open(mytif) as src:
out_image, out_transform = mask(src, polygons, crop=True)
out_meta = src.meta
out_meta.update({"driver": "GTiff",
"height": out_image.shape[1],
"width": out_image.shape[2],
"transform": out_transform})


with rasterio.open(outname, "w", **out_meta) as dest:
dest.write(out_image)

# Process for getting Sentinel2 4 band raster of Helsinki
bandpaths = find_bands("./data/S2B_MSIL2A_20210926T094029_N0301_R036_T35VLG_20210926T110446.SAFE",10)
mytif,crs= create_multiband_tif(bandpaths)
polygon = read_shapefile()
clip_area(mytif, polygon, "./data/S2B_RGBNIR_20210926_Helsinki.tif")
223 changes: 223 additions & 0 deletions source/notebooks/Raster/Raster_CSC/seurasaari_toolbox.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,223 @@
"""

Functions for use in Seurasaari trees notebook (Raster lesson of Introduction to Python GIS at CSC 2022)

last updated: 10.03.2022

"""
#################################
#download_from_url (adjusted from AutoGIS course material)

import os
import urllib

def get_filename(url):
"""
Parses filename from given url
"""
if url.find('/'):
return url.rsplit('/', 1)[1]

def download_data(url):
"""
Downloads data from given url
"""

# Filepaths
outdir = r"data"

# Create folder if it does not exist
if not os.path.exists(outdir):
os.makedirs(outdir)

# Parse filename
fname = get_filename(url)
outfp = os.path.join(outdir, fname)
# Download the file if it does not exist already
if not os.path.exists(outfp):
print("Downloading", fname)
r = urllib.request.urlretrieve(url, outfp)


#################################

# histogram stretch

import numpy as np

def stretch(array):
"""
remove lowest and highest 2 percentile and perform histogram stretch on array
"""

min_percent = 2
max_percent = 98
lowp, highp = np.nanpercentile(array, (min_percent, max_percent))

# Apply linear "stretch" from lowp to highp
outar = (array.astype(float) - lowp) / (highp-lowp)

#fix range to (0 to 1)
outar = np.maximum(np.minimum(outar*1, 1), 0)

return outar

#################################

# get corine legend from excel file

import pandas as pd

def excel_to_df(catxls):
# read excel file into dataframe
catdf = pd.read_excel(catxls, index_col=None)
# limit dataframe from excel to only contain class value and its english name (level4)
catdf_lim = catdf[["Value","Level4Eng"]].set_index("Value")
return catdf_lim


def get_corine_dict(catxls):
"""
transform limited dataframe from excel to dictionary
"""
catdf_lim = excel_to_df(catxls)
# transform to dictionary
catdict = catdf_lim.to_dict()["Level4Eng"]

return catdict

#################################

#get_json_feature

import json

def getFeatures(gdf):
"""Function to parse features from GeoDataFrame in such a manner that rasterio wants them"""

return [json.loads(gdf.to_json())["features"][0]["geometry"]]


#################################

#for false color image

import rasterio

def read_band(s2, bandnumber):
"""
reads band, rescales and removes artifacts from array
"""

# read band into array
band = s2.read(bandnumber)
# rescale
band = band / 10000
# remove all artifacts
band[ band>1 ] = 1

return band

import numpy as np

def make_false_color_stack(raster):
"""
read nir,red,green, stretch and create stack of arrays for false color composite
"""

nir = read_band(raster,4)
red = read_band(raster,3)
green = read_band(raster,2)

nirs = stretch(nir)
reds = stretch(red)
greens = stretch(green)

# Create RGB false color composite stack
rgb = np.dstack((nirs, reds, greens))

return rgb

#################################

#zonal_stats_percentage

def get_zonal_stats_percentage(zstats):
"""
transforms the result of categorical zonal stats from number of pixels into rounded percentages of the whole polygon
"""

zstat_perc = {}
# total amount of pixels is stored as 'count'
total = zstats[0]["count"]
# loop through classes
for key in zstats[0].keys():
if not key == "count":
# calcualate percentage of total and store in dictionary
amount = zstats[0][key]
perc= round(amount/total *100)
zstat_perc[key] = perc

return zstat_perc


#################################

def get_forest_codes():
"""
gets codes for all forest classes from corine dataframe
"""

# Lets consider only forest (that is present on Seurasaari)
forest = ["Broad-leaved forest on mineral soil",
"Coniferous forest on mineral soil",
"Coniferous forest on rocky soil",
"Mixed forest on mineral soil",
"Mixed forest on rocky soil"]

# we only need to look at value and english description
catdf_lim = excel_to_df("https://geoportal.ymparisto.fi/meta/julkinen/dokumentit/CorineMaanpeite2018Luokat.xls")
# and we only want those classes that include forest
forestdf = catdf_lim[catdf_lim["Level4Eng"].isin(forest)]
# as list
forestcodelist = forestdf.index.to_list()

return forestcodelist

#################################

# forestmask

def create_forest_mask(corinearray,forestcodes):
"""
create mask from corine with 1 for forest, 0 other
"""
# mask places where corine values are those of the forest classes
mask = (corinearray == int(forestcodes[0])) | (corinearray == int(forestcodes[1])) | (corinearray == int(forestcodes[2])) | (corinearray == int(forestcodes[3])) | (corinearray == int(forestcodes[4]))

# store as integers
mask = mask.astype("uint8")

return mask

#################################

#reproject shapefile

import geopandas as gpd

def get_reprojected_shapefilename(rastercrs, shapefilename):
"""
reproject shapefile to given rastercrs, store it and return its name
"""

# read shapefile into dataframe
df = gpd.read_file(shapefilename)
# reproject to rasters CRS
df = df.to_crs(rastercrs)
# build the outputname from the inputname
outname = os.path.join("data",os.path.splitext(os.path.basename(shapefilename))[0] + "_repr_32635.shp")
# write to disk
df.to_file(driver = "ESRI Shapefile", filename= outname)
# return the name of the reprojected shapefile
return outname