Observation Data format
Weather station data are most often stored in the form of text/csv files. In the following, we describe the standard format for observational datasets, which will be stored as a collection of csv files strictly following this structure:
The stations.txt file contains the information about the weather stations. The first three columns (station_id, longitude and latitude) are the minimum information required to define a station dataset, so they are compulsory, and their names must match exactly those shown in this example. The remaining columns (altitude, location, WMO_Id and Koppen.class) are examples of optional metadata that can additionally be included. A dataset may contain as many metadata columns as desired.
station_id,longitude,latitude,altitude,location,WMO_Id,Koppen.class
SP000008027,-2.0392,43.3075,251,SAN SEBASTIAN - IGUELDO,8027,Cfb
SP000008181,2.0697,41.2928,4,BARCELONA/AEROPUERTO,8181,Csa
SP000008202,-5.4981,40.9592,790,SALAMANCA AEROPUERTO,8202,BSk
SP000008215,-4.0103,40.7806,1894,NAVACERRADA,8215,Csb
SP000008280,-1.8631,38.9519,704,ALBACETE LOS LLANOS,8280,BSk
SP000008410,-4.8458,37.8442,90,CORDOBA AEROPUERTO,8410,Csa
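The compulsory-column rule above can be checked mechanically. The following is a minimal Python sketch, not part of downscaleR: the function name `read_stations` and the constant `REQUIRED` are hypothetical, and the sample text is a two-row excerpt of the example file.

```python
import csv
import io

# Compulsory column names, in the order required by the format description.
REQUIRED = ["station_id", "longitude", "latitude"]


def read_stations(text):
    """Parse stations.txt content (hypothetical helper, for illustration only).

    Raises ValueError when the first three columns are not exactly
    station_id, longitude and latitude.
    """
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    if header[:3] != REQUIRED:
        raise ValueError(
            f"first three columns must be {REQUIRED}, got {header[:3]}"
        )
    stations = []
    for row in reader:
        # Coordinates are numeric; optional metadata is kept as text.
        row["longitude"] = float(row["longitude"])
        row["latitude"] = float(row["latitude"])
        stations.append(row)
    return stations


sample = """station_id,longitude,latitude,altitude,location,WMO_Id,Koppen.class
SP000008027,-2.0392,43.3075,251,SAN SEBASTIAN - IGUELDO,8027,Cfb
SP000008181,2.0697,41.2928,4,BARCELONA/AEROPUERTO,8181,Csa
"""

stations = read_stations(sample)
print([s["station_id"] for s in stations])
```

Any extra metadata columns pass through untouched, which mirrors the rule that only the first three columns are constrained.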
The variables.txt file contains the information about the variables included in the dataset: their identifier (variable), description (longname), units of measure (unit), the code used to identify missing data (missing_code) and any other information that may optionally be included.
variable, longname, unit, missing_code, type, source, url
precip, total precip accumulated in 24 hours, 0.1 mm, NaN, observation, Global Station Network, ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/daily/gsn/
tmin, minimum daily temperature, 0.1 degC, NaN, observation, Global Station Network, ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/daily/gsn/
tmax, maximum daily temperature, 0.1 degC, NaN, observation, Global Station Network, ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/daily/gsn/
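As an illustration of how this metadata might be consumed, here is a short Python sketch that indexes the rows by the variable field. The helper name `read_variables` is hypothetical. Since the example file puts a space after each comma, the sketch assumes that this leading whitespace is not part of the field values.

```python
import csv
import io


def read_variables(text):
    """Index variables.txt rows by their 'variable' field (illustrative helper)."""
    # skipinitialspace drops the blank that follows each comma in the example.
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    return {row["variable"]: row for row in reader}


sample = """variable, longname, unit, missing_code, type, source, url
precip, total precip accumulated in 24 hours, 0.1 mm, NaN, observation, Global Station Network, ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/daily/gsn/
tmin, minimum daily temperature, 0.1 degC, NaN, observation, Global Station Network, ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/daily/gsn/
"""

variables = read_variables(sample)
for name, meta in variables.items():
    print(name, "|", meta["unit"], "| missing:", meta["missing_code"])
```

The missing_code value read here is what a reader of the individual variable files would need in order to recognise missing observations.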
Variables are stored separately in text files named as indicated by the variable field in the variables.txt file. The first column of each file contains the observation dates, in the format YYYYMMDD. For subdaily data, which are less common in downscaling applications, time records can be given in the format YYYYMMDDHH. The remaining columns (2 to n) contain the observed series at each station, in the same order as the stations.txt file. This is a (truncated) example file for the minimum daily temperature data of this dataset:
"YYYYMMDD","SP000008027","SP000008181","SP000008202","SP000008215","SP000008280","SP000008410"
19790225,NaN,0.6,NaN,NaN,NaN,0.6
19790226,NaN,5,NaN,NaN,NaN,2
19790227,NaN,2.2,NaN,NaN,NaN,-1
19790228,NaN,2,NaN,NaN,NaN,1
19790301,2.8,5.8,-2,-8.6,-1,2.8
19790302,4,4.8,-3,-7.4,-4,-2
19790303,6.6,3.6,-1.8,-4,-5.4,0
19790304,6.6,6.4,0.3,0.6,1,2
19790305,6,7.8,6.2,0.8,7,4
19790306,6,6.8,6.2,0.8,6.2,10
19790307,5.6,5.4,4.8,-0.6,6,12.4
19790308,4,7.5,4.5,0,5,10.4
19790309,6,6.8,1,-1,3.6,9.4
19790310,9,6.8,1.8,-1,0.6,5
19790311,9,5.6,3,1.6,3.4,5
19790312,9,7.8,1,4.6,2.6,6.6
19790313,8.6,8,2.6,3.8,3.4,7.4
19790314,4.4,7.2,0.6,-5.8,4.6,10
19790315,2.6,5.8,-0.8,-7.2,-0.4,5
19790316,2.4,3,0.2,-7.2,-1,0.6
19790317,5.6,6.6,0.9,-3.2,3,8
19790318,5,6.2,0,-5.6,2.6,6
19790319,5.2,7.4,0.4,-5,3.4,8
19790320,5.6,6.2,1,-6.2,1.6,5
19790321,5.6,5,0.6,-6.2,2.4,6.4
[... continues]
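A variable file laid out this way could be read as in the following Python sketch. It is only an illustration: the names `parse_time` and `read_variable` are hypothetical, and the sketch assumes that the missing-data code is the one declared in variables.txt and that a 10-character time stamp means YYYYMMDDHH.

```python
import csv
import io
from datetime import datetime


def parse_time(stamp):
    """Parse YYYYMMDD (daily) or YYYYMMDDHH (subdaily) time records."""
    fmt = "%Y%m%d%H" if len(stamp) == 10 else "%Y%m%d"
    return datetime.strptime(stamp, fmt)


def read_variable(text, station_ids, missing_code="NaN"):
    """Return (times, series), where series maps station_id to its values.

    Missing observations are stored as None. The station columns must
    follow the order of stations.txt, passed here as station_ids.
    """
    reader = csv.reader(io.StringIO(text))
    header = next(reader)  # csv.reader strips the surrounding quotes
    if header[1:] != list(station_ids):
        raise ValueError("station columns do not follow the stations.txt order")
    times = []
    series = {sid: [] for sid in station_ids}
    for row in reader:
        if not row:
            continue  # tolerate blank lines
        times.append(parse_time(row[0]))
        for sid, value in zip(station_ids, row[1:]):
            series[sid].append(None if value == missing_code else float(value))
    return times, series


ids = ["SP000008027", "SP000008181", "SP000008202",
       "SP000008215", "SP000008280", "SP000008410"]

sample = '''"YYYYMMDD","SP000008027","SP000008181","SP000008202","SP000008215","SP000008280","SP000008410"
19790228,NaN,2,NaN,NaN,NaN,1
19790301,2.8,5.8,-2,-8.6,-1,2.8
'''

times, series = read_variable(sample, ids)
print(times[0].date(), series["SP000008181"])
```

Checking the header against the station identifiers catches files whose columns were reordered, which would otherwise silently assign series to the wrong station.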
Note that a reference observational dataset ("GSN Iberia") is included in this repository, corresponding to a subset of the GSN station dataset for the Iberian Peninsula.
downscaleR - Santander MetGroup (Univ. Cantabria - CSIC)