This project contains the procedures to reproduce the results of the paper "A method to produce metadata describing and assessing the quality of spatial landmark datasets in mountain area", by M.-D. Van Damme and A.-M. Olteanu Raimond.
- Institute: LASTIG, Univ Gustave Eiffel, ENSG, IGN
- License: CC0-1.0 license
- Authors:
- Marie-Dominique Van Damme
- Ana-Maria Raimond
All the steps described below concern the camptocamp.org data source. To get the results for the other data sources (OpenStreetMap.org, Refuges.info, rando.ecrins-parcnational.fr and rando.parc-du-vercors.fr), adapt the download link of the dataset and the table names in the SQL scripts.
These instructions must be executed before either of the two reproduction procedures that follow.
- Input data:
- initial datasets:
- data matching links results: matching_result_Camptocamp_and_BDTOPO.csv
- alignment file between the dataset and the OOR ontology: Alignment_Camptocamp_OOR.csv
- Coding environment: PostgreSQL/PostGIS
- Steps to follow
- Step 1: create a database in PostgreSQL:
CREATE DATABASE agile_metadata_2022;
- Step 2: install the PostGIS extension for this new database (see Extension menu)
- Step 3: import all the needed data into the PostgreSQL database: run the SQL script sql/0_loading_data.sql
- Run the first query in the SQL script sql/1_confidence.sql to get the DQ_confidence for the whole scope.
- Note: the two other scripts compute the DQ_confidence for a subset of the types. This is an example of on-demand metadata, for instance when the user needs to assess the confidence of the matching algorithm only for specific types of landmarks (e.g. those corresponding to the ontology class "isolated accommodation").
- Run the SQL script sql/2_spatial_accuracy.sql
- Run the SQL script sql/3_confusion_matrix_all.sql
- Import the result into spreadsheet software such as Excel or OpenOffice
- Create a cross table: the values of the first column give the rows, the values of the second column give the columns, and the values of the third column give the counts in the cells of the cross table
- Several pairs of values that are not on the diagonal are nevertheless correctly classified items:
(lieu-dit, col), (lieu-dit, croix), (lieu-dit, massif_boisé), (lieu-dit, rocher), (lieu-dit, surface_neige_et_glace), (lieu-dit, vallée), (lieu-dit, abri), (lieu-dit, hébergement_isolé), (hébergement_isolé, abri), (hébergement_accessible, gîte), (hébergement_isolé, refuge), (gite, refuge), (abri, refuge), ((vide), lac), (dépression_fermée, grotte). These pairs are also used to compute the overall accuracy.
- The overall accuracy is (the sum of items on the main diagonal + the items correctly classified) divided by the sum of all items in the matrix
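The cross table and the overall accuracy described above can also be computed outside a spreadsheet. The following is only a minimal Python sketch: the triples in `rows` are made-up values standing in for the three exported columns, `ACCEPTED_PAIRS` reproduces only a subset of the pairs listed above, and the pair order (first column, second column) is an assumption.

```python
from collections import defaultdict

# Hypothetical (first column, second column, count) triples, standing in for
# the three columns exported from sql/3_confusion_matrix_all.sql.
rows = [
    ("col", "col", 40),
    ("lieu-dit", "col", 5),
    ("refuge", "refuge", 30),
    ("abri", "refuge", 3),
    ("lac", "col", 2),
]

# Off-diagonal pairs that still count as correctly classified.
# Only a subset of the README's list is reproduced here (assumed order:
# first column, second column).
ACCEPTED_PAIRS = {("lieu-dit", "col"), ("abri", "refuge")}


def cross_table(triples):
    """Pivot (row, column, count) triples into a nested dict table[row][column]."""
    table = defaultdict(lambda: defaultdict(int))
    for row, column, count in triples:
        table[row][column] += count
    return table


def overall_accuracy(triples, accepted_pairs):
    """(diagonal items + accepted off-diagonal items) / all items."""
    total = sum(count for _, _, count in triples)
    if total == 0:
        raise ValueError("the confusion matrix is empty")
    correct = sum(
        count
        for row, column, count in triples
        if row == column or (row, column) in accepted_pairs
    )
    return correct / total


table = cross_table(rows)
accuracy = overall_accuracy(rows, ACCEPTED_PAIRS)
print(f"overall accuracy = {accuracy:.3f}")
```

With the made-up counts above, 78 of the 80 items are counted as correct; real values have to come from the query result.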
- Run the SQL script sql/4_duplicate_all once per dataset, changing the name of the dataset table each time
- Run the SQL script sql/5_Samal_distance.sql
- Run the SQL script sql/6_missing_class.sql
- Create a worksheet in spreadsheet software such as Excel or OpenOffice
- Prepare the worksheet by creating these columns:
- Run each query in the SQL script sql/7_completeness.sql and paste each result into its own column
- Calculate the sum of the elements in each column; in this example, the sums are stored in row 115.
- Then, you have:
- Excess = (B115+C115)/E115
- Missing items = F115/H115
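The two ratios above can be checked with a few lines of Python. This is only a sketch: the column letters follow the worksheet layout used above, and the sums are made-up numbers, not results from the paper.

```python
# Hypothetical column sums, keyed by the worksheet column letters used above
# (row 115 in the example). The values are made up for illustration.
column_sums = {"B": 12, "C": 8, "E": 200, "F": 30, "H": 150}


def ratio(numerator, denominator):
    """Divide two column sums, refusing an empty denominator column."""
    if denominator == 0:
        raise ValueError("the denominator column sums to zero")
    return numerator / denominator


# Excess = (B115 + C115) / E115
excess = ratio(column_sums["B"] + column_sums["C"], column_sums["E"])
# Missing items = F115 / H115
missing_items = ratio(column_sums["F"], column_sums["H"])

print(f"Excess = {excess:.3f}, Missing items = {missing_items:.3f}")
```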
- Input resources:
- The six dataset files: "Five spatial landmark datasets", downloaded from the Zenodo platform (version 1.0)
- The five alignment files: alignment between the types of landmark in the different sources and the concepts in the spatial reference objects ontology
- The current Java project “QualityMetadataSpatialLandmarkDataset”. You don’t need to install the MultiCriteriaMatching code: it is a dependency library of the project QualityMetadataSpatialLandmarkDataset (Maven project).
- Java Install:
- Download and install the Java Development Kit (JDK 8) from the Oracle website
- Eclipse
- Download and install the IDE Eclipse
- Download the project QualityMetadataSpatialLandmarkDataset on your local system
- Import the project into Eclipse as a Maven project
- Place the six landmark dataset files in the data/dataset folder
- Place the five alignment files in the data/alignment folder
Launch the Java main file MainMatchingCamptocampBdtopo.java. This program loads data and computes the matching links between the sources of datasets and the BDTOPO dataset.
At the end of the computation, the data are matched and the results are stored as a CSV file (e.g. “c2c-bdtopo-XXX.csv”) in the resultat folder. Note: rows one (1:0) and two (1:1) of Table 3 are obtained directly from the Java console output.
- Launch QGIS
- Install the plugin
- Copy the visu_valide_MultiCriteriaMatching plugin into the QGIS folder
- Open QGIS
- In the QGIS Extensions, look for the visu_valide_MultiCriteriaMatching plugin; a small icon is added to the QGIS plugin toolbar
- Run the visu_valide_MultiCriteriaMatching plugin by clicking its button in the toolbar
- Import the file created in the previous step
- Validate landmark by landmark
- The results are stored in a shapefile
- Input data:
- data matching links results: matching_result_RefugesInfo_and_BDTOPO.csv
- Coding environment: PostgreSQL/PostGIS
- Steps to follow
- Step 1: create a database in PostgreSQL:
CREATE DATABASE agile_metadata_2022;
- Step 2: import all the needed data into the PostgreSQL database: run the SQL script sql/p3_0_loading_data.sql
- Run the SQL script p3_1_boxplot_samal_distance.sql
- Export the query result to a .csv file, for example distances_samal.csv
- This is an example of how to create a boxplot with R:
x <- read.csv("/home/glagaffe/distsamal.csv", header=TRUE, sep=",")
boxplot(x, xlab="Refuges.info", ylab="Samal distance", main="Boxplot of Samal distance for names in Refuges.info source")
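To check the plotted values numerically, the five numbers behind a boxplot (minimum, lower hinge, median, upper hinge, maximum) can be computed directly. The sketch below uses Tukey's hinge definition, which is assumed here to match the box and median line drawn by R; the distances are made-up values, not results from the paper, and whiskers and outliers are not handled.

```python
import math


def five_number_summary(values):
    """Tukey's five-number summary: min, lower hinge, median, upper hinge, max."""
    x = sorted(values)
    n = len(x)
    if n == 0:
        raise ValueError("at least one value is required")
    # Depth of the hinges, as a 1-based position that may end in .5
    n4 = math.floor((n + 3) / 2) / 2
    positions = [1, n4, (n + 1) / 2, n + 1 - n4, n]
    # A position ending in .5 averages the two neighbouring order statistics.
    return [
        0.5 * (x[math.floor(p) - 1] + x[math.ceil(p) - 1])
        for p in positions
    ]


# Hypothetical Samal distances standing in for one column of the exported CSV.
distances = [0.0, 0.1, 0.2, 0.4, 0.9]
summary = five_number_summary(distances)
print(summary)
```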