Commit ae0ce68

SDG import scripts (datacommonsorg#850)
* add input files
* add svs
* add scripts for generating svs
* fix
* remove extra commas
* add script for pv schema
* generate new enums for each property
* update schema
* fix
* fix
* add schemaful svs
* add city maps
* update
* cities
* preprocess
* updates
* fix
* fix
* USE SUBMODULE
* DELETE old scripts/files and update cities
* update cities
* some updates to test
* delete file
* update tests
* add csv to lfs
* update
* fix
* update modules
* modules
* tests
* test
* test
* lint
* tests
* lint
* strings
* remove nan series
* update variable codes
* address comments
* updates
1 parent 31a064f commit ae0ce68

639 files changed, +4925 -0 lines changed


.gitmodules

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+[submodule "scripts/un/sdg/sdg-dataset"]
+	path = scripts/un/sdg/sdg-dataset
+	url = https://code.officialstatistics.org/undata2/data-commons/sdg-dataset.git
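Because the import reads its data from this submodule, a fresh clone presumably needs the submodule fetched (for example with `git submodule update --init`) before the scripts can run. As a rough sketch, the entry above can also be inspected from Python with the standard `configparser` module; the helper name `list_submodules` is hypothetical and not part of this commit.

```python
import configparser

# The .gitmodules content added by this commit, embedded for illustration.
GITMODULES_TEXT = """\
[submodule "scripts/un/sdg/sdg-dataset"]
path = scripts/un/sdg/sdg-dataset
url = https://code.officialstatistics.org/undata2/data-commons/sdg-dataset.git
"""


def list_submodules(text):
    """Return {section name: (path, url)} for every submodule section."""
    # interpolation=None so that a '%' in a URL would not be treated specially.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return {
        section: (parser[section]["path"], parser[section]["url"])
        for section in parser.sections()
        if section.startswith("submodule ")
    }


submodules = list_submodules(GITMODULES_TEXT)
for name, (path, url) in submodules.items():
    print(f"{name}: {path} <- {url}")
```

This only reads the declared path and URL; it does not check whether the submodule has actually been checked out.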

scripts/un/sdg/.gitattributes

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+csv/* filter=lfs diff=lfs merge=lfs -text
+dc_generated/* filter=lfs diff=lfs merge=lfs -text
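These two patterns route files directly under `csv/` and `dc_generated/` (relative to `scripts/un/sdg/`) through Git LFS. The sketch below approximates that matching in Python for simple patterns like these, comparing path components one by one so that `*` never crosses a `/`. It is a simplification, not Git's matcher: `**`, negation, and trailing-slash patterns are not handled, and the example file names in the usage lines are hypothetical.

```python
from fnmatch import fnmatchcase

# Patterns copied from scripts/un/sdg/.gitattributes in this commit.
LFS_PATTERNS = ["csv/*", "dc_generated/*"]


def is_lfs_tracked(rel_path, patterns=LFS_PATTERNS):
    """Approximate whether rel_path (relative to the .gitattributes
    directory, '/'-separated) matches one of the simple LFS patterns."""
    parts = rel_path.split("/")
    for pattern in patterns:
        pattern_parts = pattern.split("/")
        # Same depth required: '*' matches within one path component only.
        if len(parts) == len(pattern_parts) and all(
                fnmatchcase(part, pat)
                for part, pat in zip(parts, pattern_parts)):
            return True
    return False


# Hypothetical file names, for illustration only.
print(is_lfs_tracked("csv/EXAMPLE_CODE.csv"))
print(is_lfs_tracked("schema/sv.mcf"))
```

For an authoritative answer on a real checkout, `git check-attr filter -- <path>` reports the attribute Git itself resolves.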

scripts/un/sdg/README.md

Lines changed: 31 additions & 0 deletions
@@ -0,0 +1,31 @@
+# UN Stats Sustainable Development Goals
+
+This import includes country-level data from the [UN SDG Global Database](https://unstats.un.org/sdgs/dataportal). Data is read from the submodule `sdg-dataset` which is managed by UN Stats.
+
+
+To generate city dcids:
+```
+python3 cities.py <DATACOMMONS_API_KEY>
+```
+(Note: many of these cities will require manual curation, so this script likely should not be rerun.)
+
+To process data and generate artifacts:
+```
+python3 process.py
+```
+Produces:
+* schema/ folder:
+  * measurement_method.mcf
+  * schema.mcf (classes and enums)
+  * sdg.textproto (vertical spec)
+  * series.mcf (series mcf)
+  * sv.mcf
+  * unit.mcf
+* csv/ folder:
+  * [CODE].csv
+(Note that the `schema/` folder is not included in the repository but can be regenerated by running the script.)
+
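Since the `schema/` folder is regenerated rather than committed, it can be handy to confirm that a run of `process.py` produced everything the README lists. The following is a minimal sketch of such a check; it assumes the outputs land in `schema/` under the directory passed in, and the helper name `missing_schema_artifacts` is hypothetical (it is not part of this commit). The demo uses a temporary directory rather than real output.

```python
import tempfile
from pathlib import Path

# File names copied from the README's "Produces" list.
SCHEMA_ARTIFACTS = [
    "measurement_method.mcf",
    "schema.mcf",
    "sdg.textproto",
    "series.mcf",
    "sv.mcf",
    "unit.mcf",
]


def missing_schema_artifacts(root):
    """Return the expected schema files that are absent under root/schema."""
    schema_dir = Path(root) / "schema"
    return [
        name for name in SCHEMA_ARTIFACTS
        if not (schema_dir / name).is_file()
    ]


# Demo against a throwaway directory containing only one of the files.
with tempfile.TemporaryDirectory() as tmp:
    schema_dir = Path(tmp) / "schema"
    schema_dir.mkdir()
    (schema_dir / "sv.mcf").write_text("")
    demo_missing = missing_schema_artifacts(tmp)

print(demo_missing)
```

An empty list would indicate that all six listed files are present; the `csv/` outputs are not checked here because their names depend on the series codes.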
+To run unit tests:
+```
+python3 -m unittest discover -v -s ../ -p "*_test.py"
+```
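The discovery command above can also be driven programmatically, which may help when running the tests from another script. This sketch uses the standard `unittest` loader with the same `*_test.py` pattern and verbose output; the wrapper `run_tests` and the throwaway `sdg_demo_test.py` file are hypothetical and exist only to make the example self-contained.

```python
import os
import tempfile
import unittest


def run_tests(start_dir, pattern="*_test.py"):
    """Discover tests under start_dir and run them verbosely,
    roughly mirroring `python3 -m unittest discover -v -s start_dir -p pattern`."""
    suite = unittest.defaultTestLoader.discover(start_dir, pattern=pattern)
    runner = unittest.TextTestRunner(verbosity=2)
    return runner.run(suite)


# Demo: write one trivial test module into a temporary directory and run it.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "sdg_demo_test.py"), "w") as f:
        f.write(
            "import unittest\n"
            "\n"
            "class DemoTest(unittest.TestCase):\n"
            "    def test_addition(self):\n"
            "        self.assertEqual(1 + 1, 2)\n"
        )
    result = run_tests(tmp)

print(result.testsRun, result.wasSuccessful())
```

For the import itself, the equivalent call would pass `"../"` as `start_dir` when run from `scripts/un/sdg/`, matching the `-s ../` argument in the README.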

scripts/un/sdg/__init__.py

Whitespace-only changes.
