Dataset **tesem120** contains several single-category dimensions: <strike>_sex_ and _age_ are always "Total" and _unit_ is always "Percentage of active population"</strike>. **Correction**: That was true in the past, but at some point **tesem120** began including data by sex. This could also happen in the future with **age** or even **unit**. Because we are only interested in unemployment as a percentage of active population (PC_ACT) and we don't care about _sex_ or _age_, we need to create a subset of eurostat.jsonstat:
+Because we are only interested in unemployment as a percentage of active population (PC_ACT) and we don't care about _sex_ or _age_, we need to create a subset of eurostat.jsonstat:
> Population Estimates (Persons in April) by Age Group, Sex and Year
+> PEA01 - Population Estimates (Persons in April)
```

-[Dataset PEA01](https://www.cso.ie/webserviceclient/DatasetDetails.aspx?id=PEA01) from CSO provides a yearly time series of population by sex. It is available in the JSON-stat format at:
+[Dataset PEA01](https://data.cso.ie/table/PEA01) from CSO provides a yearly time series of population by sex and age group. It is available in the JSON-stat format at:
#### 2. Convert JSON-stat to a more popular JSON data structure

-In this step, we will convert the JSON-stat file into an array of objects transposing dimension *Sex*. The dataset contains a dimension (*Statistic*) with a single category (*Population Estimates (Persons in April) (Thousand)*): we won't need it. Using **jsonstat2arrobj** like in previous examples:
+In this step, we will convert the JSON-stat file into an array of objects transposing dimension Sex (*C02199V02655*). The dataset contains a dimension (*STATISTIC*) with a single category (*Population Estimates (Persons in April)*): we won't need it. Using **jsonstat2arrobj** like in previous examples:

```
-jsonstat2arrobj ie.jsonstat ie.json --drop Statistic --by Sex --bylabel
The only difference between the previous two lines is that in the stream interface *ie.json* will be written even though it already existed while in the non-stream interface a new filename is used to avoid losing the content of an existing file.
@@ -620,13 +620,13 @@ First we need to convert JSON to [NDJSON](http://ndjson.org/):
ndjson-split < ie.json > ie.ndjson
```
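
As a rough illustration of what this conversion amounts to (not of how **ndjson-split** is implemented), the sketch below turns an array of objects into NDJSON, that is, one JSON document per line. The `toNdjson` helper and the sample rows are hypothetical, the values are made up, and the `Male` and `Female` property names assume that the sex dimension was transposed using labels.

```js
// Hypothetical sample of the array of objects; values are made up.
const rows = [
  { 'TLIST(A1)': '2020', C02076V02508: '5 - 9 years', Male: 10, Female: 9 },
  { 'TLIST(A1)': '2020', C02076V02508: '10 - 14 years', Male: 8, Female: 7.5 }
];

// Serialize every element on its own line (newline-delimited JSON).
function toNdjson(array) {
  return array.map((obj) => JSON.stringify(obj)).join('\n') + '\n';
}

const ndjson = toNdjson(rows);
console.log(ndjson);
```

Because every line is a complete JSON document, line-oriented tools can then process the rows one at a time.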

-Because we are only interested in data for the latest year (2019 at the time of writing), we need to apply this filtering condition:
+Because we are only interested in data for the latest year (2020 at the time of writing), we need to apply this filtering condition:

```js
-d.Year==='2019'
+d['TLIST(A1)']==='2020'
```

-We also want to remove the age total (*All ages*) and all the subtotals included in the dataset:
+We also want to remove the age (*C02076V02508*) total (*All ages*) and all the subtotals included in the dataset:

*_15 years and over_
*_65 years and over_
@@ -648,21 +648,21 @@ They are not needed to build a population pyramid. One way to achieve this in Ja
'15 - 24 years',
'25 - 44 years',
'45 - 64 years'
-].indexOf( d['Age Group'] ) <0
+].indexOf( d['C02076V02508'] ) <0
```

The resulting filtering command is then:

```
-ndjson-filter "d.Year==='2019' && ['All ages', '15 years and over', '65 years and over', '0 - 4 years', '0 - 14 years', '15 - 24 years', '25 - 44 years', '45 - 64 years'].indexOf(d['Age Group'])<0" < ie.ndjson > ie-filtered.ndjson
+ndjson-filter "d['TLIST(A1)']==='2020' && ['All ages', '15 years and over', '65 years and over', '0 - 4 years', '0 - 14 years', '15 - 24 years', '25 - 44 years', '45 - 64 years'].indexOf(d['C02076V02508'])<0" < ie.ndjson > ie-filtered.ndjson
```
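
To check the condition before running it on the whole file, the same logic can be written as a plain JavaScript predicate. This is only a sketch: `excludedAges`, `keepRow` and the sample rows are hypothetical names and made-up data, while the property names and category labels are the ones used in the command above.

```js
// Age categories to discard: the total and the subtotals listed above.
const excludedAges = new Set([
  'All ages', '15 years and over', '65 years and over', '0 - 4 years',
  '0 - 14 years', '15 - 24 years', '25 - 44 years', '45 - 64 years'
]);

// Same condition as the ndjson-filter expression, as a reusable predicate.
function keepRow(d, year = '2020') {
  return d['TLIST(A1)'] === year && !excludedAges.has(d['C02076V02508']);
}

// Made-up rows: only the second one should survive the filter.
const sample = [
  { 'TLIST(A1)': '2020', C02076V02508: 'All ages' },
  { 'TLIST(A1)': '2020', C02076V02508: '5 - 9 years' },
  { 'TLIST(A1)': '2019', C02076V02508: '5 - 9 years' }
];

// Wrap the call so that filter's index argument is not taken as the year.
const filtered = sample.filter((d) => keepRow(d));
console.log(filtered);
```

A `Set` lookup replaces the `indexOf(...) < 0` test here; both express "not in the list of totals and subtotals".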

#### 4. Transform data

Many visualization tools do not offer pyramids as a chart type, because a pyramid is just a special case of a bar chart in which the male values are negative. This is the case with Google Sheets, the tool we are going to use. So the next step is to keep only the information we want and multiply the male values by -1.
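
The transformation itself is small. The following sketch shows the idea in plain JavaScript, assuming each filtered row carries the age group in `C02076V02508` and the transposed sex values in properties named `Male` and `Female`. Those two property names, the `toPyramidRow` helper and the sample values are assumptions for illustration only.

```js
// Hypothetical filtered rows; values are made up.
const filteredRows = [
  { 'TLIST(A1)': '2020', C02076V02508: '5 - 9 years', Male: 10, Female: 9 },
  { 'TLIST(A1)': '2020', C02076V02508: '10 - 14 years', Male: 8, Female: 7.5 }
];

// Keep only the age group and the two sexes, negating the male value
// so that male bars grow to the left of the axis.
function toPyramidRow(d) {
  return { age: d.C02076V02508, male: -d.Male, female: d.Female };
}

const pyramid = filteredRows.map(toPyramidRow);
console.log(pyramid);
```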
-This line of code produces an SDMX-JSON file with the growth over the previous period of some key economic indicators for several locations. The size of _kei.sdmx.json_ is 393 Kb.
+This line of code produces an SDMX-JSON file with the growth over the previous period of some key economic indicators for several locations. The size of _kei.sdmx.json_ is (at the time of writing) 430 Kb.

#### 2. Convert SDMX-JSON to JSON-stat
@@ -837,15 +837,15 @@ This line of code produces an SDMX-JSON file with the growth over the previous p
sdmx2jsonstat kei.sdmx.json default.stat.json
```

-The JSON-stat file is smaller (232 Kb) than the original SDMX-JSON one. An even smaller file can be produced: by default, **sdmx2jsonstat** uses arrays to express values and statuses. JSON-stat supports both arrays and objects for this purpose. Because usually only a few data have status information, it is generally better to use an object for statuses.
+The JSON-stat file is smaller (250 Kb) than the original SDMX-JSON one. An even smaller file can be produced: by default, **sdmx2jsonstat** uses arrays to express values and statuses. JSON-stat supports both arrays and objects for this purpose. Because usually only a few data have status information, it is generally better to use an object for statuses.

**sdmx2jsonstat** supports objects for status information using the _--ostatus_ option (_-s_).

```
sdmx2jsonstat kei.sdmx.json kei.stat.json -s
```
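
To see why an object is usually more compact, consider this small sketch. It is not how **sdmx2jsonstat** works internally: `statusArrayToObject` is a hypothetical helper, the status codes are made up, and it assumes that positions without a status hold `null` (or an empty string) in the array form.

```js
// Convert a sparse status array into an object keyed by value position.
function statusArrayToObject(statusArray) {
  const statusObject = {};
  statusArray.forEach((status, index) => {
    // Assumption: null, undefined or '' mean "no status for this value".
    if (status !== null && status !== undefined && status !== '') {
      statusObject[index] = status;
    }
  });
  return statusObject;
}

// Made-up example: only two of the eight values have a status.
const statusArray = [null, null, 'e', null, null, null, 'p', null];
const statusObject = statusArrayToObject(statusArray);

console.log(JSON.stringify(statusArray).length, JSON.stringify(statusObject).length);
```

The array has to spell out every position, while the object only lists the positions that actually have a status, so the saving grows with the number of values that have none.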

-The new JSON-stat file is now only 168 Kb: less than half the original SDMX-JSON one.
+The new JSON-stat file is now only 181 Kb: less than half the original SDMX-JSON one.

#### 3. Convert JSON-stat to CSV
@@ -855,31 +855,31 @@ Because now we have a regular JSON-stat file, it is trivial to convert it to CSV
jsonstat2csv kei.stat.json kei.csv
```

-The new file is very big (1,174 Kb) because by default labels, instead of identifiers, are used. **jsonstat2csv** has several options to avoid this. But you don’t actually have to choose between labels and identifiers (each serves a different purpose): you can use the [CSV-stat](https://github.com/jsonstat/csv) format as the output format: CSV-stat supports the core semantics of JSON-stat using an enriched CSV structure.
+The new file is very big (1.3 Mb) because by default labels, instead of identifiers, are used. **jsonstat2csv** has several options to avoid this. But you don’t actually have to choose between labels and identifiers (each serves a different purpose): you can use the [CSV-stat](https://github.com/jsonstat/csv) format as the output format: CSV-stat supports the core semantics of JSON-stat using an enriched CSV structure.

You can produce CSV-stat with the _--rich_ option (_-r_):

```
jsonstat2csv kei.stat.json kei.rich.csv -r
```

-This command produces a 510 Kb file.
+This command produces a 547 Kb file.

#### 4. Back to JSON-stat

```
csv2jsonstat kei.rich.csv default.json
```

-The size of the new JSON-stat is 231 Kb: it is a little smaller than the original because the original JSON-stat had some extension information that was lost in CSV-stat.
+The size of the new JSON-stat is 241 Kb: it is a little smaller than the original because the original JSON-stat had some extension information that was lost in CSV-stat.

This file can be minimized using objects for statuses, thanks to **jsonstat2jsonstat**:

```
jsonstat2jsonstat default.json kei.json -m -s
```

-The size of the resulting file is 167 Kb.
+The size of the resulting file is 181 Kb.

#### 5. Producing a key economic indicators CSV for a particular country