-
Notifications
You must be signed in to change notification settings - Fork 9
This section illustrates answers to frequently asked questions with brief examples.
When each integer needs to be formatted with the same number of digits, Jinja code is used to adapt the string in the JSON schema.
The following example expands the string variable hisco
to 5 digits by adding leading zero's:
"aboutUrl": "hisco-code:{{'%05d'|format(hisco|int)}}"
If you only want to format positive values for the hisco
string variable, use conditional formatting as follows:
"aboutUrl": "hisco-code:{% if hisco|int > 0 %}{{'%05d'|format(hisco|int)}}{% else %} hisco {% endif %}"
When data is missing or unknown, cells are often left empty or recorded as NULL. Often these values need to be skipped when converting to Linked Data. Depending on the specification in the JSON schema, NULL values in different triple positions are left out.
Sometimes certain values in CSVs need to be skipped when converting to Linked Data. These so-called NULL values can be specified in the JSON schema under a column as "null"
:
{
"name": "ht_feet",
"null": ["999", "0"],
"datatype": "xsd:integer",
"dc:description": "height in feet",
"titles": ["ht_feet"],
"propertyUrl": "microheights:dimension/ht_feet",
"@id": "https://iisg.amsterdam/__af_ke_1890_milit_2009_moradi.dta.csv/column/ht_feet"
},
In this example individuals with a ht_feet of "999" or "0" are skipped. Since CoW reads NULL values as strings, remember to use quote marks.
Virtual columns can create dates based on multiple columns in the CSV. When one of those columns contains a NULL value, the generation of a triple can be prevented using a @list
of conditions.
{
"virtual": "true",
"null": {"@list": [{"name": "birth_year", "null": ""},
{"name": "birth_month", "null": ""},
{"name": "birth_year", "null": ""}]},
"datatype": "xsd:date",
"dc:description": "Birth Date",
"propertyUrl": "schema:birthDate",
"csvw:value": "{{['%04d'| format(birth_year|int),'-','%02d' | format(birth_month|int),'-','%02d'|format(birth_day|int)]|join}}"
},
If none of the date variables (birth_day
/birth_month
/birth_year
) are empty, the JSON schema will result in a triple. If at least one of the variables (columns) in the list is empty ("null": ""
), no birthDate
triple will be generated for the corresponding row in the CSV. In this example NULL is specified as an empty cell (""
), but this can be any string. For an explanation of the formatting of the result, see dates.
Virtual columns can also refer to only a part of a date in a single column. Since the NULL value is checked for a virtual column, the JSON schema still requires a @list
even for a single value from a variable (column).
{
"virtual": true,
"null": {"@list": [{"name": "byear", "null": ""}]},
"datatype": "xsd:gYear",
"propertyUrl": "microheights:dimension/bdec",
"csvw:value": "{{[byear[0:3],'0']|join }}",
"dc:description": "decade in which birth was recorded"
},
If the birth year byear
has an empty cell, the decade in which the birth was recorded bdec
will not be recorded.
CoW is built on QBer, part of the datalegend ecosystem for historical statistics. For more tools, datasets and infrastructure please visit https://datalegend.net. datalegend is a work package within CLARIAH a back-to-back NWO funded large research facility (#184.033.101)