This repository is the microservice that implements the HDX Index Adapter functionality.
Before you start, make sure that the API gateway is running locally.
We're using Docker which, luckily for you, means that getting the application running locally should be fairly painless. Make sure that you have Docker Compose installed on your machine, then run:
```text
git clone https://github.com/GPSDD/hdx-index-connector.git
cd hdx-index-connector
./adapter.sh develop
```
You can now access the microservice through the CT gateway.
It is necessary to define these environment variables:
- CT_URL => Control Tower URL
- NODE_ENV => Environment (prod, staging, dev)
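
As a rough illustration of how these two variables might be checked at startup, here is a minimal sketch. The `loadConfig` helper is hypothetical and not part of this repository, and treating `prod`, `staging` and `dev` as the only valid `NODE_ENV` values is an assumption drawn from the list above.

```javascript
// Hypothetical helper (not from this repository): validates the two
// environment variables listed above and returns them as a plain object.
function loadConfig(env) {
  // Assumption: only the three environments named in the README are valid.
  const allowedEnvs = ['prod', 'staging', 'dev'];

  if (!env.CT_URL) {
    throw new Error('CT_URL is required (Control Tower URL)');
  }
  if (!allowedEnvs.includes(env.NODE_ENV)) {
    throw new Error(`NODE_ENV must be one of: ${allowedEnvs.join(', ')}`);
  }

  return { ctUrl: env.CT_URL, nodeEnv: env.NODE_ENV };
}
```

In a Node.js process this would typically be called as `loadConfig(process.env)`, so that a missing variable fails fast instead of surfacing later as a connection error.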
This component executes a periodic task that updates the metadata of each indexed RW dataset. The task is bootstrapped
when the application server starts.
The task's implementation can be found in `app/src/cron/cron`, and its configuration is loaded from the
config files.
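
The general pattern of bootstrapping a periodic task at server start can be sketched as follows. This is not the code in `app/src/cron/cron`: it uses plain `setInterval` instead of whatever scheduler the project actually uses, and `startPeriodicTask`, its parameters and the injectable `timers` object are hypothetical names chosen for illustration.

```javascript
// Hypothetical sketch: schedule `task` every `intervalMs` milliseconds.
// `timers` is injectable so the scheduling can be exercised without waiting.
function startPeriodicTask(task, intervalMs, timers = { setInterval, clearInterval }) {
  const tick = async () => {
    try {
      await task();
    } catch (err) {
      // A failed run is logged and does not cancel later runs.
      console.error('Periodic task failed:', err);
    }
  };

  const handle = timers.setInterval(tick, intervalMs);

  // The returned function stops the schedule, e.g. on server shutdown.
  return () => timers.clearInterval(handle);
}
```

A server's startup code could call `startPeriodicTask(updateMetadata, intervalFromConfig)` once, where `updateMetadata` and `intervalFromConfig` stand in for the real task and the value read from the config files.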
The field correspondence is based on the metadata object for a single package, for example this link.
In the HDX domain, a package entity may refer to multiple data files - identified as resources.
Because this structure does not map directly onto the API structure, we use the following logic to map the HDX domain structure to ours:
1. Each HDX `package` tentatively corresponds to one API Highways dataset.
2. Within each `package`, if there's one and only one `resource` with `format` of type `JSON`, we use that `resource` on step 4. If not, we proceed to step 3.
3. Within each `package`, if there's one and only one `resource` with `format` of type `CSV`, we use that `resource` on step 4. If not, the `dataset` status is set to `failed` and no metadata is created.
4. We combine the `package` data and the selected `resource` data to generate metadata as described in the spec table below.
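
The resource selection in steps 2 and 3 can be sketched as below. This is an illustrative reading of the rules, not the connector's actual code: `selectResource` is a hypothetical name, the `resources` array on the package is an assumption about the HDX response shape, and comparing `format` case-insensitively is also an assumption.

```javascript
// Hypothetical sketch of steps 2 and 3: prefer the single JSON resource,
// otherwise the single CSV resource, otherwise nothing.
function selectResource(pkg) {
  // Assumption: the package lists its data files under `resources`.
  const resources = pkg.resources || [];

  for (const format of ['JSON', 'CSV']) {
    // Assumption: `format` is matched case-insensitively.
    const matches = resources.filter(
      (resource) => String(resource.format).toUpperCase() === format
    );
    if (matches.length === 1) {
      return matches[0];
    }
  }

  // No usable resource: the caller would set the dataset status to
  // 'failed' and skip metadata creation.
  return null;
}
```

Note that a package with two JSON resources and exactly one CSV resource falls through to the CSV resource, which follows from the "if not, we proceed to step 3" wording.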
| Field in SDG Metadata | Field in HDX data | Value |
|---|---|---|
| userId | | |
| language | 'en' | |
| resource | | |
| name | package.title, or the name provided on the create request as fallback | |
| description | package.resource.description | |
| sourceOrganization | package.organization.title | |
| dataDownloadUrl | 'https://data.humdata.org' + package.resource.hdx_rel_url | |
| dataSourceUrl | 'https://data.humdata.org/dataset/' + package.name | |
| dataSourceEndpoint | 'https://data.humdata.org' + package.resource.hdx_rel_url | |
| license | Try to match the value of package.license to one of the accepted licenses, fallback to 'Other' | |
| status | 'published' | |
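
To make the table concrete, here is a minimal sketch of the mapping as a function. `buildMetadata` and `matchLicense` are hypothetical names, the list of accepted licenses is passed in because it is not given here, and matching licenses by case-insensitive equality is an assumption. `userId` and `resource` are left out because the table does not specify a source for them.

```javascript
const HDX_BASE_URL = 'https://data.humdata.org';

// Assumption: a license "matches" when it equals an accepted license,
// ignoring case and surrounding whitespace. Anything else becomes 'Other'.
function matchLicense(license, acceptedLicenses) {
  const wanted = String(license || '').trim().toLowerCase();
  const found = acceptedLicenses.find((l) => l.toLowerCase() === wanted);
  return found || 'Other';
}

// Hypothetical sketch: combine the package and the selected resource
// into a metadata object following the table above.
function buildMetadata(pkg, resource, { fallbackName, acceptedLicenses }) {
  return {
    language: 'en',
    // Fall back to the name given on the create request.
    name: pkg.title || fallbackName,
    description: resource.description,
    sourceOrganization: pkg.organization && pkg.organization.title,
    dataDownloadUrl: HDX_BASE_URL + resource.hdx_rel_url,
    dataSourceUrl: `${HDX_BASE_URL}/dataset/${pkg.name}`,
    dataSourceEndpoint: HDX_BASE_URL + resource.hdx_rel_url,
    license: matchLicense(pkg.license, acceptedLicenses),
    status: 'published',
  };
}
```

Passing the accepted licenses as an argument keeps the sketch independent of wherever the real list is defined.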
HDX datasets have tags associated with them, which this connector uses to tag the index datasets. The tags are
loaded from the HDX metadata response using the following JSONPath expression: `$.result.tags[*].display_name`
Additionally, each HDX dataset is tagged with the "HDX API" tag, and with a tag that matches
the RW API application to which it belongs.
No graph tagging is done on HDX datasets.
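
The tagging rules above can be sketched in plain JavaScript, without a JSONPath library, since `$.result.tags[*].display_name` is equivalent to mapping over `result.tags`. `buildTags` and its `applicationTag` parameter are hypothetical names, and removing duplicate tags is an assumption rather than documented behaviour.

```javascript
// Hypothetical sketch: collect the HDX tag names, then append the fixed
// "HDX API" tag and the tag for the owning application.
function buildTags(hdxResponse, applicationTag) {
  // Plain-JS equivalent of the JSONPath $.result.tags[*].display_name
  const hdxTags = ((hdxResponse.result && hdxResponse.result.tags) || [])
    .map((tag) => tag.display_name)
    .filter((name) => typeof name === 'string' && name.length > 0);

  // Assumption: duplicates are dropped, keeping first-seen order.
  return [...new Set([...hdxTags, 'HDX API', applicationTag])];
}
```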