World Countries update #188
💔 Build Failed |
💚 Build Succeeded |
Another option we can check is to replace the Wikidata area and population values with World Bank data. The exported datasets include the ISO3 code, so this should be quite straightforward. The clear benefits are a well-defined origin for the information, data available for several years, an open license, and many more indicators. @nickpeihl let me know if you think we should explore this path and I'll revert to a Draft PR.
I think it's worth exploring sourcing the population data from the World Bank as an authoritative source.
The Admin Regions dataset needs some further work; some ISO3 codes are missing.
On the other hand, the World Bank offers a convenient API to get data per metric, which we can later join using the current Mapshaper workflow.
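As a rough illustration of the per-metric API idea, here is a minimal sketch that builds an indicator request URL and indexes the decoded response by ISO3 code. It assumes the World Bank v2 API shape (a two-element JSON array of paging metadata followed by records carrying `countryiso3code` and `value`) and the indicator codes `SP.POP.TOTL` (population) and `AG.SRF.TOTL.K2` (surface area); the function names and the `per_page` default are hypothetical and not part of this PR.

```python
from typing import Any, Dict, List, Optional

# Assumed base URL of the World Bank v2 API.
WB_API = "https://api.worldbank.org/v2"


def indicator_url(indicator: str, year: int, per_page: int = 400) -> str:
    """Build a request URL for one indicator, all countries, one year."""
    return (
        f"{WB_API}/country/all/indicator/{indicator}"
        f"?format=json&date={year}&per_page={per_page}"
    )


def values_by_iso3(payload: List[Any]) -> Dict[str, Optional[float]]:
    """Index a decoded API response by ISO3 code.

    payload[0] is assumed to be paging metadata and payload[1] the list of
    records (or None when there is no data). Records without an ISO3 code
    are skipped.
    """
    records = payload[1] or []
    return {
        rec["countryiso3code"]: rec["value"]
        for rec in records
        if rec.get("countryiso3code")
    }
```

The resulting mapping could be written to a CSV keyed by ISO3 and joined in Mapshaper, in the same way the Wikidata CSV is joined today.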
Nice work. I did a preliminary review of the data.
The updated world countries dataset is missing the following ISO-2 codes that are in the current production dataset. I suspect many of these countries have no subdivisions in the admin regions dataset.
"AX"
"BQ"
"BV"
"CC"
"CX"
"GF"
"GP"
"MQ"
"RE"
"SJ"
"YT"
You can see this for yourself with comm -23 <(git show master:data/world_countries_v1.geo.json | jq '.features[].properties.iso2' | sort | uniq) <(jq '.features[].properties.iso2' < data/world_countries_v1.geo.json | sort | uniq)
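The same comparison can be sketched in Python for anyone without `jq` at hand. This is only an equivalent of the one-liner above, assuming both files are GeoJSON FeatureCollections whose features carry an `iso2` property; the function names are hypothetical.

```python
from typing import Any, Dict, List, Set


def iso2_codes(feature_collection: Dict[str, Any]) -> Set[str]:
    """Collect the distinct iso2 property values of a FeatureCollection."""
    codes = set()
    for feature in feature_collection.get("features", []):
        code = (feature.get("properties") or {}).get("iso2")
        if code is not None:
            codes.add(code)
    return codes


def missing_codes(old: Dict[str, Any], new: Dict[str, Any]) -> List[str]:
    """Return the codes present in `old` but absent from `new`, sorted.

    This mirrors `comm -23`, which keeps lines unique to the first input.
    """
    return sorted(iso2_codes(old) - iso2_codes(new))
```

To use it, load the production file (for example from the output of `git show master:data/world_countries_v1.geo.json`) and the working-tree file with `json.load`, then pass both dictionaries to `missing_codes`.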
As an alternative, we could use the Admin 0 Countries layer from Natural Earth. The NE Countries and NE States Provinces layers use the same boundary lines for countries, though there might be some minor differences between the EMS layers depending on the Mapshaper simplification settings we use.
💔 Build Failed |
💔 Build Failed |
💚 Build Succeeded |
@nickpeihl thanks for the review! With the last changes we have a complete World Countries dataset in which every record has area and population, except Antarctica and British Indian Ocean Territory. What I did was fill the gaps in the World Bank dataset (for example, both Sudan and South Sudan are empty) with the remaining information from Wikidata. I also added a […]. But then 😅 I checked the CIA Factbook and found this project that parses the website and offers a single JSON with data for all countries. The ISO 3166 code is available in the Internet section.
I like WB data because we don't rely on scraped HTML to source our data, but let me know if the CIA Factbook is still interesting as a replacement for the current WB or Wikidata usage.
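The gap-filling step described in this comment (World Bank first, Wikidata as fallback) could look roughly like the sketch below. It is an illustration only, not the code in this PR: the function name, the dictionary layout keyed by ISO code, and the `population` and `area` field names are assumptions.

```python
from typing import Dict, Iterable, Optional

Metrics = Dict[str, Optional[float]]


def fill_gaps(
    primary: Dict[str, Metrics],
    fallback: Dict[str, Metrics],
    fields: Iterable[str] = ("population", "area"),
) -> Dict[str, Metrics]:
    """Prefer `primary` values and use `fallback` only where they are missing.

    A value counts as missing when the country or the field is absent, or
    when the value is None. Countries found in either source are kept.
    """
    fields = tuple(fields)
    merged: Dict[str, Metrics] = {}
    for iso in sorted(set(primary) | set(fallback)):
        first = primary.get(iso, {})
        second = fallback.get(iso, {})
        merged[iso] = {
            name: first.get(name) if first.get(name) is not None else second.get(name)
            for name in fields
        }
    return merged
```

Records that remain `None` in both sources stay empty in the merged result, so they are easy to list afterwards.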
💚 Build Succeeded |
> I like WB data because we don't rely on scraped HTML to source our data, but let me know if the CIA Factbook is still interesting as a replacement for the current WB or Wikidata usage.
I agree. WB has a proper authoritative API which I prefer.
I also see that my comments here and here are antithetical. I'm starting to reconsider using the admin regions to create the world countries because we are deleting ISO codes that previously existed.
We should make sure metric fields like population and area don't show as join fields in Elastic Maps and region maps in older releases of Kibana.
From your previous #188 (review), these are the removed ISO codes from the PR data, all part of overseas territories from different countries:
💚 Build Succeeded |
@nickpeihl in f4129ef I've moved the dataset to the […]. I still want to check whether the ISO codes mentioned earlier today show up in the GeoIP database. I'll do that next week.
Co-authored-by: Nick Peihl <nickpeihl@gmail.com>
…le-service into 179-world-countries
@nickpeihl sorry for the extra commits. I rebased from master to include the Italy Provinces change, but I should have merged 😓 I've also updated […]
💚 Build Succeeded |
final changes lgtm! thanks.
fixes #179
fixes #164
This PR updates the World Countries dataset with more detailed GeoJSON and TopoJSON datasets. It also updates the Admin Regions dataset to include country data.
The size of the World Countries dataset is mostly controlled by the interval parameter of the simplify command for Mapshaper. Current intervals produce a 1.8MB GeoJSON and a 2.2MB TopoJSON (uncompressed).
The README.md file of the sources/world folder also goes into detail on how to query Wikidata to generate a CSV file with population and area for each ISO2 code, which is then joined with Mapshaper.
This is the visual result for the layer compared with the current production dataset, including actual download size.
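To make the simplify and join steps from the description concrete, here is a minimal sketch that assembles a Mapshaper command line. It relies on Mapshaper's `-simplify interval=`, `-join ... keys=` and `-o format=` options; the function name, file names, and interval value are hypothetical and are not the ones used in this PR, whose real commands are documented in the README.md of the sources/world folder.

```python
from typing import List


def mapshaper_args(
    source: str,
    metrics_csv: str,
    output: str,
    interval: float,
    key: str = "iso2",
    fmt: str = "geojson",
) -> List[str]:
    """Build an argv list for a simplify-then-join Mapshaper run.

    `interval` is passed to -simplify; larger values drop more vertices
    and therefore produce smaller files. `key` names the join field, which
    is assumed to have the same name in the layer and in the CSV.
    """
    return [
        "mapshaper", source,
        "-simplify", f"interval={interval}",
        "-join", metrics_csv, f"keys={key},{key}",
        "-o", f"format={fmt}", output,
    ]
```

The list can be handed to `subprocess.run(args, check=True)` when Mapshaper is installed; running it once per output format with different intervals is one way to compare the resulting file sizes.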