Derived (and augmented) dataset available in JSON, TSV and SQL formats #1281
Comments
I'll point here a few other issues that are solved by my derived dataset:
|
Thanks @cipriancraciun, but the app I'm using throws a validation error when I try to use the JSON file, and online validators also indicate problems with the format. |
@dnalkram What is the actual error you are getting? You could open an issue on my repository (https://github.com/cipriancraciun/covid19-datasets) so that we don't clutter the JHU repository with this. |
In Power BI I get |
@dnalkram Could you paste the URL you've used to download the file? The links in the description above are directly from GitHub, and you must use the raw URL from that page. To keep things simple, at the following link you can find the files on my own site, which shouldn't give you any issues:
(You can use these second links as I usually update them as soon as I push to GitHub.) If you still get errors, download that link and double check that you are actually getting a JSON and not an error HTML page. |
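As a rough illustration of the "double check" suggested above, here is a minimal Python sketch that tells a JSON payload apart from an HTML error page. The function name `classify_download` is hypothetical and the check is only a heuristic (anything that starts with `<` is treated as HTML); it is not part of the dataset's tooling.

```python
import json


def classify_download(text: str) -> str:
    """Return 'json', 'html' or 'unknown' for the body of a downloaded file.

    Heuristic only (an assumption of this sketch): a body whose first
    non-blank character is '<' is treated as an HTML page.
    """
    stripped = text.lstrip()
    # A wrong (non-raw) URL usually yields an HTML page instead of the data.
    if stripped.startswith("<"):
        return "html"
    try:
        json.loads(stripped)
    except json.JSONDecodeError:
        return "unknown"
    return "json"
```

If this reports `html` for a file you expected to be JSON, the URL most likely points at a web page rather than at the raw file.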
That file worked fine, thanks for all the help! |
THANK YOU! I have spent the past 3 days trying to wrangle the CSSE daily reports into a parseable time series, but with the constant changes in data format and inconsistent location naming I was going insane. Looking forward to checking out your data. |
@sbw78 I'm glad I could help you. If you encounter any issues, please open a ticket on my repository and describe the issue. (Given the "quarantine" I usually reply fairly quickly.) :) BTW, in the interim I have also integrated the NY Times dataset and the ECDC one; thus if you are looking for alternative data you can choose one of these. |
Hi Ciprian, |
@ChrisParkerWA I just opened an issue on my repository (cipriancraciun/covid19-datasets#12), where I've proposed to add such a simple format, but perhaps with only a few more values. Please verify my proposal there and let me know if it works for you. |
In the interim I've also added SQL and SQLite DB files for all the datasets. |
Hello @cipriancraciun |
@AmauryVanEspen I've just opened an issue about this feature request on my repository: I would propose moving the discussion there, as this isn't strictly JHU related. (I would be open to such an API, however we must first understand exactly what its use-cases would be.) |
I have written some scripts that take both the time series and the daily report files and output the following two files:
If you want to automate the download (given how GitHub handles URLs to raw files), you can use the links listed on this page.
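For anyone scripting the download, here is a small Python sketch of how a GitHub file page URL maps to its raw counterpart on `raw.githubusercontent.com`. The helper name `to_raw_url` and the example owner, repository and path are hypothetical, and the sketch assumes a branch name without slashes.

```python
from urllib.parse import urlsplit


def to_raw_url(page_url: str) -> str:
    """Convert https://github.com/<owner>/<repo>/blob/<branch>/<path>
    into the matching raw.githubusercontent.com URL.

    Assumption of this sketch: the branch name contains no '/'.
    """
    parts = urlsplit(page_url)
    segments = parts.path.strip("/").split("/")
    if parts.netloc != "github.com" or len(segments) < 5 or segments[2] != "blob":
        raise ValueError(f"not a GitHub file page URL: {page_url}")
    owner, repo, _blob, branch, *path = segments
    return "https://raw.githubusercontent.com/" + "/".join([owner, repo, branch, *path])


# Hypothetical owner, repository and file path, for illustration only.
print(to_raw_url("https://github.com/example-owner/example-repo/blob/master/exports/values.json"))
```

Downloading the page URL itself returns HTML, which is the usual cause of "invalid JSON" errors in tools that fetch the file directly.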
Some plots for these are also available at:
What I've done:
- added an `infected` column, which is computed as `infected := confirmed - deaths - recovered`; (this data is available up to 2020-03-22;)
- for each of the four metrics (`confirmed`, `recovered`, `deaths` and `infected`) I have added four additional metrics (i.e. in total 16 metrics):
  - `absolute_*` -- the original value from the JHU dataset, i.e. cumulative values;
  - `relative_*` -- the metric divided by `confirmed`, as a percentage; (i.e. how many people have recovered out of the total confirmed up to that date;)
  - `delta_*` -- the difference from the previous day; (in the case of `infected` the number can be negative;)
  - `deltapct_*` -- the delta divided by the previous day's value; (i.e. the speed as a percentage;)
- added `day_index_*` columns, which represent the day index since that country / region reached either `1`, `10`, `100`, or `1000` confirmed cases; (this helps align countries and compare them from that point;)

I will update these files twice per day, say at 06 UTC and 12 UTC.
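To make the definitions above concrete, here is a minimal Python sketch of how such columns could be derived for a single location. It is not the actual implementation (the scripts themselves consist mainly of `jq` snippets); the function name `augment`, the input record shape, the exact `day_index_<threshold>` column names, the 0-based day index, and the use of `None` where a value is undefined (first day, or division by zero) are all assumptions made for illustration.

```python
def augment(rows):
    """Derive the metrics for one location.

    `rows` is assumed to be a chronologically ordered list of dicts with
    a 'date' and cumulative 'confirmed', 'deaths' and 'recovered' counts.
    """
    metrics = ("confirmed", "recovered", "deaths", "infected")
    thresholds = (1, 10, 100, 1000)
    first_day = {}   # threshold -> index of the first row that reached it
    previous = None  # metric values of the previous day
    out = []
    for index, row in enumerate(rows):
        values = {m: row[m] for m in ("confirmed", "recovered", "deaths")}
        # infected := confirmed - deaths - recovered
        values["infected"] = values["confirmed"] - values["deaths"] - values["recovered"]
        record = {"date": row["date"]}
        for m in metrics:
            v = values[m]
            record[f"absolute_{m}"] = v
            # Share of the confirmed cases, in percent.
            record[f"relative_{m}"] = (
                100.0 * v / values["confirmed"] if values["confirmed"] else None
            )
            # Difference from the previous day (undefined on the first day).
            delta = v - previous[m] if previous is not None else None
            record[f"delta_{m}"] = delta
            # Delta relative to the previous day's value, in percent.
            record[f"deltapct_{m}"] = (
                100.0 * delta / previous[m] if previous is not None and previous[m] else None
            )
        for t in thresholds:
            if t not in first_day and values["confirmed"] >= t:
                first_day[t] = index
            # Days elapsed since the threshold was first reached (0-based here).
            record[f"day_index_{t}"] = index - first_day[t] if t in first_day else None
        out.append(record)
        previous = values
    return out
```

With the day index in place, two countries can be compared by plotting a metric against, say, `day_index_100` instead of the calendar date.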
Moreover, I have also added, in the same format, the NY Times US dataset and the ECDC one.
The scripts are available in the following repository and consist mainly of `jq` snippets.

If anyone has other ideas about what I can add to these augmented datasets, please let me know.