Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update get_eurostat_json.R #262

Merged
merged 7 commits into from
Jun 20, 2023
Merged

Update get_eurostat_json.R #262

merged 7 commits into from
Jun 20, 2023

Conversation

ake123
Copy link
Contributor

@ake123 ake123 commented Jun 9, 2023

Return informative get_eurostat_json error messages to the user #261

@pitkant
Copy link
Member

pitkant commented Jun 9, 2023

It seems that the problem described in #260 is now solved. Great. However, I couldn't use examples mentioned there to debug this new code. Boo.

I could do something by providing a wrong parameter to the API. E.g. this erroneous query (it uses fromTimePeriod instead of the correct sinceTimePeriod):

get_eurostat("TIPSHO40", filter = list(fromTimePeriod = 2005, untilTimePeriod = 2007))

Procudes this url: https://ec.europa.eu/eurostat/api/dissemination/statistics/1.0/data/TIPSHO40?fromTimePeriod=2020&untilTimePeriod=2021&format=JSON&lang=EN
Which produces the following error:

{ "error": [{"status": 400,"id": 150,"label": "INVALID_QUERY_DIMENSION: Query is invalid as per its structure's definition. Dimension \"FROMTIMEPERIOD\" is not defined"}]}

When I do the query in R, I get the following output:

get_eurostat("TIPSHO40", filter = list(fromTimePeriod = 2005, untilTimePeriod = 2007))
Request failed [400]. Retrying in 1.7 seconds...
Request failed [400]. Retrying in 4 seconds...
Error in get_eurostat_json(id, filters, type = type, stringsAsFactors = stringsAsFactors,  : 
  The requested url cannot be found within the get_eurostat_json function:
                  Client Error - 100 No results found 
                  Description - The requested resource is not available. https://ec.europa.eu/eurostat/api/dissemination/statistics/1.0/data/TIPSHO40?fromTimePeriod=2005&untilTimePeriod=2007&format=JSON&lang=EN

So it actually gets the HTTP status correctly in the old code ("Request failed [400]") but it prints a wrong message after it ("Client Error - 100 No results found").

Could you fix the code contribution so that

  1. The http request would read and save the error message (http status, id and label) from the response.
  2. The message would return the status, id and label that were saved in the memory. Now it's returning hardcoded messages.

Reading error message (http status, id and label) from the response and return the status, id and label that were saved in the memory.
@ake123
Copy link
Contributor Author

ake123 commented Jun 12, 2023

Solved! Feel free to comment!

@pitkant
Copy link
Member

pitkant commented Jun 12, 2023

Thanks, it works quite nicely now:

> get_eurostat("TIPSHO40", filter = list(fromTimePeriod = 2005, untilTimePeriod = 2007))
Request failed [400]. Retrying in 1 seconds...
Request failed [400]. Retrying in 3.1 seconds...
Error in get_eurostat_json(id, filters, type = type, stringsAsFactors = stringsAsFactors,  : 
  Status : 400, error id : 150, label : INVALID_QUERY_DIMENSION: Query is invalid as per its structure's definition. Dimension "FROMTIMEPERIOD" is not defined

Maybe it would be nice to add error message and error id labels in writing as well? Such as 400 Bad request, 100 No results found, 140 Syntax error, 150 Semantic error etc. These would make it easier to google for help and find Eurostat website.

Also, when querying for a dataset that does not exist / is not otherwise available maybe it would be helpful to print the old note stored in msg object:

   msg <- ". Some datasets are not accessible via the eurostat
          interface. You can try to search the data manually from the comext
  	  database at http://epp.eurostat.ec.europa.eu/newxtweb/ or bulk
  	  download facility at
  	  http://ec.europa.eu/eurostat/estat-navtree-portlet-prod/BulkDownloadListing
  	  or annual Excel files
  	  http://ec.europa.eu/eurostat/web/prodcom/data/excel-files-nace-rev.2"

Especially if in some special cases when content(resp)$error is empty / does not exist:

get_eurostat("", filter = list(fromTimePeriod = 2005, untilTimePeriod = 2007))
Error in get_eurostat_json(id, filters, type = type, stringsAsFactors = stringsAsFactors,  : 
  Status : , error id : , label : 

So there could be a check for that as well.

These examples could also be added to eurostat package tests.

@ake123
Copy link
Contributor Author

ake123 commented Jun 13, 2023

Now updated with the recommended changes

R/get_eurostat_json.R Outdated Show resolved Hide resolved
R/get_eurostat_json.R Outdated Show resolved Hide resolved
R/get_eurostat_json.R Outdated Show resolved Hide resolved
Copy link
Member

@pitkant pitkant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reviewed your code with the line-specific comments that should be listed here.

Additionally, the empty status message problem still persists and there are no unit tests checking if things work.

> get_eurostat("", filter = list(fromTimePeriod = 2005, untilTimePeriod = 2007))
Error in get_eurostat_json(id, filters, type = type, stringsAsFactors = stringsAsFactors,  : 
  Status :   , error id :   No results found , label :  

@ake123
Copy link
Contributor Author

ake123 commented Jun 14, 2023

The error message is now changed to

> get_eurostat("", filter = list(fromTimePeriod = 2005, untilTimePeriod = 2007))
Error in get_eurostat_json(id, filters, type = type, stringsAsFactors = stringsAsFactors, :
Failure to get data. Status code: 404 Some datasets are not accessible via the eurostat
interface. You can try to search the data manually from the comext
database athttps://ec.europa.eu/eurostat/comext/newxtweb/ or API data query at
https://wikis.ec.europa.eu/display/EUROSTATHELP/API+Statistics+-+data+query

@pitkant
Copy link
Member

pitkant commented Jun 20, 2023

Alright, thank you for your work, I'll merge the PR

@pitkant pitkant merged commit dcad4e1 into rOpenGov:v4-dev Jun 20, 2023
@pitkant pitkant mentioned this pull request Nov 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants