Suggestions for improving elx_download_xml and elx_make_query #26

Open
@michelemaroni

Description

Hi Michal,

Thanks for releasing v0.4.0. I updated R and eurlex and I am now using it.
I recently used elx_download_xml and wanted to suggest some improvements:

  1. Line 28 is missing its opening quote and should likely be: "notice type must be correctly specified" = notice %in% c("tree", "branch", "object") (this is more of a bug than a suggestion).
  2. file = basename(url) could be file = paste0(basename(url), ".xml"), so that the downloaded file gets an .xml extension.
  3. With the current settings, when object is passed to notice, the object expression notice is retrieved (p. 44 of the Cellar documentation); however, this does not contain metadata. I'd suggest dropping the language header and appending ?language= to the end of the URL when object is passed (p. 42 of the Cellar documentation), so that the object notice with the object metadata is retrieved.
  4. elx_download_xml could wrap a function that returns the XML notice as a string. A user could then decide whether to download the XML notice directly, or to get it as a string and parse it to extract other fields that complement the make_query and run_query functions.
  5. About elx_make_query, do you remember the issue of the 10e6 limit? A workaround/improvement could be to group together multiple values of the same property of a work. E.g. if I pass include_authors = TRUE, it could help to use (group_concat(distinct ?author_;separator=", ") as ?author) in the SELECT clause and OPTIONAL{?work cdm:work_created_by_agent ?author_.} in the WHERE clause of the SPARQL query. The URIs would still be in the result, but I see cleaning them afterwards as less of an issue. This would help avoid duplicated works when running queries.
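
To make point 5 concrete, here is a rough sketch of the kind of query I have in mind. The helper name build_grouped_author_query is made up for illustration and is not part of eurlex, and the resource-type filter (directives) and the LIMIT are only assumptions so the example is complete. Note that an aggregate such as group_concat also needs a GROUP BY on the non-aggregated variable.

```r
# Hypothetical helper (not part of eurlex): builds a SPARQL query that
# returns one row per work, with all authors collapsed into one column.
build_grouped_author_query <- function(limit = 10) {
  paste(
    "PREFIX cdm: <http://publications.europa.eu/ontology/cdm#>",
    # Collapse every ?author_ value of a work into a single ?author string
    "SELECT ?work (group_concat(distinct ?author_; separator=\", \") AS ?author)",
    "WHERE {",
    # Assumption for illustration: restrict the query to directives
    "  ?work cdm:work_has_resource-type <http://publications.europa.eu/resource/authority/resource-type/DIR> .",
    "  OPTIONAL { ?work cdm:work_created_by_agent ?author_ . }",
    "}",
    # Required because ?work is selected next to an aggregate
    "GROUP BY ?work",
    sprintf("LIMIT %d", as.integer(limit)),
    sep = "\n"
  )
}

query <- build_grouped_author_query(limit = 5)
cat(query)
```

The resulting string could then be sent to the endpoint in the usual way. Each work would appear once, with its author URIs in a single comma-separated field.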

What do you think about these?
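
In case it helps, points 1 and 2 could look roughly like the sketch below. The function xml_file_name is a hypothetical stand-in, not the actual body of the package function, and the url check is my own assumption. It only shows the corrected named check and the .xml suffix (paste0 rather than paste, since paste would insert a space before the extension).

```r
# Hypothetical stand-in (not the real package code): validates the
# arguments and derives the default file name for the downloaded notice.
xml_file_name <- function(url, notice) {
  stopifnot(
    # Assumed check: url must be one character string
    "url must be specified" = is.character(url) && length(url) == 1,
    # Corrected check from point 1, with the opening quote restored
    "notice type must be correctly specified" =
      length(notice) == 1 && notice %in% c("tree", "branch", "object")
  )
  # Point 2: append the extension without a separating space
  paste0(basename(url), ".xml")
}

file <- xml_file_name(
  "http://publications.europa.eu/resource/celex/32014R0001",
  notice = "branch"
)
print(file)  # "32014R0001.xml"
```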

All the best
