Hi Michal,
Thanks for releasing v0.4.0. I updated R and eurlex and I am using it.
I recently used `elx_download_xml` and I wanted to suggest some improvements:
- Line 28 should likely be: `"notice type must be correctly specified" = notice %in% c("tree", "branch", "object"))`
- (This is more of an issue.) `file = basename(url)` could be `file = paste0(basename(url), ".xml")`, so that the downloaded file gets an `.xml` extension.
- With the current settings, when `object` is passed to `notice`, the object expression notice is retrieved (p. 44 of cellar). However, this does not contain metadata. I'd suggest dropping the language header and using `?language=` at the end of the URL when `object` is passed (p. 42 of cellar), so that the object notice with the object metadata is retrieved.
- `elx_download_xml` could encapsulate a function that returns the XML notice as a string. A user could then decide whether to download the XML notice directly, or to get the XML notice as a string and parse it to extract other fields and complement the make_query and run_query functions.
- About `elx_make_query`: do you remember the issue of the 10e6 limit? A workaround/improvement could be to group together multiple items of the same property of a work. E.g., if I pass `include_authors = TRUE`, it could help to use `(group_concat(distinct ?author_;separator=", ") as ?author)` in the `select` statement and `OPTIONAL{?work cdm:work_created_by_agent ?author_.}` in the `where` statement of the SPARQL query. The URI would still be inside the value, but I see cleaning it up afterwards as less of an issue. This would help avoid duplicated works when running queries.
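
To make the last point more concrete, here is a rough sketch of what such a grouped query and the later clean-up could look like. It is not how `elx_make_query` builds its queries: the `?work cdm:work_has_resource-type ?type` pattern is only a placeholder for whatever the function already generates, the `GROUP BY ?work` clause is my assumption about what the aggregation would need, and the author URIs at the end are made-up illustrative values.

```r
# Sketch only: a hand-written SPARQL string, not the output of elx_make_query().
# The first triple pattern is a placeholder (assumption); the group_concat and
# OPTIONAL lines are the ones suggested above.
query <- paste(
  "PREFIX cdm: <http://publications.europa.eu/ontology/cdm#>",
  "SELECT ?work",
  "  (group_concat(distinct ?author_; separator=\", \") as ?author)",
  "WHERE {",
  "  ?work cdm:work_has_resource-type ?type .",
  "  OPTIONAL { ?work cdm:work_created_by_agent ?author_ . }",
  "}",
  "GROUP BY ?work",  # aggregates need the non-aggregated variables grouped
  sep = "\n"
)
cat(query, "\n")

# After running the query, each work has a single row whose ?author value is a
# comma-separated list of URIs. Stripping everything up to the last "/" of each
# URI leaves only the short codes. The input below is illustrative, not real
# query output.
authors_raw <- paste(
  "http://publications.europa.eu/resource/authority/corporate-body/EP",
  "http://publications.europa.eu/resource/authority/corporate-body/CONSIL",
  sep = ", "
)
authors_clean <- gsub("http://[^ ,]*/", "", authors_raw)
print(authors_clean)
```

With one row per work, the author column needs one `gsub()` pass, and there are no duplicated works to collapse afterwards.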
What do you think about these?
All the best