code duplication issue between rdflib and pymicrodata #582
Comments
This is a "known issue". At the time, @iherman wanted to keep the code BOTH in a separate repo and in RDFLib, due to the way pymicrodata (and it's exactly the same for the rdfa code) was deployed at w3c. I was skeptical, but he assured me he would keep them in sync :) Since I am a "unionist" and like having all the stuff in one repo (not the npm disease :) ), I vote for removing the separate project and keeping only the rdflib one.
hmm, while https://github.com/RDFLib/rdflib/tree/master/rdflib/plugins/parsers/pyMicrodata and https://github.com/RDFLib/pymicrodata/tree/master/pyMicrodata are clearly meant to be the same, i'm not entirely sure what to do about the docs and scripts in https://github.com/RDFLib/pymicrodata . sounds like a clear argument for a separate package to me. The main problem with that would be docs and intra-package tests, as we recently experienced with SPARQLWrapper (a change there broke rdflib's tests quite a while later...)
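To make "clearly meant to be the same" checkable, here is a minimal sketch of a drift check between two local checkouts of the parser directory. It uses only the standard library `filecmp` module; the function name `find_drift` and the idea of running it against two checkout paths are assumptions for illustration, not something either repository ships.

```python
import filecmp
import os


def find_drift(dir_a, dir_b, prefix=""):
    """Return sorted relative paths that differ between two directory trees.

    A path is reported if it exists on one side only, or if it exists on
    both sides with different contents. (Hypothetical helper, not rdflib API.)
    """
    cmp = filecmp.dircmp(dir_a, dir_b)

    # Files or directories present in only one of the two trees.
    drift = [os.path.join(prefix, name) for name in cmp.left_only + cmp.right_only]

    # dircmp compares by os.stat() signature by default, so re-check the
    # common files byte by byte with shallow=False.
    _, mismatch, errors = filecmp.cmpfiles(
        dir_a, dir_b, cmp.common_files, shallow=False
    )
    drift += [os.path.join(prefix, name) for name in mismatch + errors]

    # Recurse into sub-directories that exist on both sides.
    for sub in cmp.common_dirs:
        drift += find_drift(
            os.path.join(dir_a, sub),
            os.path.join(dir_b, sub),
            os.path.join(prefix, sub),
        )
    return sorted(drift)
```

Something like `find_drift("rdflib/rdflib/plugins/parsers/pyMicrodata", "pymicrodata/pyMicrodata")` (paths relative to wherever the two repos are cloned) could run in CI and fail when the list is non-empty. It would only detect drift, not decide which copy is right.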
The reasons I do not want to formally merge are internal to the way things are deployed at W3C and to the time I devote to this these days, to be frank. For example, I use the RDFLib installed by the system team on the service, which is never 100% up to date. Etc. Let us close this particular issue... (The same holds for the RDFa service vs parser, b.t.w.)
I appreciate your concern about deployment at w3c @iherman, but RDFLib has many, many other users than w3c, and there have already been many problems because of this slightly odd setup. All RDFLib maintainers work on it voluntarily in their spare time. I am very much interested in having my hobby be as smooth and painless to work on as possible, and I cannot really set aside extra time to do things awkwardly to support the deployment politics of the W3C. We need to solve this issue somehow NOW. If the easiest way for you to do any pyMicrodata/RDFa work is to keep them in separate projects, let's make them RDFLib plugins. We also discussed #391. I was always against re-separating the projects, since 1. it was a bit of work to move them together, and it'll be work again to take them apart, work that could be better spent actually fixing bugs and implementing features (re: my hobby again, it is making the library that makes working with RDF data as easy as possible, not refactoring), and 2. I remember the massive support effort it was to explain rdflib-sparql etc. to everyone. Number 2 may well be better now: more people (hopefully) use pip or similar, and everyone is already used to npm etc. and installing 10,000 small packages to get anything done.
;)
I believe that boat has sailed, insofar as I think it would be a major mistake to remove the parsers from the core RDFLib distribution (users probably did not bother installing separate libraries if it was part of the core). Whether it was a mistake to incorporate them into the core or not back then is an academic discussion. (And, frankly, I do not even remember how we got there, it was eons ago for me, having moved out of the active RDF world...) We can declare that the separate microdata and RDFa libraries are closed and not maintained any more, directing people towards the core distribution, and solve possible problems only for the embedded parsers. I am fine with that; if there are problems popping up in those versions it will become my problem (or the problem of those who take over the maintenance at W3C if I move on). I would probably clone those repositories under my name simply to maintain history and use version control for my own purposes. (B.t.w., the separate versions are not plugins. They contain the same parsing library, and an application layer on top, but I am not sure they contain the plugin 'binding'. In this sense, they are utilities on top of RDFLib rather than plugins.)
@RDFLib/owners are there any use cases for installing rdflib without pip, conda or any other package manager?
ok, so as you might've noticed we're nearing an end wrt. #443. As the new parser changes the interface and returns different results when parsing the same file, i have to make it a new major release of rdflib in terms of http://semver.org ... i'm not afraid of bumping that number, but as you might have noticed, this causes me to move a lot of issues to milestone 6.0.0 (such as this one), as they on their own have the potential to change rdflib in a backwards incompatible way. As a new major release is a signal to system maintainers that things changed in a backwards incompatible fashion, doing this too often is not really good style. It's also a very strong reason to externalize individual changing parsers into sub-packages again, so that users can keep old parsers with an updated rdflib core and not break their parsing results... The obvious downside of that is that splitting causes a lot of redundant work (making sub-packages, setting up travis, where to run the tests, specifying which version combinations are supported, etc.) and divides the few developers even further. I'm still undecided what's best here, also "best" relies on how often upstream format specs change in a backwards incompatible way... let's hope they rarely do and this was an exception...
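To illustrate the semver point above, here is a minimal sketch of how a system maintainer's tooling might treat a major-version bump as the "things may break" signal. The helper names `parse_version` and `is_breaking_upgrade` are hypothetical, and the sketch assumes plain `MAJOR.MINOR.PATCH` strings; it ignores pre-release tags and the special rules semver has for `0.x` versions.

```python
def parse_version(version):
    """Split a plain 'MAJOR.MINOR.PATCH' string into a tuple of ints."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def is_breaking_upgrade(installed, candidate):
    """Return True if moving from `installed` to `candidate` crosses a
    major version, which under semver signals backwards-incompatible changes."""
    return parse_version(candidate)[0] > parse_version(installed)[0]


# A parser change that alters results for the same input forces a major bump,
# while a backwards-compatible feature only bumps the minor number.
print(is_breaking_upgrade("5.0.0", "6.0.0"))
print(is_breaking_upgrade("5.0.0", "5.1.0"))
```

If the parsers lived in their own sub-packages, the same check would apply per package, so an incompatible parser change would bump only that package's major number rather than the core's.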
I believe this is an exception, at least for these features. On the RDFa side, RDFa1.1 is a Recommendation. This means changing it is permitted only if a fully new Working Group is formed, and the charter of that one may require backward compatibility (just as RDFa1.1 is backward compatible with RDFa1.0, even if some RDFa1.0 features are now obsolete). Microdata is a little bit different, insofar as that is only a Note. For various reasons it is not a Rec, and it will not be, primarily because the HTML WG has not published it as a Recommendation. The update of the Note was done to align it with the way schema.org uses microdata; taking into account that schema.org is the only major user of microdata, but also that it has a major deployment by now, I do not expect anything will be changed in a backward incompatible way either. Of course, I cannot guarantee anything :-) Thanks!
This was closed via #828 and is in the RDFLib 5.0.0 release, but this issue is tagged as 6.0.0.
There seems to be a code duplication problem between https://github.com/RDFLib/rdflib and https://github.com/RDFLib/pymicrodata which led to problems around #443 before. There are several ways to solve this:
Once #443 is solved, i'd love to move that tick mark to some other box...