Common format for metadata and test specifications #3811
Comments
I would go for option 2, probably with a yaml-ld based representation. We can ask people to annotate the block in a certain manner, e.g.,

I would only support the test part when a new w3id is created or updated: there used to be a test for all w3ids, and it kept failing on older, unrelated w3ids that are currently failing for some other reason.

The other issue we may encounter is that if a repository has folders, some of the sub w3ids may also be maintained via the parent README, or they may have their own READMEs with new maintainers.

Finally the w3id
Well, I think people just follow the instructions we put in: https://github.com/perma-id/w3id.org#creating-a-new-identifier :)
As @dgarijo says,
I think it's good for those
@simontaurus thanks for the recap. I actually prefer option 1 over the others, especially over option 2. I know that preferring a loosely structured, non-semantic format is probably not a popular opinion here 😄. I'd also add that we can simplify even more, without a new format: if we are willing to be GitHub/GitLab specific,
Looking at option 2: it has the same con as option 1, but worse: we'd still need a regexp to extract that block, with a way more complex syntax within the block (e.g. we'd need to escape

Regarding the rule testing issue, I didn't investigate it in depth, but it makes sense to me to have the tests close to the actual rule.
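To make the extraction step for option 2 concrete, here is a rough Python sketch of what "regexp to extract that block" could look like. It assumes the metadata sits in a fenced code block whose info string is `json-ld`; the `@type` value `Maintainer` and the `name` key in the usage note are hypothetical, nothing in this thread defines them.

```python
from __future__ import annotations

import json
import re

# Built at runtime so this snippet can itself be embedded in a fenced block.
FENCE = "`" * 3

# Assumption: metadata blocks are fenced code blocks tagged "json-ld".
BLOCK_RE = re.compile(FENCE + r"json-ld[ \t]*\n(.*?)\n" + FENCE, re.DOTALL)


def extract_jsonld(readme_text: str, wanted_type: str) -> list[dict]:
    """Return all JSON-LD objects in the README whose @type matches."""
    found = []
    for match in BLOCK_RE.finditer(readme_text):
        try:
            data = json.loads(match.group(1))
        except json.JSONDecodeError:
            # A real tool would report this back to the maintainer.
            continue
        if not isinstance(data, dict):
            continue
        types = data.get("@type", [])
        if isinstance(types, str):
            types = [types]
        if wanted_type in types:
            found.append(data)
    return found
```

Calling `extract_jsonld(readme_text, "Maintainer")` would return the matching blocks as plain dicts. Note that the regexp is deliberately naive (it does not handle empty blocks or nested fences, for instance), and a malformed block is silently skipped here, which is exactly the kind of case a validator would have to surface.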
I feel something this "complex" could lead us into a world of pain with maintenance. For starters, we'd need a validator, and even then, I bet the support requests will increase. "Why doesn't it work? Oh, there's a comma missing. Oh, it's

Let's keep in mind that sometimes those files are maintained by people new to the world of semantic data, or to the world of open collaboration for that matter.
I think READMEs are really useful for describing the context and for serving as a sort of "landing page" for the path. I guess there should be a global rule to not publish them, but having them nicely formatted and editable within the GitHub UI makes them a great answer to "what is this abstract/technical stuff all about?".
As discussed in #3786 and #3801, there is a need for a common format for test specifications and metadata.
At least three options are available, each with pros and cons:
1. Custom inline format in the source files
   Example: ##TESTv1 '/mypath --header "Accept: text/html"' "https://my-target-domain.com/test.html" in the .htaccess, or maintainer:https://github.com/abc123 in the README.md file.
   Pro: Minimalistic, no duplication; users just extend existing files and add information in intuitive locations
   Con: Regex-based machine readability is limited and may lead to errors
2. Structured inline format in the README.md file
   Microformat as suggested by bfabio. Should be RDF serialized, e.g. as json-ld or yaml-ld, since we are already in the linked data domain.
   Example:
   Pro: Embedded data is machine readable, no custom format, no additional file, data can be stored in multiple locations in the document
   Con: More complex; embedded data needs to be extracted from the README.md file (e.g. filtering all code blocks with format json-ld, parsing, filtering by @type)
3. Like 2., but in a meta.json or meta.yml file.
   Pro: Machine readable without custom parsing, data could also be fetched by any linked data crawler
   Con: Additional file to maintain, separation from other documentation, duplication of information
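For comparison, the custom inline format from option 1 could be read without a hand-written regexp by leaning on shell-style quoting rules. The Python sketch below is only an illustration: the field meaning (request path, extra curl-style arguments, expected redirect target) is my reading of the example line, and `InlineTestSpec` and `parse_test_lines` are hypothetical names.

```python
from __future__ import annotations

import shlex
from dataclasses import dataclass

PREFIX = "##TESTv1"


@dataclass
class InlineTestSpec:
    path: str               # request path, e.g. "/mypath"
    extra_args: list[str]   # remaining request options, e.g. a header flag
    expected_target: str    # URL the rule is expected to redirect to


def parse_test_lines(htaccess_text: str) -> tuple[list[InlineTestSpec], list[str]]:
    """Collect ##TESTv1 lines; return (parsed specs, error messages)."""
    specs, errors = [], []
    for lineno, line in enumerate(htaccess_text.splitlines(), start=1):
        line = line.strip()
        if not line.startswith(PREFIX):
            continue
        try:
            # Outer split: quoted request spec, then quoted target URL.
            fields = shlex.split(line[len(PREFIX):])
            if len(fields) != 2:
                raise ValueError(f"expected 2 fields, got {len(fields)}")
            # Inner split: path followed by optional arguments.
            request = shlex.split(fields[0])
            if not request:
                raise ValueError("empty request")
        except ValueError as exc:
            errors.append(f"line {lineno}: {exc}")
            continue
        specs.append(InlineTestSpec(request[0], request[1:], fields[1]))
    return specs, errors
```

A line with an unbalanced quote ends up in `errors` rather than being silently ignored, which speaks to the "may lead to errors" con: the format stays minimal, but mistakes still need to be reported to whoever edits the file.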
@davidlehn, @bfabio: What do you think?