Reorganise docs #368
Use the 'home.md' file, renamed to 'index.md', as the "index page" (that is, the page shown when the visitor does not request an explicit page), instead of the LinkML-generated index page (which is not suitable to be the first thing the visitor sees upon arriving on the website). The LinkML-generated index page is renamed to 'linkml-index.md'. The name does not really matter, as long as it is *not* 'index.md'. This requires updating LinkML to version 1.7.0 at least, because prior versions hardcoded the name of the generated index page to 'index.md', thereby forcing that page to be the site index page (unless the web server is configured differently). This in turn requires slightly bumping the minimum Python version from 3.8 to 3.8.1.
The contents of the "About" page (about.md) were redundant with the index page, so we remove it. We rewrite the index/about page to:
* separate the description of the standard from the description of what the Mapping Commons project does (the previous description conflated the two; for example, providing reference tools and software libraries is *not* part of the standard; it is part of the efforts to promote the use of the standard);
* fix the basic description of what an SSSOM mapping is, and also add a mention of what a "mapping set" (the second most important core concept) is;
* slightly re-organise the list of "quick links".
Update the index page to:
* add the SSSOM logo on top (it makes more sense to put it there than at the top of the "overview" page; visitors will see it first);
* rename the top section to "SSSOM at a glance", and add to it an example of a file in the SSSOM/TSV format; the idea of that section is to give a quick overview of what we are talking about (so that readers can decide immediately whether SSSOM is what they were looking for);
* add information about the team (contact and list of editors/contributors);
* add an acknowledgements section for listing funding sources and significant contributions.
The last two points are moved from other pages of the doc (notably contact.md, credits.md). Better to have them on the first page so that they are out of the way.
Split the existing "spec.md" file into two components:
- a general introduction on mappings;
- the actual specification, which is itself split in several parts:
  - the specification of the data model;
  - the specification of the serialisation formats.
This commit creates placeholder files that will hold those different sections. The "general introduction" file is pre-filled with the contents of the "Introduction" section of the original "spec.md" file. (I believe that introduction should be entirely re-written from scratch, as it sometimes reads like a patchwork of unrelated pieces pasted together. But that will be for later work. The most urgent need for now is to have a place where we can write the actual *specification* with all its details.)
We update the link to the logo used at the top of the index page to point to a local copy of the logo, rather than to its original online location. This insulates the documentation from any unexpected change in said original remote location.
Replace the placeholder links on the index page with actual links to the appropriate sections of the documentation.
Add an introductory paragraph at the beginning of the "specification" section, along with a paragraph (copied from BCP 14) explaining the meaning of the MUST/SHOULD/etc. keywords. Add a general overview of the data model and a subsection explaining what the "propagatable slots" are.
At the beginning of the specification, we add a table with the list of all the prefix names that are used throughout the specification. This will also act as the list of "built-in" prefix names, which will be referred to from the spec of the SSSOM/TSV format.
In the section about the data model, we add a list of the mapping predicates that are considered common and that are recommended. This is mostly taken from the old spec.md document, except that we also mention the predicates defined in the SEMAPV vocabulary, which the old spec did not mention at all.
Add a complete and workable specification for the SSSOM/TSV format. This is an original version that takes very little from the old spec.md document, except the examples. The specification for the OWL/RDF format, on the other hand, is directly taken from the old spec, almost "as is". The "specification" for the JSON format is currently merely a placeholder, since that format is NOT specified for now.
All bits of the old spec.md document have now been moved (or rewritten) elsewhere, so we can remove it.
The `code_of_conduct.md` file is an exact duplicate of `contributing.md`, so we remove it.
Fix some typos, missing words, and inconsistent names of placeholder variables in the spec.
Absolutely fantastic work @gouttegd
I think our main disagreement right now is the condensation requirement. I love that the concept of condensation was introduced, I love the name, and I love the specification - I just disagree with mandating it in "strict" mode at the moment, or encouraging it in "non-strict" mode. (Unless, of course, I have already agreed to it elsewhere and just forgot about it)
I hope we did
When showing examples of SSSOM/TSV files (on the index page and in the spec for the SSSOM/TSV format), use the mapping set from the "basic tutorial".
Add a section listing YAML features that MUST NOT be used in the metadata section of the SSSOM/TSV file. Those features are not uniformly supported even among high-quality YAML implementations and do not bring much. SSSOM is supposed to be _simple_, so we forbid them entirely.
Add a requirement that condensation, when supported, MUST be deactivatable. Also clarify that propagation and condensation go together, so that an implementation that supports one MUST support the other.
Amend the canonical rule for serialising floating point values to use UP TO 3 digits after the decimal point AS NEEDED. That is, if more than 3 digits would be needed to write the value, then the writer MUST truncate after the third digit, but if the value can be written (without loss of precision) with fewer than 3 digits, the writer MUST NOT right-pad the value with zeroes. So a value like 0.9 is to be written as "0.9", NOT as "0.900".
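As a rough illustration of that rule, here is a minimal Python sketch. The function name `serialise_float` is hypothetical (it is not taken from SSSOM-Py or the spec), the sketch only considers non-negative values, and writing a whole number without any decimal point is an assumption that the commit message does not settle.

```python
from decimal import Decimal, ROUND_DOWN


def serialise_float(value: float) -> str:
    """Write a non-negative float with up to 3 decimals, truncating, no padding.

    Hypothetical helper for illustration only.
    """
    # Go through str() so that 0.9 becomes Decimal("0.9") rather than the
    # long binary expansion of the float.
    exact = Decimal(str(value))
    # Truncate (do not round) after the third digit.
    truncated = exact.quantize(Decimal("0.001"), rounding=ROUND_DOWN)
    text = format(truncated, "f")
    # Drop right-padding zeroes; assumption: also drop a dangling ".".
    if "." in text:
        text = text.rstrip("0").rstrip(".")
    return text


print(serialise_float(0.9))      # 0.9
print(serialise_float(0.12345))  # 0.123
print(serialise_float(0.9999))   # 0.999
```

Note that 0.9999 is written as "0.999" because the rule asks for truncation, whereas a `"%.3f"`-style format would round it up to "1.000".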
I approve:
- The reorganisation overall
- The general sentiment expressed by the spec-formats sections that are new
I reserve the right to revisit details on the implementation side if (and only if) putting them into action would result in a violation of current common practice. I don't think there is anything, I just want to be super transparent.
Thank you @gouttegd I am happy with this. When you are happy, what do you suggest:
- Merge bypassing the need for a second review on the grounds that little new is added to the spec (only clarifications and interpretation of the spirit)
- Me to find a second reviewer to fulfill the 2-reviewer requirement
I will leave the choice to you, I am ok with either.
Given that the “reorganisation of the docs” part does not actually change any content (it merely puts the doc in a shape that will make it easier to work on), I am fine with that part not being reviewed by a second reviewer. For the I don’t foresee any problem since the new spec should be fully compatible with existing behaviours in SSSOM-Py. What the new spec does add:
I feel responsible for the SSSOM-Py implementation, even though the majority of the work has been done by @hrshdhgd. @hrshdhgd - feel free to review the file called Thanks!
This is so thorough, I love it! Thank you so much @gouttegd for putting this together. It certainly was a lot of hard work and we sincerely appreciate it!
I don't think I follow the condensation and propagation concepts though. Could either of you provide examples so I understand what to implement?
Let’s consider the following set:
The set-level metadata contain a value for the So in this example, all mappings should be considered to have a Propagation is the act of taking the values of propagatable slots at the set level, and filling the corresponding slots in each individual mapping. After propagation, the above set should look like this:
Notice that the set no longer has a Also note that the value of the
Condensation is the exact opposite of propagation. It takes the values of “propagatable slots” that are set on the mappings, and moves them (if possible, that is, if all mappings have the same value) to the level of the set instead. For example, to condense the second example from my previous message, you would observe that all mappings have the same value for the
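To make the two operations concrete, here is a minimal Python sketch working on plain dictionaries. It is not the SSSOM-Py implementation: the function names, the dictionary layout, and the use of `mapping_tool` as the example propagatable slot are illustrative assumptions. How to treat a slot that some mappings already fill with their own value, or that the set already fills with a different value, is not covered in the messages above; the sketch simply leaves such slots untouched.

```python
def propagate(set_meta, mappings, propagatable):
    """Move propagatable slot values from the set down to every mapping."""
    set_meta = dict(set_meta)
    mappings = [dict(m) for m in mappings]
    for slot in sorted(propagatable):
        if slot not in set_meta:
            continue
        # Assumption: skip the slot if any mapping already has its own value.
        if any(slot in m for m in mappings):
            continue
        value = set_meta.pop(slot)  # the set no longer carries the slot
        for m in mappings:
            m[slot] = value
    return set_meta, mappings


def condense(set_meta, mappings, propagatable):
    """Move slot values shared by all mappings up to the set level."""
    set_meta = dict(set_meta)
    mappings = [dict(m) for m in mappings]
    if not mappings:
        return set_meta, mappings
    for slot in sorted(propagatable):
        if not all(slot in m for m in mappings):
            continue
        first = mappings[0][slot]
        if any(m[slot] != first for m in mappings):
            continue  # values differ, so the slot cannot be condensed
        # Assumption: do not overwrite a different set-level value.
        if slot in set_meta and set_meta[slot] != first:
            continue
        set_meta[slot] = first
        for m in mappings:
            del m[slot]
    return set_meta, mappings


# Illustrative data only; identifiers and values are made up.
PROPAGATABLE = {"mapping_tool"}
set_meta = {"mapping_set_id": "https://example.org/sets/demo",
            "mapping_tool": "example-matcher"}
mappings = [
    {"subject_id": "A:1", "predicate_id": "skos:exactMatch", "object_id": "B:1"},
    {"subject_id": "A:2", "predicate_id": "skos:exactMatch", "object_id": "B:2"},
]

meta_p, maps_p = propagate(set_meta, mappings, PROPAGATABLE)
meta_c, maps_c = condense(meta_p, maps_p, PROPAGATABLE)
```

In this example, condensing the propagated set gives back the original set. A writer that wants condensation to be deactivatable, as required earlier in this PR, could simply make the call to `condense` conditional on a user-facing option.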
That makes perfect sense! Thank you for explaining this patiently and perfectly @gouttegd! I truly appreciate it.
Resolves [#305]
- [x] `docs/` have been added/updated if necessary
- [x] `make test` has been run locally
- [ ] tests have been added/updated (not applicable)
- [x] [CHANGELOG.md](https://github.com/mapping-commons/sssom/blob/master/CHANGELOG.md) has been updated.
If you are proposing a change to the SSSOM metadata model, you must
- [ ] provide a full, working and valid example in `examples/` (**not applicable**: no new example needed as the change only affects how some slots should be interpreted; it does not add or remove slots, nor does it change how the propagated slots are used)
- [x] provide a link to the related GitHub issue in the `see_also` field of the linkml model
- [ ] provide a link to a valid example in the `see_also` field of the linkml model (**not applicable**, same reason as above)
This PR finalises the fix to #305, by explicitly specifying, directly within the LinkML model, which slots are considered “propagatable” (previously this was only informally described in the spec, since #368). This is done by:
* adding a “metamodel extension class” (`sssom:Propagatable`) with a single boolean-ranged attribute `propagated`;
* amending the slots that must be considered propagatable by making them instantiate the `sssom:Propagatable` extension.
Resolves [#330]
- `docs/` have been added/updated if necessary
- `make test` has been run locally
This PR reorganises the documentation, especially the specification part, as suggested in #330.
More precisely:
`spec.md` document)
The “resources for users” section is left untouched for now. The urgent part was reorganising the specification, so that we can start enriching it to make it ready for 1.0.