harmonize ORE, schema.org, and CodeMeta in DataONE packages #11

Open
@mbjones


This issue is to discuss and decide on an architectural approach for including CodeMeta metadata documents in a DataONE data package, alongside other metadata such as EML or ISO that may be present to document the data. This is useful when the package contains software, such as R or Python scripts, in addition to data.

CodeMeta is a profile of schema.org, and is being harmonized to be completely congruent with schema.org. So this discussion really revolves around how to integrate schema.org into DataONE packages, which would be nice considering that we already provide schema.org metadata in our dataset landing pages. The https://schema.org/Dataset structure in our landing pages is in many ways conceptually aligned with our ORE data package model, where schema:Dataset plays the same role as ore:Aggregation. Our ORE files carry other similar metadata as well, such as dc:identifier, along with PROV statements such as prov:wasDerivedFrom or prov:used.

Consequently, we could easily consider our landing-page serialization of schema.org metadata to be a JSON-LD version of our current ORE package format. In fact, if we were to support JSON-LD as a serialization format for our packages, then the schema.org, CodeMeta, ORE, and PROV vocabularies could all be present and used in the same document, so a single JSON-LD package description could be read both as an ORE document and as a schema.org document. My proposal, therefore, is that we:

  • support data package descriptions in both ORE (rdf+xml) and schema.org (JSON-LD) formats
  • index schema.org documents from the data package as resource maps, and let them play the same role as ORE documents
  • provide a converter that can take the package description in either ORE or JSON-LD format and convert it to the other serialization
  • embed all of this metadata in our schema.org landing page as JSON-LD, following the Science on Schema.org guidelines
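
To make the idea of one document carrying several vocabularies concrete, here is a minimal sketch in Python of what such a JSON-LD package description might look like. It is an illustration only, not a proposed DataONE format: the `build_package` helper, the example.org identifiers, the use of schema:hasPart and schema:DataDownload for members, and the `dcterms` prefix (standing in for the dc:identifier mentioned above) are all assumptions made for the example.

```python
import json

# Namespace IRIs for the vocabularies discussed above.
CONTEXT = {
    "schema": "https://schema.org/",
    "ore": "http://www.openarchives.org/ore/terms/",
    "prov": "http://www.w3.org/ns/prov#",
    "dcterms": "http://purl.org/dc/terms/",
}


def build_package(package_id, data_id, script_id, source_id):
    """Return a JSON-LD dict describing one package with mixed vocabularies.

    Hypothetical helper: the choice of properties is illustrative only.
    """
    return {
        "@context": CONTEXT,
        "@id": package_id,
        # The same node is typed as both a schema.org Dataset and an ORE Aggregation.
        "@type": ["schema:Dataset", "ore:Aggregation"],
        "dcterms:identifier": package_id,
        # ORE view: the package aggregates its members.
        "ore:aggregates": [{"@id": data_id}, {"@id": script_id}],
        # schema.org view of the same members, with PROV and software metadata.
        "schema:hasPart": [
            {
                "@id": data_id,
                "@type": "schema:DataDownload",
                "prov:wasDerivedFrom": {"@id": source_id},
            },
            {
                "@id": script_id,
                # CodeMeta is a profile of schema.org, so software is described
                # with schema:SoftwareSourceCode.
                "@type": "schema:SoftwareSourceCode",
                "schema:programmingLanguage": "R",
            },
        ],
    }


if __name__ == "__main__":
    doc = build_package(
        "https://example.org/package/1",
        "https://example.org/package/1/data.csv",
        "https://example.org/package/1/analysis.R",
        "https://example.org/raw/source.csv",
    )
    print(json.dumps(doc, indent=2))
```

Because the package node carries both types, an ORE-aware consumer could follow ore:aggregates while a schema.org-aware consumer follows schema:hasPart, which is the dual reading described above. A real converter would more likely work on the RDF graph than on hand-built dictionaries.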

This would allow us to integrate package metadata, schema.org metadata, PROV metadata, and CodeMeta metadata in one coherent model. We've been discussing specific guidelines for this with @atn38, @srearl, @twhiteaker, and others from EDI and LTER, and I will include that conversation in the next comment for reference.

Labels: enhancement, help wanted, question
