
doi registrant code is too restrictive in the schema #910

Closed
@schwehr


In extensions/scientific/json-schema/schema.json:

"sci:doi": {
          "type": "string",
          "title": "Data DOI",
          "pattern": "^(10[.][0-9]{4,}(?:[.][0-9]+)*/(?:(?![%\"#? ])\\S)+)$"
        }, 

The registrant code part of the pattern is too narrow: [0-9]{4,} only accepts four or more digits.

https://www.doi.org/overview/DOI_article_ELIS3.pdf

a unique alphanumeric string assigned to an organization
that wishes to register DOI names (four digit numeric codes
are currently used though this is not a compulsory syntax).
The registrant code is assigned through a DOI registration
agency, and a registrant may have multiple-registrant
codes. 

https://www.doi.org/doi_handbook/2_Numbering.html#2.2.2

The registrant code is a unique string assigned to a registrant.

Based on the "alphanumeric" statement in the PDF, my best guess at what the DOI regex should be is:

          "pattern": "^(10[.][0-9a-zA-Z]+(?:[.][0-9a-zA-Z]+)*/(?:(?![%\"#? ])\\S)+)$"

With that pattern, this would be a valid DOI if the prefix were registered: 10.123abc.foo.bar/issn.1476-4687/this/is/nuts
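
A quick way to compare the two patterns above is to run both against a few strings. This is only a sketch: it uses Python's `re` module rather than a JSON Schema validator, on the assumption that the two engines agree for these constructs on plain ASCII input. The names `CURRENT`, `PROPOSED`, and `is_match`, and the sample strings other than the one from this issue, are made up for illustration.

```python
import re

# Patterns copied from this issue with the JSON string escaping removed
# (\" becomes " and \\S becomes \S).
CURRENT = r'^(10[.][0-9]{4,}(?:[.][0-9]+)*/(?:(?![%"#? ])\S)+)$'
PROPOSED = r'^(10[.][0-9a-zA-Z]+(?:[.][0-9a-zA-Z]+)*/(?:(?![%"#? ])\S)+)$'


def is_match(pattern: str, doi: str) -> bool:
    """Return True if the whole string matches the anchored pattern."""
    return re.match(pattern, doi) is not None


examples = [
    "10.1000/182",                                    # four-digit registrant code
    "10.123abc.foo.bar/issn.1476-4687/this/is/nuts",  # example from this issue
    "10.123/abc",                                     # three-digit registrant code
]

for doi in examples:
    print(doi, is_match(CURRENT, doi), is_match(PROPOSED, doi))
```

Only the first string is accepted by the pattern currently in the schema, while the proposed pattern accepts all three.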

I'm not sure what the suffix part of the pattern will match: (?:(?![%\"#? ])\\S)+

https://json-schema.org/understanding-json-schema/reference/regular_expressions.html
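
One way to probe the suffix fragment is to test it in isolation. Read piece by piece, `(?![%"#? ])` is a negative lookahead that rejects `%`, `"`, `#`, `?`, and space, and `\S` then consumes one non-whitespace character, so the group should match one or more non-whitespace characters that are none of those five (the space in the class looks redundant, since `\S` already excludes it). The sketch below checks that reading with Python's `re`, again assuming it behaves like the schema's regex dialect for these inputs; `SUFFIX` and `suffix_ok` are hypothetical names.

```python
import re

# Suffix part of the pattern with the JSON string escaping removed.
SUFFIX = re.compile(r'(?:(?![%"#? ])\S)+')


def suffix_ok(text: str) -> bool:
    """Return True if the entire string is matched by the suffix fragment."""
    return SUFFIX.fullmatch(text) is not None


samples = [
    "issn.1476-4687/this/is/nuts",  # dots, hyphens, and slashes are allowed
    "a b",                          # space is rejected
    "a\tb",                         # other whitespace is rejected by \S
    "a%20b",                        # percent sign is rejected
    'a"b',                          # double quote is rejected
    "a#b",                          # hash is rejected
    "a?b",                          # question mark is rejected
    "",                             # empty suffix is rejected by the +
]

for text in samples:
    print(repr(text), suffix_ok(text))
```

Under these assumptions only the first sample is accepted.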
