
Make converting a large table of datasets to RDF faster #305

@jesper-friis

Description

In the current implementation, each resource is converted separately from JSON-LD to RDF using rdflib. This is slow for two reasons: rdflib's JSON-LD implementation is slow, and calling the JSON-LD API once per resource is very inefficient when documenting thousands of resources.

Improvements:

  • First step: Add a function that combines multiple single-resource dicts into one JSON-LD document with a "@graph": [...] array in the root containing all the single-resource documents.
  • Second step: If documenting a large table of datasets is still slow after the first step, consider adding a method to the backends for uploading a JSON-LD file to the triplestore (for those triplestores that support JSON-LD). This may speed things up further, since the JSON-LD implementation in production-ready triplestores like GraphDB is expected to be much faster than the rdflib implementation.
