Quadstore is a LevelDB-backed RDF graph database for JS runtimes (browsers, Node.js, Deno, ...) with native support for quads and querying across named graphs, RDF/JS interfaces and SPARQL queries.
- Example of basic usage
- Introduction
- Status
- Usage
- Storage
- Data model and return Values
- Quadstore class
- Custom indexes
- Quadstore.prototype.open
- Quadstore.prototype.close
- Quadstore.prototype.get
- Range matching
- Quadstore.prototype.put
- Quadstore.prototype.multiPut
- Quadstore.prototype.del
- Quadstore.prototype.multiDel
- Quadstore.prototype.patch
- Quadstore.prototype.multiPatch
- Quadstore.prototype.getStream
- Quadstore.prototype.putStream
- Quadstore.prototype.delStream
- Quadstore.prototype.match
- Quadstore.prototype.import
- Quadstore.prototype.remove
- Quadstore.prototype.removeMatches
- Blank nodes and quad scoping
- SPARQL
- Browser usage
- Deno usage
- Performance
- License
import memdown from 'memdown';
import {DataFactory} from 'rdf-data-factory';
import {Quadstore} from 'quadstore';
import {Engine} from 'quadstore-comunica';
// Any implementation of AbstractLevelDOWN can be used.
// For server-side persistence, use `leveldown` or `rocksdb`.
const backend = memdown();
// Implementation of the RDF/JS DataFactory interface
const df = new DataFactory();
// Store and query engine are separate modules
const store = new Quadstore({backend, dataFactory: df});
const engine = new Engine(store);
// The store must be opened before use
await store.open();
// Put a single quad into the store using Quadstore's API
await store.put(df.quad(
df.namedNode('http://example.com/subject'),
df.namedNode('http://example.com/predicate'),
df.namedNode('http://example.com/object'),
df.defaultGraph(),
));
// Retrieves all quads using Quadstore's API
const { items } = await store.get({});
// Retrieves all quads using RDF/JS Stream interfaces
const quadsStream = store.match(undefined, undefined, undefined, undefined);
// Queries the store via RDF/JS Query interfaces
const query = await engine.query('SELECT * {?s ?p ?o}');
const bindingsStream = await query.execute();
In the context of knowledge representation, a statement can often be represented as a 3-dimensional (subject, predicate, object) tuple, normally referred to as a triple.
subject predicate object
BOB KNOWS ALICE
BOB KNOWS PAUL
A set of statements / triples can also be thought of as a graph:
                                        ┌────────┐
              KNOWS (predicate)         │ ALICE  │
     ┌─────────────────────────────────▶│(object)│
     │                                  └────────┘
┌─────────┐
│   BOB   │
│(subject)│
└─────────┘                             ┌────────┐
     │                                  │  PAUL  │
     └─────────────────────────────────▶│(object)│
              KNOWS (predicate)         └────────┘
A quad is a triple with an additional term, usually called graph or context.
(subject, predicate, object, graph)
On a semantic level, the graph term identifies the graph to which a triple belongs. Each identifier can then be used as the subject or object of additional triples, facilitating the representation of metadata such as provenance and temporal validity.
subject predicate object graph
BOB KNOWS ALICE GRAPH-1
BOB KNOWS PAUL GRAPH-2
GRAPH-1 SOURCE FACEBOOK
GRAPH-2 SOURCE LINKEDIN
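As a rough illustration of the table above, the quads can be modeled as plain objects, with the graph term used to look up provenance. This is only a sketch: plain strings stand in for RDF terms, an empty string stands in for the default graph, and the sourceOf helper is a hypothetical name that is not part of quadstore.

```javascript
// Plain strings stand in for RDF terms; this is not quadstore's data model.
// The empty string is an assumed placeholder for the default graph.
const quads = [
  { subject: 'BOB', predicate: 'KNOWS', object: 'ALICE', graph: 'GRAPH-1' },
  { subject: 'BOB', predicate: 'KNOWS', object: 'PAUL', graph: 'GRAPH-2' },
  { subject: 'GRAPH-1', predicate: 'SOURCE', object: 'FACEBOOK', graph: '' },
  { subject: 'GRAPH-2', predicate: 'SOURCE', object: 'LINKEDIN', graph: '' },
];

// Hypothetical helper: finds the SOURCE of the graph that contains a statement.
function sourceOf(subject, predicate, object) {
  const statement = quads.find(
    (q) => q.subject === subject && q.predicate === predicate && q.object === object
  );
  if (!statement) return undefined;
  // The graph identifier of the statement is the subject of the metadata quad.
  const provenance = quads.find(
    (q) => q.subject === statement.graph && q.predicate === 'SOURCE'
  );
  return provenance ? provenance.object : undefined;
}

console.log(sourceOf('BOB', 'KNOWS', 'ALICE')); // FACEBOOK
console.log(sourceOf('BOB', 'KNOWS', 'PAUL')); // LINKEDIN
```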
Quadstore heavily borrows from LevelGraph's approach to storing tuples, maintaining multiple indexes, each of which deals with a different permutation of quad terms. In that sense, Quadstore is an alternative to LevelGraph that strikes a different compromise between expressiveness and performance, opting to natively support quads while working towards minimizing the performance penalty that comes with the fourth term.
Quadstore's development is supported by Belay Engineering.
Active, under development.
See CHANGELOG.md.
We're currently working on the following features:
- optimizing SPARQL performance by pushing filters down from the engine to the persistence layer
We're also evaluating the following features for future developments:
- RDF* (see also these slides)
- uses Semantic Versioning, pre-releases are tagged accordingly;
- the production branch mirrors what is available under the latest tag on NPM;
- the master branch is the active, development branch;
- requires Node.js >= 14.0.0.
quadstore can work with any storage backend that implements the AbstractLevel interface. An incomplete list of available backends is available at level/awesome#stores.
Our test suite focuses on the following backends:
- classic-level for persistent storage using LevelDB
- memory-level for volatile in-memory storage using red-black trees
- rocksdb for persistent storage using RocksDB
  - waiting for the rocks-level package to be published
Except for those related to the RDF/JS stream interfaces, quadstore's API is promise-based and all methods return objects that include both the actual query results and the relevant metadata.
Objects returned by quadstore's APIs have the type property set to one of the following values:
- "VOID" - when there's no data returned by the database, such as with the put method;
- "QUADS" - when a query returns a collection of quads;
- "APPROXIMATE_SIZE" - when a query returns an approximate count of how many matching items are present.
For those methods that return objects with the type property set to "QUADS", quadstore provides query results either in streaming mode or in non-streaming mode.
Streaming methods such as getStream return objects with the iterator property set to an instance of AsyncIterator, an implementation of a subset of the stream.Readable interface.
Non-streaming methods such as get return objects with the items property set to an array of quads.
Quads are returned as and expected to be instances of the RDF/JS Quad interface as produced by the implementation of the RDF/JS DataFactory interface passed to the Quadstore constructor.
Matching patterns, such as those used in the get and getStream methods, are expected to be maps of term names to instances of the RDF/JS Term interface.
The backend of a quadstore can be accessed with the db property, to perform additional storage operations independently of quads.
In order to perform write operations atomically with quad storage, the put, multiPut, del, multiDel, patch and multiPatch methods accept a preWrite option which defines a procedure to augment the batch, as in the following example:
await store.put(dataFactory.quad(/* ... */), {
preWrite: batch => batch.put('my.key', Buffer.from('my.value'))
});
const Quadstore = require('quadstore').Quadstore;
const store = new Quadstore(opts);
Instantiates a new store. Supported properties for the opts argument are:
The opts.backend option must be an instance of a backend that implements the AbstractLevel interface. See storage backends.
The opts.dataFactory option must be an implementation of the RDF/JS DataFactory interface, such as rdf-data-factory, which is used in the examples throughout this document.
The opts.indexes option allows users to configure which indexes will be used by the store. If not set, the store will default to the following indexes:
[
['subject', 'predicate', 'object', 'graph'],
['object', 'graph', 'subject', 'predicate'],
['graph', 'subject', 'predicate', 'object'],
['object', 'subject', 'predicate', 'graph'],
['predicate', 'object', 'graph', 'subject'],
['graph', 'predicate', 'object', 'subject'],
];
This option, if present, must be set to an array of term arrays, each of which must represent one of the 24 possible permutations of the four terms subject, predicate, object and graph. Partial indexes are not supported.
The store will automatically select which index(es) to use for a given query based on the available indexes and the query itself. If no suitable index is found for a given query, the store will throw an error.
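One common way to reason about index selection in permutation-based stores is that an index can serve a query when the terms bound in the pattern occupy the leading positions of that index. The following is a minimal sketch of that idea only; pickIndex is a hypothetical helper, and quadstore's actual selection logic (which also has to account for ranges and ordering) may differ.

```javascript
// The default indexes listed above.
const defaultIndexes = [
  ['subject', 'predicate', 'object', 'graph'],
  ['object', 'graph', 'subject', 'predicate'],
  ['graph', 'subject', 'predicate', 'object'],
  ['object', 'subject', 'predicate', 'graph'],
  ['predicate', 'object', 'graph', 'subject'],
  ['graph', 'predicate', 'object', 'subject'],
];

// Hypothetical helper: returns the first index whose leading terms are exactly
// the terms bound in the pattern, or undefined if there is no such index.
function pickIndex(indexes, pattern) {
  const bound = Object.keys(pattern);
  return indexes.find((index) =>
    index.slice(0, bound.length).every((term) => bound.includes(term))
  );
}

const chosen = pickIndex(defaultIndexes, { subject: 'ex://s', object: 'ex://o' });
console.log(chosen.join(',')); // object,subject,predicate,graph

// With a single custom index, a pattern bound only on object has no match.
const none = pickIndex(
  [['subject', 'predicate', 'object', 'graph']],
  { object: 'ex://o' }
);
console.log(none); // undefined
```

Under this prefix rule, the six default indexes cover every combination of bound terms, which is one reason to be careful when replacing them with a smaller custom set.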
Also, Quadstore can be configured with a prefixes object that defines a reversible mapping of IRIs to abbreviated forms, with the intention of reducing the storage cost where common HTTP prefixes are known in advance.
The prefixes object defines a bijection using two functions expandTerm and compactIri, both of which take a string parameter and return a string, as in the following example:
opts.prefixes = {
expandTerm: term => term.replace(/^ex:/, 'http://example.com/'),
compactIri: iri => iri.replace(/^http:\/\/example\.com\//, 'ex:'),
}
This will replace the IRI http://example.com/a with ex:a in storage.
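The round trip can be checked in isolation, without a store. The sketch below reuses the two functions from the example above on a standalone prefixes object; the sample IRIs are illustrative only.

```javascript
// Same functions as in the example above, on a standalone object.
const prefixes = {
  expandTerm: (term) => term.replace(/^ex:/, 'http://example.com/'),
  compactIri: (iri) => iri.replace(/^http:\/\/example\.com\//, 'ex:'),
};

const iri = 'http://example.com/a';
const compacted = prefixes.compactIri(iri);
console.log(compacted); // ex:a
console.log(prefixes.expandTerm(compacted) === iri); // true

// IRIs outside the known prefix pass through unchanged.
console.log(prefixes.compactIri('http://other.org/b')); // http://other.org/b

// Caveat: an IRI that already starts with the abbreviated form does not
// survive the round trip, so the mapping is no longer a bijection.
const clash = 'ex://s';
console.log(prefixes.expandTerm(prefixes.compactIri(clash)) === clash); // false
```

The last case suggests choosing abbreviations that cannot occur at the start of any IRI that will be stored.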
This method opens the store and throws if the open operation fails for any reason.
This method closes the store and throws if the close operation fails for any reason.
const pattern = {graph: dataFactory.namedNode('ex://g')};
const { items } = await store.get(pattern);
Returns an array of all quads within the store matching the specified terms.
This method also accepts an optional opts parameter with the following optional properties:
- opts.order: array of term names (e.g. ['object']) that represents the desired ordering criteria of returned quads. Equivalent to the ORDER BY clause in SQL.
- opts.reverse: boolean value that indicates whether to return quads in ascending or descending order. Equivalent to the ASC / DESC modifiers in SQL.
- opts.limit: limit the number of returned quads to the specified value. Equivalent to the LIMIT clause in SQL.
quadstore supports range-based matching in addition to value-based matching. Ranges can be defined using the gt, gte, lt, lte properties:
const pattern = {
object: {
termType: 'Range',
gt: dataFactory.literal('7', 'http://www.w3.org/2001/XMLSchema#integer')
}
};
const { items } = await store.get(pattern);
Values for literal terms with the following numeric datatypes are matched against their numerical values rather than their literal representations:
http://www.w3.org/2001/XMLSchema#integer
http://www.w3.org/2001/XMLSchema#decimal
http://www.w3.org/2001/XMLSchema#double
http://www.w3.org/2001/XMLSchema#nonPositiveInteger
http://www.w3.org/2001/XMLSchema#negativeInteger
http://www.w3.org/2001/XMLSchema#long
http://www.w3.org/2001/XMLSchema#int
http://www.w3.org/2001/XMLSchema#short
http://www.w3.org/2001/XMLSchema#byte
http://www.w3.org/2001/XMLSchema#nonNegativeInteger
http://www.w3.org/2001/XMLSchema#unsignedLong
http://www.w3.org/2001/XMLSchema#unsignedInt
http://www.w3.org/2001/XMLSchema#unsignedShort
http://www.w3.org/2001/XMLSchema#unsignedByte
http://www.w3.org/2001/XMLSchema#positiveInteger
This is also the case for terms with the following date/time datatypes:
http://www.w3.org/2001/XMLSchema#dateTime
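The distinction matters because comparing lexical forms as strings gives different results from comparing the numbers they denote. The sketch below shows the difference with plain JavaScript strings; it does not reflect how quadstore encodes values internally, which is not described here.

```javascript
// Lexical forms of some xsd:integer literals.
const lexicalForms = ['7', '10', '-3', '42'];

// Comparing the literal representations as strings: '10' and '42' sort
// before '7' because '1' and '4' come before '7'.
const byString = lexicalForms.filter((value) => value > '7');

// Comparing the numerical values, as a range such as { gt: "7" } intends.
const byNumber = lexicalForms.filter((value) => Number(value) > 7);

console.log(byString.length); // 0
console.log(byNumber.join(',')); // 10,42
```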
await store.put(dataFactory.quad(/* ... */));
Stores a new quad. Does not throw or return an error if the quad already exists.
This method also accepts an optional opts parameter with the following properties:
- opts.preWrite: this can be set to a function which accepts a chainedBatch and performs additional backend operations atomically with the put operation. See Access to the backend for more information.
- opts.scope: this can be set to a Scope instance as returned by initScope() and loadScope(). If set, blank node labels will be changed to prevent blank node collisions. See Blank nodes and quad scoping.
await store.multiPut([
dataFactory.quad(/* ... */),
dataFactory.quad(/* ... */),
]);
Stores new quads. Does not throw or return an error if quads already exist.
This method also accepts an optional opts parameter with the following properties:
- opts.preWrite: this can be set to a function which accepts a chainedBatch and performs additional backend operations atomically with the multiPut operation. See Access to the backend for more information.
- opts.scope: this can be set to a Scope instance as returned by initScope() and loadScope(). If set, blank node labels will be changed to prevent blank node collisions. See Blank nodes and quad scoping.
This method deletes a single quad. It does not throw or return an error if the specified quad is not present in the store.
await store.del(dataFactory.quad(/* ... */));
This method also accepts an optional opts parameter with the following properties:
- opts.preWrite: this can be set to a function which accepts a chainedBatch and performs additional backend operations atomically with the del operation. See Access to the backend for more information.
This method deletes multiple quads. It does not throw or return an error if the specified quads are not present in the store.
await store.multiDel([
dataFactory.quad(/* ... */),
dataFactory.quad(/* ... */),
]);
This method also accepts an optional opts parameter with the following properties:
- opts.preWrite: this can be set to a function which accepts a chainedBatch and performs additional backend operations atomically with the multiDel operation. See Access to the backend for more information.
This method deletes one quad and inserts another quad in a single operation. It does not throw or return an error if the quad to be deleted is not present in the store or if the quad to be inserted is already present in the store.
await store.patch(
dataFactory.quad(/* ... */), // will be deleted
dataFactory.quad(/* ... */), // will be inserted
);
This method also accepts an optional opts parameter with the following properties:
- opts.preWrite: this can be set to a function which accepts a chainedBatch and performs additional backend operations atomically with the patch operation. See Access to the backend for more information.
This method deletes and inserts quads in a single operation. It does not throw or return an error if the quads to be deleted are not present in the store or if the quads to be inserted are already present in the store.
// will be deleted
const oldQuads = [
dataFactory.quad(/* ... */),
dataFactory.quad(/* ... */),
];
// will be inserted
const newQuads = [
dataFactory.quad(/* ... */),
dataFactory.quad(/* ... */),
dataFactory.quad(/* ... */),
];
await store.multiPatch(oldQuads, newQuads);
This method also accepts an optional opts parameter with the following properties:
- opts.preWrite: this can be set to a function which accepts a chainedBatch and performs additional backend operations atomically with the multiPatch operation. See Access to the backend for more information.
const pattern = {graph: dataFactory.namedNode('ex://g')};
const { iterator } = await store.getStream(pattern);
Just like Quadstore.prototype.get(), this method supports range matching and the order, reverse and limit options.
await store.putStream(readableStream);
Imports all quads coming through the specified stream.Readable into the store.
This method also accepts an optional opts parameter with the following properties:
- opts.scope: this can be set to a Scope instance as returned by initScope() and loadScope(). If set, blank node labels will be changed to prevent blank node collisions. See Blank nodes and quad scoping.
await store.delStream(readableStream);
Deletes all quads coming through the specified stream.Readable from the store.
const subject = dataFactory.namedNode('http://example.com/subject');
const graph = dataFactory.namedNode('http://example.com/graph');
store.match(subject, null, null, graph)
.on('error', (err) => {})
.on('data', (quad) => {
// Quad is produced using dataFactory.quad()
})
.on('end', () => {});
Implementation of the RDF/JS Source#match method. Supports range-based matching.
// readableStream is a stream.Readable of Quad() instances
store.import(readableStream)
.on('error', (err) => {})
.on('end', () => {});
Implementation of the RDF/JS Sink#import method.
// readableStream is a stream.Readable of Quad() instances
store.remove(readableStream)
.on('error', (err) => {})
.on('end', () => {});
Implementation of the RDF/JS Store#remove method.
const subject = dataFactory.namedNode('http://example.com/subject');
const graph = dataFactory.namedNode('http://example.com/graph');
store.removeMatches(subject, null, null, graph)
.on('error', (err) => {})
.on('end', () => {});
Implementation of the RDF/JS Store#removeMatches method.
Blank nodes are defined as existential variables in that they merely indicate the existence of an entity rather than act as references to the entity itself.
While the semantics of blank nodes can be rather confusing, one of the most practical consequences of their definition is that two blank nodes having the same label may not refer to the same entity unless both nodes come from the same logical set of quads.
As an example, here are two JSON-LD documents converted to N-Quads using the JSON-LD playground:
{
"@id": "http://example.com/bob",
"foaf:knows": {
"foaf:name": "Alice"
}
}
<http://example.com/bob> <foaf:knows> _:b0 .
_:b0 <foaf:name> "Alice" .
{
"@id": "http://example.com/alice",
"foaf:knows": {
"foaf:name": "Bob"
}
}
<http://example.com/alice> <foaf:knows> _:b0 .
_:b0 <foaf:name> "Bob" .
The N-Quads equivalent for both of these documents contains a blank node with the b0 label. However, although the label is the same, these blank nodes indicate the existence of two different entities. Intuitively, we can say that a blank node is scoped to the logical grouping of quads that contains it, be it a single quad, a document or a stream.
As quadstore treats all write operations as if they were happening within the same scope, importing these two sets of quads would result in a collision of two unrelated blank nodes, leading to a corrupted dataset.
A good way to address these issues is to skolemize all blank nodes into IRIs / named nodes. However, this is not always possible and / or practical.
The initScope() method returns a Scope instance which can be passed to the put, multiPut and putStream methods. When doing so, quadstore will replace each occurrence of a given blank node with a different blank node having a randomly-generated label, preventing blank node collisions.
Each Scope instance keeps an internal cache of mappings between previously encountered blank nodes and their replacements, so that it is able to always return the same replacement blank node for a given label. Each new mapping is atomically persisted to the store together with its originating quad, leading each scope to be incrementally persisted to the store consistently with each successful put and multiPut operation. This allows scopes to be re-used even across process restarts via the loadScope() method.
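The relabeling behavior described above can be sketched with a small in-memory cache. SketchScope and randomLabel are hypothetical names used for illustration only; quadstore's own Scope class additionally persists each mapping to the store, which this sketch does not attempt.

```javascript
let counter = 0;

// Hypothetical label generator: the counter guarantees uniqueness within this
// sketch, the random suffix makes labels hard to guess.
function randomLabel() {
  counter += 1;
  return `b${counter}_${Math.random().toString(36).slice(2)}`;
}

// Hypothetical stand-in for a Scope: an id plus a cache of label mappings.
class SketchScope {
  constructor() {
    this.id = randomLabel();
    this.mappings = new Map();
  }

  // Returns the replacement label for a blank node label, creating it on
  // first use and reusing it afterwards.
  relabel(label) {
    if (!this.mappings.has(label)) {
      this.mappings.set(label, randomLabel());
    }
    return this.mappings.get(label);
  }
}

// Two documents that both use the b0 label, each imported with its own scope.
const scopeA = new SketchScope();
const scopeB = new SketchScope();

console.log(scopeA.relabel('b0') === scopeA.relabel('b0')); // true
console.log(scopeA.relabel('b0') === scopeB.relabel('b0')); // false
```

Within one scope the same label always maps to the same replacement, so quads that share a blank node stay connected, while identical labels from different scopes end up as distinct blank nodes.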
Initializes a new, empty scope.
const scope = await store.initScope();
await store.put(quad, { scope });
await store.multiPut(quads, { scope });
await store.putStream(stream, { scope });
Each Scope instance has an .id property that acts as its unique identifier. The loadScope() method can be used to re-hydrate a scope through its .id:
const scope = await store.initScope();
const scopeId = scope.id; /* store scopeId somewhere */
/* later on, read the previously-stored scopeId */
const loadedScope = await store.loadScope(scopeId);
Deletes all mappings of a given scope from the store.
const scope = await store.initScope();
/* ... */
await store.deleteScope(scope.id);
Deletes all mappings of all scopes from the store.
await store.deleteAllScopes();
SPARQL queries can be executed against a Quadstore instance using any query engine capable of querying across RDF/JS data sources.
An example of one such engine is quadstore-comunica, an engine built as a custom distribution and configuration of Comunica that implements the RDF/JS Query spec:
Comunica is a knowledge graph querying framework. [...] Comunica is a meta query engine using which query engines can be created. It does this by providing a set of modules that can be wired together in a flexible manner. [...] Its primary goal is executing SPARQL queries over one or more interfaces.
In time, quadstore-comunica will be extended with custom query modules that will optimize query performance by pushing some matching and ordering operations down to quadstore itself.
import memdown from 'memdown';
import {DataFactory} from 'rdf-data-factory';
import {Quadstore} from 'quadstore';
import {Engine} from 'quadstore-comunica';
const backend = memdown();
const df = new DataFactory();
const store = new Quadstore({backend, dataFactory: df});
const engine = new Engine(store);
const query = await engine.query('SELECT * {?s ?p ?o}');
const bindingsStream = await query.execute();
More information is available in quadstore-comunica's repository.
The browser-level backend for LevelDB offers support for browser-side persistent storage.
quadstore can be bundled for browser-side usage via Webpack, preferably using version 5.x. The reference quadstore-browser is meant to help in getting to a working Webpack configuration and also hosts a pre-built bundle with everything that is required to use quadstore in browsers.
quadstore can be used with the Deno runtime via the skypack.dev CDN:
import { DataFactory } from 'https://cdn.skypack.dev/rdf-data-factory@1.1.1';
import { Quadstore } from 'https://cdn.skypack.dev/quadstore@11.0.0';
import { MemoryLevel } from 'https://cdn.skypack.dev/memory-level@1.0.0';
import { Engine } from 'https://cdn.skypack.dev/quadstore-comunica@3.0.0';
const backend = new MemoryLevel();
const dataFactory = new DataFactory();
const store = new Quadstore({ backend, dataFactory });
const engine = new Engine(store);
await store.open();
await store.put(dataFactory.quad(
dataFactory.namedNode('ex://s'),
dataFactory.namedNode('ex://p'),
dataFactory.namedNode('ex://o'),
));
const stream = await engine.queryBindings('SELECT * WHERE { ?s ?p ?o }');
stream.on('data', (bindings) => console.log(bindings));
Due to an upstream issue with the SPARQL parser, the following import map must be used. This replaces Skypack's own version of sparqljs@3.5.2 with one that is hosted on gist.github.com and is identical to the former if not for a fix to an unchecked use of require that can't be easily merged upstream.
{
"imports": {
"https://cdn.skypack.dev/-/sparqljs@v3.5.2-dsMDqK77bLuGqQk32ifA/dist=es2019,mode=imports/optimized/sparqljs.js": "https://gist.githubusercontent.com/jacoscaz/022c513ca77b0061c5bfee0356ba3b8d/raw/95bb09057fbad4daace3684c80b1164b38725c7c/sparql.js-skypack-require-fix.js"
}
}
Example usage:
deno run --import-map quadstore-import-map.json quadstore-test.ts
Performance is evaluated and tracked at https://github.com/belayeng/quadstore-perf
MIT. See LICENSE.md.