STATUS: PARTIAL
This module provides very basic support for hashing and timestamping data on the blockchain. Its purpose is to let any piece of data be tracked on the blockchain for a fee and be known to have existed at or before a given block height. This module is expected to be mostly supplanted by more domain-specific functionality and/or enhanced with robust, opt-in schema-validation support in the future.
- It should be possible to store arbitrary data on the blockchain for a fee
- It should be possible to track arbitrary off-chain data by hash on the blockchain, thus generating a proof of timestamp
- On-chain and off-chain data should be available in the index available to oracles
- There must be a robust way of dealing with hash collisions, especially with respect to off-chain data whose content is opaque
type MsgStoreGraph struct {
// RDF graph data in N-Triples text format with no blank nodes allowed!
NTriples string `json:"ntriples"`
// Expected hash of the graph. The transaction will be rejected if this hash can't be verified.
URDNA2015_BLAKE2B_256_Hash []byte `json:"urdna2015_blake2b_256_hash"`
Signer sdk.AccAddress `json:"signer"`
}
N-Triples format has been chosen as a starting point because it is easy to parse and self-contained. **Blank nodes are not allowed in on-chain graphs!** This restriction makes it easy to verify that the dataset is canonicalized and that the hash matches, without having to run the full canonicalization algorithm on-chain. The N-Triples data passed in must be in canonicalized form; because blank nodes are not allowed, this essentially means that the statements are sorted.
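The check described above can be sketched roughly as follows. This is a minimal illustration, not the module's implementation: `checkCanonicalNTriples` is a hypothetical helper, and it assumes one statement per line, single spaces between terms, a trailing newline after every statement, and that an empty graph is rejected. It does not fully parse N-Triples and does not compute the BLAKE2b hash.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// checkCanonicalNTriples is a hypothetical helper (not part of the module's API).
// It accepts a document only if no statement uses a blank node and the lines are
// in strictly ascending byte order, i.e. sorted with no duplicates. For UTF-8
// text, byte order is the same as Unicode code point order.
func checkCanonicalNTriples(doc string) error {
	if doc == "" {
		// ASSUMPTION: empty graphs are rejected; the spec does not say.
		return errors.New("empty graph")
	}
	if !strings.HasSuffix(doc, "\n") {
		return errors.New("last statement must end with a newline")
	}
	lines := strings.Split(strings.TrimSuffix(doc, "\n"), "\n")
	prev := ""
	for i, line := range lines {
		// Split into subject, predicate, and the remainder (object plus " .").
		parts := strings.SplitN(line, " ", 3)
		if len(parts) != 3 {
			return fmt.Errorf("line %d: expected subject, predicate and object", i+1)
		}
		// Blank node labels start with "_:" and can only be a subject or an object.
		if strings.HasPrefix(parts[0], "_:") || strings.HasPrefix(parts[2], "_:") {
			return fmt.Errorf("line %d: blank nodes are not allowed", i+1)
		}
		if i > 0 && line <= prev {
			return fmt.Errorf("line %d: statements are not sorted and unique", i+1)
		}
		prev = line
	}
	return nil
}

func main() {
	sorted := "<http://example.org/a> <http://example.org/p> \"1\" .\n" +
		"<http://example.org/b> <http://example.org/p> <http://example.org/a> .\n"
	fmt.Println(checkCanonicalNTriples(sorted)) // nil error for sorted input

	withBlank := "_:b0 <http://example.org/p> \"1\" .\n"
	fmt.Println(checkCanonicalNTriples(withBlank)) // non-nil error
}
```

Once the input passes a check of this kind, the hash can be computed directly over the submitted bytes and compared with the expected hash in the message.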
NOTE: JSON-LD has not been chosen for on-chain usage because the way `@context` is designed explicitly requires JSON-LD processors to fetch off-chain HTTP data, which is non-deterministic.
It might be useful to track format on-chain but not verify it. For a given format there could be multiple schemas that it satisfies. My current thoughts are that this is a type of verification/validation that can be done off chain and there can be on-chain attestations about that - ARC.
type MsgTrackDataset struct {
URDNA2015_BLAKE2B_256_Hash []byte `json:"urdna2015_blake2b_256_hash"`
Url string `json:"url,omitempty"`
Signer sdk.AccAddress `json:"signer"`
}
Should data stores that reference off-chain data have their own on-chain reference, so that data tracking, instead of storing a URL, just references the service via which the data can be retrieved by hash? I.e. if we know the service ID, we can just do an HTTP GET for `<service-base-uri>/<hash>`.
type HashAlgorithm int
const (
BLAKE2B_256 HashAlgorithm = 0
SHA256 HashAlgorithm = 1
)
type MsgTrackData struct {
Hash []byte `json:"hash"`
Algorithm HashAlgorithm `json:"algorithm"`
Url string `json:"url,omitempty"`
Signer sdk.AccAddress `json:"signer"`
}
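Because the chain cannot inspect off-chain data, about the only thing it can check statelessly for such a message is that the algorithm is known and the hash has the right length. A minimal sketch of that check follows; `validateTrackedHash` is a hypothetical name, and the rule itself is an assumption rather than something this spec defines. Both BLAKE2b-256 and SHA-256 produce 32-byte digests.

```go
package main

import "fmt"

type HashAlgorithm int

const (
	BLAKE2B_256 HashAlgorithm = 0
	SHA256      HashAlgorithm = 1
)

// validateTrackedHash is a hypothetical stateless check (not defined by the spec):
// the algorithm must be one of the known values and the digest must be 32 bytes.
func validateTrackedHash(alg HashAlgorithm, hash []byte) error {
	switch alg {
	case BLAKE2B_256, SHA256:
		if len(hash) != 32 {
			return fmt.Errorf("expected a 32-byte hash, got %d bytes", len(hash))
		}
		return nil
	default:
		return fmt.Errorf("unknown hash algorithm %d", int(alg))
	}
}

func main() {
	fmt.Println(validateTrackedHash(SHA256, make([]byte, 32)))      // valid
	fmt.Println(validateTrackedHash(BLAKE2B_256, make([]byte, 20))) // wrong length
}
```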
This is a use case we may want to support, but for now we do not support it because it is questionable whether we should encourage storing data on-chain that can’t be interpreted by other on-chain infrastructure.
allow for multiple URLs to be provided for off-chain data and to allow possible ways to deal with hash collisions
This should probably be coordinated with the IBC spec
On-chain graphs are identified by the URI formed by encoding the URDNA2015_BLAKE2B_256 hash of the graph with the prefix `xrn://<block-number>/g/`.
Off-chain datasets are identified by the URI formed by encoding the URDNA2015_BLAKE2B_256 hash of the dataset with the prefix `xrn://<block-number>/ds/`.
Off-chain raw data is identified by the URI formed by encoding the Blake2b 256-bit hash of the data prefixed with `xrn://<block-number>/dt/`.
On-chain raw data is identified by the URI formed by encoding the Blake2b 256-bit hash of the data prefixed with `xrn://<block-number>/da/`.
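The four identifier forms differ only in their path segment, so building them can be sketched with one function. The names `DataKind` and `dataURI` are hypothetical, and the hex encoding is purely an assumption for illustration: the text above does not say which encoding is applied to the hash.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// DataKind is a hypothetical name for the path segment that distinguishes
// the four kinds of identified data.
type DataKind string

const (
	OnChainGraph    DataKind = "g"
	OffChainDataset DataKind = "ds"
	OffChainRawData DataKind = "dt"
	OnChainRawData  DataKind = "da"
)

// dataURI builds xrn://<block-number>/<kind>/<encoded-hash>.
// ASSUMPTION: the hash is hex-encoded here; the spec does not name the encoding.
func dataURI(blockNumber int64, kind DataKind, hash []byte) string {
	return fmt.Sprintf("xrn://%d/%s/%s", blockNumber, kind, hex.EncodeToString(hash))
}

func main() {
	// A 4-byte stand-in for a real 32-byte hash, to keep the output short.
	fmt.Println(dataURI(12345, OnChainGraph, []byte{0xde, 0xad, 0xbe, 0xef}))
	// prints "xrn://12345/g/deadbeef"
}
```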
CREATE TABLE "data" (
uri text NOT NULL PRIMARY KEY,
tx bytea NOT NULL REFERENCES tx,
graph jsonb
--raw_data bytea
);
COMMENT ON COLUMN data.graph IS 'The JSON-LD expanded form representation of an on-chain graph';
--COMMENT ON COLUMN data.raw_data IS 'Raw data bytes for on-chain raw data';
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX xrn: <http://regen.network/schema#>
xrn:urdna2015Blake2b256Hash a rdf:Property ;
rdfs:range xsd:base64Binary .
On-chain graphs are indexed in the RDF store in the named graph identified by the graph identifier URI. They are annotated in the default graph as follows (where `xrn://12345/g/1xq52sutm` is an example graph URI):
PREFIX xrn: <http://regen.network/schema#>
<xrn://12345/g/1xq52sutm>
xrn:tx <xrn://12345/tx/abcdef1234567> ;
xrn:urdna2015Blake2b256Hash "sdgbhABN38dsfgn23t=" .