-
Notifications
You must be signed in to change notification settings - Fork 1
RefTypes
One of the basic requirements for HyperGraph is the ability to store arbitrary types of data as atoms. Another requirement is language interoperability. We would like to be able to work with the same underlying graph database from different host languages. This means that actual data must be recorded in language neutral format and, in addition, typing information must be extensible and language neutral itself. Combined, those requirements translate to the following more concrete goals:
- Embed the typing system into HyperGraph itself so that new types can be created, manipulated and searched in a way similar to atoms.
- Build the foundations out of the very notion of a type so that nearly all the common typing constructs found in various languages could be represented in HyperGraph.
Thus we record typing information in the same storage mechanism as normal atoms. A big conceptual question is whether types should really be considered HyperGraph atoms, even at the API level. I think they should for reasons that will become apparent in the following lines.
The notion of a type in computer science draws from the set theoretic type notion, but it has taken a life of its own to become a primitive, undefined concept as that of a set in mathematics. In computational type theory, the notion of type is formalized by saying what can be done with them, how to distinguish different types etc., very much like with sets in set theory. Whereas in sets, the most elementary notion is that of membership, in type theory the defining notion is that of equality: when are two elements of a given type equal? Then come notions of compounding (constructing) new types out of simpler ones, subtyping and the like.
In programming languages, one thinks of types in terms of computational constraints imposed on the data (e.g. the range of values a given data variable may assume). Nearly every typed language (especially statically typed ones) offer a type system based on a set of primitive types and means to build new ones out of the primitives. The complexity of a type system generally reflects the number of ways one can construct new types (union, intersection, functions, records, inheritance) and the elements out of which those new types can be constructed. In polymorphic languages for instance, one is allowed to extend the means of building new types. Types can be parameterized over other types so that we end up with “type constructor” entities whose instances are concrete types much as data values are instances of types. We then have a set of predefined constructors: the one for records, the one for functions and perhaps some others found in functional languages, but we also essentially have means to define “custom constructors”. In general, neither programming languages, nor formal treatments go beyond this meta-level of type constructors to, for example, type-constructor-constructors. Mathematically, however there is no reason to stop at level 2, and not go to a potentially infinite tower of types.
Now, sub-typing seems to be usually always analyzed in an add-hoc manner in formal treatments of programming languages. First, there is the nice mathematical type formalism; type checking rules and type inference are developed. And then sub-typing is stipulated with special rules for each separate type constructor: for records, the sub-type relationship means something, for functions it means something else. The general rule is always tied to the idea of substitutability.
In sum, I don’t think there is a clear cut, definitive formal notion of a type that could be easily implemented in HyperGraph as is, without being bound a particular programming language or type system. The French logician Jean-Yves Girard very rightly points out that types can be seen as plugging instructions. A term of a given type T is both something that can be plugged somewhere as well as a plug with free, typed variables. This is an operational view very much in spirit with Russell’s original type theory whose motivation was that of formal (i.e. syntactic) constructability. I think this is as far as one can go in unifying what types really are.
What to make of all these consideration for HyperGraph types? Evidently, fixing a type system would always miss some future application requirement, be it Novamente or an application written in a particular language, because it will be based on arbitrary design choices. And the main purpose of HyperGraphDB is to provide basic infrastructural capabilities for a hypergraph, typed data structure, but nothing more. Therefore, a very general and minimal approach is adopted for the foundation of HyperGraph type system, together with predefined facilities build on top of it. The rules are the following:
- Every HyperGraph atom holds a typed value.
- The type of an atom is responsible for managing the value instance in the HyperGraph storage and mapping it to a run-time instance in the host language.
- The type of an atom defines a minimal semantic relationship between value instances whereby knowledge of whether a value can be used in the place of another is provided.
There is no distinction between types, type constructors, type constructor-constructors etc. Only between types and values where the values may be types themselves managing values at the lower level and so forth. At the bottom, there are primitive types and primitive values such as strings, integers, booleans and reals. There are also predefined higher-level types (or, equivalently, type constructors). All values as stored as HyperGraph atoms. Thus the value of a type constructor for records is a concrete record type whose values are concrete records.
All types implement the HGAtomType interface. That interface defines the four standard database management methods for retrieving, updating, inserting and removing a value from storage. The implementations of those methods make use of the basic HyperGraph storage facilities as detailed in the following sections. The fifth method of the HGAtomType interface – subsumes(Object A, Object B) – implements the semantic predicate mentioned in rule 3 above. This predicate is the basis for the implementation of sub-typing and of equality between typed values in general. It is true iff B can be used in place of A in a computation.