Skip to content

Latest commit

 

History

History
369 lines (289 loc) · 13.2 KB

2014-02-25-dependent-types-and-plhaskell.md

File metadata and controls

369 lines (289 loc) · 13.2 KB
title
Interpreting Haskell Inside PostgreSQL With Singleton Types

I am currently in the early stages of a new project, plhaskell. plhaskell is a language handler for PostgreSQL that will allow you to write PostgreSQL functions in Haskell. The project has already taught me a lot, but recently I've had a somewhat mind bending encounter with what is almost dependently typed programming in Haskell - and I've just got to share it with you.

Here's the problem. At runtime, we have a string containing Haskell code, metadata about the function call which includes the function signature, and a list of argument values. Is it possible to interpret this string as a Haskell function and call it with the provided arguments?

Let's start by interpreting the string. The hint library provides a high-level interface to GHC's API, which allows us to interpret strings as values of a given Haskell type. The most useful function for us is:

interpret :: Typeable a => String -> a -> m a

Given a String containing Haskell code and an expected type, interpret will attempt to construct a value of that type by interpreting the given string. We can run the interpreter using runInterpreter. For example:

> runInterpreter $ setImports ["Prelude"] >> interpret "True" (as :: Bool)
Right True

> runInterpreter $ setImports ["Prelude"] >> interpret "True" (as :: Int)
Left (WontCompile [GhcError {errMsg = "
Couldn't match expected type `GHC.Types.Int'
            with actual type `GHC.Types.Bool'"}])

We're making good progress, but now we hit a problem. Notice that we are deciding the type that interpret should return statically, at compile time. The problem statement above indicates that the type information isn't known at compile time, we only learn about that at runtime. At first this seems impossible - after all, Haskell is all about static types!

The dependent types crowd are laughing at this point. Surely we just want to write a function with this type:

interpret :: String -> (sig : Signature) -> (FunctionType sig)

Alas, we can't write this in Haskell. Or can we?

A Type of Types

A good first step is to clear up what we mean by "types". PostgreSQL has an open universe of types, as types can be defined at runtime. However, we'll keep things a little simpler by considering a closed universe of the following types:

data PgType = PgInt | PgString

Combining GHC's new(ish) abilities to promote data types along with type families, we can form a mapping from PostgreSQL types to Haskell types:

type family InterpretType (t :: PgType) :: *
type instance InterpretType 'PgInt = Int
type instance InterpretType 'PgString = String

The type family takes types of kind PgType to types of kind * - the kind of types of Haskell values. It's important to remember that this is all at the type level - we transform the type PgInt to the type Int. Working at the type level is not enough to solve the overarching problem, because runtime information is at the value level. In order to link types and values, we can use singleton types.

Intermediate Haskell programmers should be familiar with phantom types - type variables whose sole purpose is to carry type information. Singleton types work in a similar way - we add a type parameter to carry information up from values to their types. For PgType, there is a corresponding singleton type:

data SPgType :: PgType -> * where
  SPgInt :: SPgType 'PgInt
  SPgString :: SPgType 'PgString

By using a GADT, we can now learn more information about our types by pattern matching. For example, we can write a function who's return type depends on the value of the first parameter:

exampleValues :: SPgType t -> InterpretType t
exampleValues SPgInt = 42
exampleValues SPgString = "Forty two"

At first glance this looks like nonsense, but you should take time to convince yourself that it makes sense. If you give exampleValues a SPgInt, then by pattern matching on that we learn that the t in the type signature is actually a 'PgInt. The type family InterpretType maps PgInt to Int, thus 42 is a perfectly valid return type. Similar reasoning shows that "Forty two" is equally valid, provided we have evidence that t ~ 'PgString (this is notation to indicate type equality).

It may help to play around with this in GHCI:

> :t exampleValues SPgString
exampleValues SPgString :: InterpretType 'PgString

> :t exampleValues SPgInt
exampleValues SPgInt :: InterpretType 'PgInt

> :kind! InterpretType 'PgInt
InterpretType 'PgInt :: *
= Int

From Types to Function Arrows

So far, we have only considered working with single types. However, a function contains multiple types - one for each argument and another for the return type. We need to introduce a data type to capture this. We make the observation that any function has at least one type, so we introduce a variant of non-empty lists:

data PgFunction
  = PgReturn PgType
  | PgArrow PgType PgFunction

This data type is also amenable to promotion to the type level, so we can map entire PgFunctions to their corresponding Haskell function type:

type family InterpretFunction (t :: PgFunction) :: *
type instance InterpretFunction ('PgReturn t) = InterpretType t
type instance InterpretFunction ('PgArrow t ts) = InterpretType t -> InterpretFunction ts

This type family is essentially folding PgFunction down to a function type - we replace PgArrow with ->, and all PgTypes with their interpretation.

Not only can we do promotion to the type level, we can play a similar trick with singleton types:

data SPgFunction :: PgFunction -> * where
  SPgReturn :: SPgType t -> SPgFunction ('PgReturn t)
  SPgArrow :: SPgType t -> SPgFunction ts -> SPgFunction ('PgArrow t ts)

This singleton type has singleton constructors that make use of the singleton types for PgType. To try and get a better handle on what this gives us, here's a similar set of examples, but this time for functions:

exampleFunctions :: SPgFunction t -> InterpretFunction t
exampleFunctions (SPgReturn SPgInt) = 42
exampleFunctions (SPgArrow SPgString (SPgReturn SPgInt)) = length
> exampleFunctions (SPgReturn SPgInt)
42

> exampleFunctions (SPgArrow SPgString (SPgReturn SPgInt)) "I <3 Types"
10

The singletons Library

Hopefully you now have a rudimentary understanding of singleton types. Writing out singleton types by hand is both tedious and error prone, especially when the translation is purely mechanical. Richard Eisenberg is the author of the singletons library, which contains Template Haskell to do this for us. We'll now switch to the following source code:

$(singletons [d|
  data PgType = PgInt | PgString

  data PgFunction = PgReturn PgType | PgArrow PgType PgFunction
  |])

-- These are exactly as before
type family InterpretType (t :: PgType) :: *
type family InterpretFunction (t :: PgFunction) :: *

Call Yourself Typeable? Prove it!

We're a good part of the way towards our goal now. To recap, we have defined a type of PostgreSQL types, and we can use these types to represent function signatures. Singleton types let us work at both the value and the type level, and type families let us turn these PostgreSQL types into their corresponding Haskell types.

Let's recap the type of interpret:

interpret :: Typeable a => String -> a -> m a

It would appear we have all the machinery we need now to construct that type a. Given a SPgFunction we can use InterpretFunction to decide the return type, and we can use the magic undefined value to provide a value. (Really we should be using Proxy, but we have to work with what hint wants):

funType :: SPgFunction t -> InterpretFunction t
funType (SPgReturn _) = undefined
funType (SPgArrow _ ts) = \_ -> funType ts

If we have SPgReturn then we use undefined as our value. Otherwise, we need to form a function, which we can do by simply ignoring the argument and recursing with funType.

So far we've been working with singleton types, but we won't be constructing these directly at runtime - rather we'll be constructing PgFunction values. It's easy to get a singleton out of this though, by using the withSomeSing combinator in the singletons library.

Brimming with excitement, we hack out a bit of code to try this all out!

firstAttempt :: String -> PgFunction -> IO ()
firstAttempt code sig = withSomeSing sig $ \s ->
  runInterpreter $ do
    setImports [("Prelude")]
    interpret code (funType s)

  putStrLn "It didn't crash!"
    No instance for (Typeable (InterpretFunction a))
      arising from a use of `interpret'
    Possible fix:
      add an instance declaration for (Typeable (InterpretFunction a))
    In a stmt of a 'do' block: interpret code (funType s)

Unfortunately, hint is requiring that there is a Typeable instance for a, but there is no evidence that our interpretation of PostgreSQL types as a Haskell function will by Typeable. Of course, we know it is - it's made up from String, Int and (->), all of which are Typeable. How do we prove to GHC that we really do have a Typeable instance?

Edward Kmett blogged about a useful trick in 2011 (more information on this at the Joy of Types) which is exactly what we need. If we define the following data type...

data Dict a where
  Dict :: a => Dict a

Then we get first class type instances! By having first class type instances, we can pass them around as values. Furthermore, pattern matching on them teaches us about our types - specifically teaching us that certain type class instances are present.

To get our feet wet, we can show that all interpretations of single PostgreSQL types are Typeable:

pgTypeTypeable :: SPgType t -> Dict (Typeable (InterpretType t))
pgTypeTypeable SPgInt = Dict
pgTypeTypeable SPgString = Dict

Admittedly, it's a rather queer function. However, we can make use of this as a lemma to build an inductive proof that all interpretations of PostgreSQL functions are Typeable. Our base case is that a function of no arguments is Typeable. Our inductive hypothesis is that a function of n arguments Typeable, and we can use this to show that a function of n+1 arguments is also Typeable:

pgFunTypeTypeable :: SPgFunction t -> Dict (Typeable (InterpretFunction t))
pgFunTypeTypeable (SPgReturn t) = pgTypeTypeable t
pgFunTypeTypeable (SPgArrow t ts) =
  case pgTypeTypeable t of
    Dict ->
      case pgFunTypeTypeable ts of
        Dict -> Dict

I found pgTypeTypeable mind-bending, but this is mind-bending in an extra dimension. However! It does actually do the job, as we're now ready to finally reach our goal, and the following function type checks:

goal :: String -> PgFunType -> IO ()
goal code signature = withSomeSing signature $ \s ->
  case pgFunTypeTypeable s of Dict ->
    f <- runInterpreter $ do
        setImports [("Prelude")]
        interpret code (funType s)

    case f of
      Left error ->
        print error

      Right f' ->
        -- Apply arguments to f'
        return ()

It's not a particularly useful goal, because it doesn't actually call f. To do so, we need a supply of arguments. The topic for this post was to demonstrate how to create the function type, so we'll just cheat here and use Data.Dynamic. (In plhaskell, we recurse through the SPgFunction and peek into pointers that PostgreSQL provides). See the full code listing for source code to the below REPL session:

> final "42" (PgReturn PgInt) []
42

> final "(+)" (PgArrow PgInt (PgArrow PgInt (PgReturn PgInt)))
>   [ toDyn (1 :: Int), toDyn (2 :: Int) ]
3

> final "\"Hello\"" (PgArrow PgInt (PgReturn PgInt)) []
WontCompile [GhcError {errMsg = "Couldn't match expected type `GHC.Types.Int -> GHC.Types.Int'
                                             with actual type `[GHC.Types.Char]'"}]

Voilà.

Conclusion & Acknowledgements

Dependent types are fantastic, but you don't have to switch languages to benefit from them. When combined with the singletons library, GHC 7.6 can do a significant amount of work that only seems possible in Agda or Idris. With all of that said, it's not without pain points. Other languages that are intentionally dependently typed don't need singletons, and a lot of this will feel a lot more natural.

I want to send a massive thank you out to Richard Eisenberg - once for his work on the singletons library and associated research, but also for providing a last bit of hand-holding and teaching me the trick with Dict to get Typeable working.

To close with, we can dispel one more myth:

ocharles> @faq Can I use Haskell with fake dependent types to interpret code with type
information at runtime?
lambdabot> The answer is: Yes! Haskell can do that.

I love this language.