The OCaml-TLS architecture

The OCaml ecosystem has several distinct ways of interacting with the outside world (and the network in particular): straightforward unix interfaces and the asynchronous programming libraries lwt and async. One of the early considerations was not to restrict ourselves to any of those -- we wanted to support them all.

There were also two distinct basic "platforms" we wanted to target from the outset: the case of a simple executable, and the case of Mirage unikernels.

So one of the first questions we faced was deciding how to represent interactions with the network in a portable way. This can be done by systematically abstracting out the API boundary which gives access to network operations, but we had a third thing in mind as well: we wanted to exploit the functional nature of OCaml to its fullest extent!

Our various prior experiences with Haskell and Idris convinced us to adopt what is called "purely functional" technique. We believe it to be an approach which first forces the programmer to give principled answers to all the difficult design questions (errors and global data-flow) in advance, and then leads to far cleaner and composable code later on. A purely functional system has all the data paths made completely explicit in the form of function arguments and results. There are no unaccounted-for interactions between components mediated by shared state, and all the activity of the parts of the system is exposed through types since, after all, it's only about computing values from values.

For these reasons, the library is split into two parts: the directory /lib (and the corresponding findlib package tls) contains the core TLS logic, and /mirage and /lwt (packaged as tls.mirage and tls.lwt respectively) contain front-ends that tie the core to Mirage and Lwt_unix.

Core

The core library is purely functional. A TLS session is represented by the abstract type Tls.Engine.state, and various functions consume this session type together with raw bytes (Cstruct.t -- which is by itself mutable, but ocaml-tls eschews this) and produce new session values and resulting buffers.

The central entry point is handle_tls, which transforms an input state and a buffer to an output state, a (possibly empty) buffer to send to the communication partner, and an optional buffer of data intended to be received by the application:

type state

type ret = [
  | `Ok of [ `Ok of state | `Eof | `Alert of alert ] *
      [ `Response of Cstruct.t ] * [ `Data of Cstruct.t option ]
  | `Fail of alert * [ `Response of Cstruct.t ]
]

val handle_tls : state -> Cstruct.t -> ret

As the signature shows, errors are signalled through the ret type, which is a polymorphic variant. This reflects the actual internal structure: all the errors are represented as values, and operations are composed using an error monad.

Other entry points share the same basic behaviour: they transform the prior state and input bytes into the later state and output bytes.

Here's a rough outline of what happens in handle_tls:

TLS packets consist of a header, which contains the protocol version, length, and content type, and the payload of the given content type. Once inside our main handler, we separate the buffer into TLS records, and process each individually. We first check that the version number is correct, then decrypt, and verify the mac.
Decrypted data is then dispatched to one of four sub-protocol handlers (Handshake, Change Cipher Spec, Alert and Application Data). Each handler can return a new handshake state, outgoing data, application data, the new decryption state or an error (with the outgoing data being an interleaved list of buffers and new encryption states).
The outgoing buffers and the encryption states are traversed to produce the final output to be sent to the communication partner, and the final encryption, decryption and handshake states are combined into a new overall state which is returned to the caller.

Handshake is (by far) the most complex TLS sub-protocol, with an elaborate state machine. Our client and server encode this state as a "flat" sum type, with exactly one incoming message allowed per state. The handlers first parse the handshake packet (which fails in case of malformed or unknown data) and then dispatch it to the handling function. The handshake state is carried around and a fresh one is returned from the handler in case it needs updates. It consists of a protocol version, the handshake state, configuration, renegotiation data, and possibly a handshake fragment.

Logic of both handshake handlers is very localised, and does not mutate any global data structures.

Core API

OCaml permits the implementation a module to be exported via a more abstract signature that hides the internal representation details. Our public API for the core library consists of the Tls.Engine and Tls.Config modules.

Tls.Engine contains the basic reactive function handle_tls, mentioned above, which processes incoming data and optionally produces a response, together with several operations that allow one to initiate message transfer like send_application_data (which processes application-level messages for sending), send_close_notify (for sending the ending message) and reneg (which initiates full TLS renegotiation).

The module also contains the only two ways to obtain the initial state:

val client : Config.client -> (state * Cstruct.t)
val server : Config.server -> state

That is, one needs a configuration value to create it. The Cstruct.t that client emits is the initial Client Hello since in TLS, the client starts the session.

Tls.Config synthesizes configurations, separately for client and server endpoints, through the functions client_exn and server_exn. They take a number of parameters that define a TLS session, check them for consistency, and return the sanitized config value which can be used to create a state and, thus, a session. If the check fails, they raise an exception.

The parameters include the pair of a certificate and its private key for the server, and an X509.Authenticator.t for the client, both produced by our ocaml-x509 library and described in a [previous article][x509-intro].

This design reflects our attempts to make the API as close to "fire and forget" as we could, given the complexity of TLS: we wanted the library to be relatively straightforward to use, have a minimal API footprint and, above all, fail very early and very loudly when misconfigured.

Effectful front-ends

Clearly, reading and writing network data does change the state of the world. Having a pure value describing the state of a TLS session is not really useful once we write something onto the network; it is certainly not the case that we can use more than one distinct state to process further data, as only one value is in sync with the other endpoint at any given time.

Therefore we wrap the core types into stateful structures loosely inspired by sockets and provide IO operations on those. The structures of mirage and lwt front-ends mirror one another.

In both cases, the structure is pull-based in the sense that no processing is done until the client requires a read, as opposed to a callback-driven design where the client registers a callback and the library starts spinning in a listening loop and invoking it as soon as there is data to be processed. We do this because in an asynchronous context, it is easy to create a callback-driven interface from a demand-driven one, but the opposite is possible only with unbounded buffering of incoming data.

One exception to demand-driven design is the initial session creation: the library will only yield the connection after the first handshake is over, ensuring the invariant that it is impossible to interact with a connection if it hasn't already been fully established.

Mirage

The Mirage interface matches the FLOW signature (with additional TLS-specific operations). We provide a functor that needs to be applied to an underlying TCP module, to obtain a TLS transport on top. For example:

module Server (Stack: STACKV4) (KV: KV_RO) =
struct

  module TLS  = Tls_mirage.Make (Stack.TCPV4)
  module X509 = Tls_mirage.X509 (KV) (Clock)

  let accept conf flow =
    TLS.server_of_tcp_flow conf flow >>= function
    | `Ok tls ->
      TLS.read tls >>= function
      | `Ok buf ->
        TLS.write tls buf >>= fun () -> TLS.close buf

  let start stack e kv =
    lwt authenticator = X509.authenticator kv `Default in
    let conf          = Tls.Config.server_exn ~authenticator () in
    Stack.listen_tcpv4 stack 4433 (accept conf) ;
    Stack.listen stack

end

Lwt

The lwt interface has two layers. Tls_lwt.Unix is loosely based on read/write operations from Lwt_unix and provides in-place update of buffers. read, for example, takes a Cstruct.t to write into and returns the number of bytes read. The surrounding module, Tls_lwt, provides a simpler, Lwt_io-compatible API built on top:

let main host port =
  lwt authenticator = X509_lwt.authenticator (`Ca_dir nss_trusted_ca_dir) in
  lwt (ic, oc)      = Tls_lwt.connect ~authenticator (host, port) in
  let req = String.concat "\r\n" [
    "GET / HTTP/1.1" ; "Host: " ^ host ; "Connection: close" ; "" ; ""
  ] in
  Lwt_io.(write oc req >>= fun () -> read ic >>= print)

We have further plans to provide wrappers for Async and plain Unix in a similar vein.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

design.md

design.md

The OCaml-TLS architecture

Core

Core API

Effectful front-ends

Files

design.md

Latest commit

History

design.md

File metadata and controls

The OCaml-TLS architecture

Core

Core API

Effectful front-ends