Object Design And Schema

Chris Anderson edited this page Jul 30, 2013 · 12 revisions

These are the primary classes and SQL tables used by Couchbase Lite. I’m describing them in language-neutral terms since I expect there to be multiple implementations.

Database

The heart of Couchbase Lite is the Database class. On disk (or flash storage), it consists of a SQLite database file and an associated directory containing attachments. In memory, it has:

  • a database connection handle
  • a set of View objects representing rows in the Views table
  • a set of Replicator objects representing active replication tasks

Table: “docs”

This table stores document ID strings so they can be represented more compactly as foreign keys in the “revs” table.

| Column | Type | Description |
|---|---|---|
| doc_id | integer | Primary key |
| docid | text | Document ID string |

Table: “revs”

Each row in this table is a revision of a document. It’s used to model the sequence of updates, so that replication can proceed from any point when it connects to a peer.

| Column | Type | Description |
|---|---|---|
| sequence | integer | Sequence number (this is the primary key; it is set to auto-increment without reusing any values) |
| doc_id | integer | Document ID (foreign key) |
| revid | text | Revision ID string |
| parent | integer | Parent revision’s sequence number, or null if no parent (foreign key) |
| current | boolean | Is this a current (leaf) revision? |
| deleted | boolean | Does this revision represent a deletion? |
| json | blob | Document contents in UTF-8 encoded JSON |

Note: To save space, the stored JSON does not include the `_id`, `_rev`, `_deleted` or `_attachments` properties; those are added when the JSON is returned from the API.
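
As a rough illustration, the “docs” and “revs” tables could be declared and used as follows. This is a minimal sketch using Python's `sqlite3` module, not Couchbase Lite's actual DDL or API: the column names come from the tables above, but the constraints and the helpers `put_revision` and `get_revision` are assumptions made for this example. The special properties are written with CouchDB-style underscore prefixes, and `_attachments` is stripped but not re-added here.

```python
import json
import sqlite3

# Hypothetical DDL: column names follow the tables above, constraints are guesses.
SCHEMA = """
CREATE TABLE docs (
    doc_id INTEGER PRIMARY KEY,
    docid  TEXT UNIQUE NOT NULL
);
CREATE TABLE revs (
    sequence  INTEGER PRIMARY KEY AUTOINCREMENT,  -- values are never reused
    doc_id    INTEGER NOT NULL REFERENCES docs(doc_id),
    revid     TEXT NOT NULL,
    parent    INTEGER REFERENCES revs(sequence),
    "current" BOOLEAN,
    deleted   BOOLEAN DEFAULT 0,
    json      BLOB
);
"""

# Properties kept out of the stored JSON (CouchDB-style underscore names).
SPECIAL_KEYS = ("_id", "_rev", "_deleted", "_attachments")


def put_revision(db, docid, revid, body, parent=None, deleted=False):
    """Store one revision and return its sequence number."""
    db.execute("INSERT OR IGNORE INTO docs (docid) VALUES (?)", (docid,))
    (doc_id,) = db.execute(
        "SELECT doc_id FROM docs WHERE docid = ?", (docid,)).fetchone()
    if parent is not None:
        # The parent stops being a leaf once it has a child.
        db.execute('UPDATE revs SET "current" = 0 WHERE sequence = ?', (parent,))
    stored = {k: v for k, v in body.items() if k not in SPECIAL_KEYS}
    cursor = db.execute(
        'INSERT INTO revs (doc_id, revid, parent, "current", deleted, json) '
        "VALUES (?, ?, ?, 1, ?, ?)",
        (doc_id, revid, parent, int(deleted), json.dumps(stored).encode("utf-8")))
    return cursor.lastrowid


def get_revision(db, sequence):
    """Load a revision and re-add the properties that were stripped on save."""
    docid, revid, deleted, raw = db.execute(
        "SELECT docs.docid, revs.revid, revs.deleted, revs.json "
        "FROM revs JOIN docs ON docs.doc_id = revs.doc_id "
        "WHERE revs.sequence = ?", (sequence,)).fetchone()
    body = json.loads(raw.decode("utf-8"))
    body.update({"_id": docid, "_rev": revid})
    if deleted:
        body["_deleted"] = True
    return body
```

With an in-memory database created via `sqlite3.connect(":memory:")` and `executescript(SCHEMA)`, calling `put_revision` twice for the same document (passing the first sequence as `parent`) leaves only the second row marked as current.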

Table: “attachments”

Tracks attachments of revisions and their keys in the content-addressable BlobStore.

| Column | Type | Description |
|---|---|---|
| sequence | integer | Revision that owns this attachment (foreign key) |
| filename | text | Filename of the attachment |
| key | blob | Contents’ key in attachment store (SHA-1 digest of contents) |
| type | text | MIME type |
| length | integer | Content length in bytes |
| encoding | integer | Type of encoding/compression (0 for none, 1 for gzip) |
| encoded_length | integer | Length of encoded data, if there’s an encoding |
| revpos | integer | Generation number (numeric revision prefix) where this attachment was added or changed |

Every ‘revs’ row has associated ‘attachments’ rows for every attachment it contains, not just for attachments added or modified in that revision. This does mean a lot of duplicate ‘attachments’ rows, but it makes attachment lookup faster, and compaction easier.

Table: “views”

Each row in this table is a view definition.

| Column | Type | Description |
|---|---|---|
| view_id | integer | Primary key |
| name | text | Name of view (unique) |
| version | text | Version ID of view definition function; must be changed if the function’s semantics change |
| lastsequence | integer | The last sequence number in “revs” that has been indexed by this view (foreign key) |

View definitions are not stored in the database as source code. They are native functions, represented by function pointers or their equivalent. The client must register each function with its named view when the database is opened.
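
The registration step might look roughly like this. It is only a sketch: `register_view` and the `functions` dictionary are hypothetical names, and resetting the index when the version string changes is an assumption about what "must be changed" implies, not something stated above.

```python
import sqlite3


def register_view(db, functions, name, map_fn, version):
    """Attach a native map function to a named view.

    Returns True if the view is new or its index was reset.
    """
    # The function itself lives only in memory; it is never written to the database.
    functions[name] = map_fn
    row = db.execute(
        "SELECT view_id, version FROM views WHERE name = ?", (name,)).fetchone()
    if row is None:
        db.execute(
            "INSERT INTO views (name, version, lastsequence) VALUES (?, ?, 0)",
            (name, version))
        return True
    view_id, stored_version = row
    if stored_version == version:
        return False
    # Assumption: a changed version makes previously emitted rows untrustworthy,
    # so they are discarded and indexing restarts from sequence 0.
    db.execute("DELETE FROM maps WHERE view_id = ?", (view_id,))
    db.execute(
        "UPDATE views SET version = ?, lastsequence = 0 WHERE view_id = ?",
        (version, view_id))
    return True
```

Keeping the function in a plain dictionary keyed by view name mirrors the idea that the client re-registers every view each time the database is opened.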

Table: “maps”

Each row in this table is a key/value pair emitted by a view’s map function.

| Column | Type | Description |
|---|---|---|
| view_id | integer | View that emitted this row (foreign key) |
| sequence | integer | Revision that emitted this row (foreign key) |
| key | text | JSON-encoded emitted key |
| value | text | JSON-encoded emitted value |

Table: “replicators”

Stores persistent state of replications to/from other databases. The Replicator class uses this.

| Column | Type | Description |
|---|---|---|
| remote | text | URL of remote database |
| push | boolean | Is this a “push” replication, i.e. is ‘remote’ the destination? |
| last_sequence | text | Last sequence processed from the source database (which may or may not be local) |

Table: “localdocs”

Stores local documents. These are not replicated, don’t show up in views, and don’t store previous revisions. They are distinguished by having a document ID prefixed with “_local/”. Their main defined purpose is to store state information for replications.

| Column | Type | Description |
|---|---|---|
| docid | text | Document ID (primary key) |
| revid | text | Current revision ID |
| json | blob | JSON contents |

Table: “info”

A table that stores some persistent per-database information.

| Column | Type | Description |
|---|---|---|
| key | text | Property name (primary key) |
| value | text | Property value |

Currently defined keys are “privateUUID” and “publicUUID”, each of which has a value that’s a randomly generated string. These are used to uniquely identify the source and target databases during replication.

View

The View class is closely tied to the Database. It’s just broken out to give each view a place to store transient data (most importantly the map function pointer) and to make the API and implementation a bit clearer. Each View instance is associated with a row in the “views” table.

Instead of keeping a separate B-tree index for every view, Couchbase Lite has a single “maps” table. It contains a row for every key/value pair that was emitted by a map function of any view. There is no storage of intermediate results from the reduce function, though (at least not yet.)

Before a query, the View object compares its saved lastsequence value (the column in the “views” table) against the highest sequence number in the “revs” table. If they don’t match, it needs to bring the index up to date. To do this it first deletes map rows emitted by obsolete revisions (ones that appear as “parent” values in revs added since lastsequence). Then it iterates over every revision added since lastsequence, calls the map function on it, and adds any emitted key/value pairs to “maps”. Finally it updates its lastsequence.
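
The update steps described above can be sketched as follows. This is an illustration under stated assumptions, not the actual implementation: `update_index` and `MINIMAL_SCHEMA` are hypothetical names, the schema is trimmed to the columns the algorithm touches, and the filter that skips non-current and deleted revisions is an assumption the text does not spell out.

```python
import json
import sqlite3

# Trimmed-down, hypothetical schema: only the columns this sketch needs.
MINIMAL_SCHEMA = """
CREATE TABLE revs (sequence INTEGER PRIMARY KEY AUTOINCREMENT,
                   parent INTEGER, "current" BOOLEAN, deleted BOOLEAN, json BLOB);
CREATE TABLE views (view_id INTEGER PRIMARY KEY, name TEXT UNIQUE,
                    version TEXT, lastsequence INTEGER DEFAULT 0);
CREATE TABLE maps (view_id INTEGER, sequence INTEGER, key TEXT, value TEXT);
"""


def update_index(db, view_id, map_fn):
    """Bring one view's rows in 'maps' up to date; return the indexed sequence."""
    (last,) = db.execute(
        "SELECT lastsequence FROM views WHERE view_id = ?", (view_id,)).fetchone()
    (newest,) = db.execute(
        "SELECT COALESCE(MAX(sequence), 0) FROM revs").fetchone()
    if newest == last:
        return last  # index is already current

    # Step 1: drop rows emitted by revisions that have since been superseded.
    db.execute(
        "DELETE FROM maps WHERE view_id = ? AND sequence IN "
        "(SELECT parent FROM revs WHERE sequence > ? AND parent IS NOT NULL)",
        (view_id, last))

    # Step 2: run the map function over revisions added since the last update.
    # Assumption: only current, non-deleted revisions are indexed.
    new_revs = db.execute(
        'SELECT sequence, json FROM revs WHERE sequence > ? '
        'AND "current" = 1 AND deleted = 0 ORDER BY sequence', (last,)).fetchall()
    for sequence, raw in new_revs:
        emitted = []
        map_fn(json.loads(raw), lambda key, value: emitted.append((key, value)))
        db.executemany(
            "INSERT INTO maps (view_id, sequence, key, value) VALUES (?, ?, ?, ?)",
            [(view_id, sequence, json.dumps(k), json.dumps(v)) for k, v in emitted])

    # Step 3: remember how far the index now reaches.
    db.execute(
        "UPDATE views SET lastsequence = ? WHERE view_id = ?", (newest, view_id))
    return newest
```

A map function here is any callable taking the parsed document and an `emit(key, value)` callback, for example one that emits `doc["title"]` when that property exists.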

BlobStore

The BlobStore stores attachments for a database. It implements a simple content-addressable store of arbitrary-sized blobs of data. A blob is given a unique key that’s its SHA-1 digest, saved to a file named after the key, and then referred to in the database by its key. After the database is compacted, all blob files whose keys no longer appear in the database are deleted.
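
A content-addressable store of this kind is small enough to sketch in full. The class below is a minimal illustration, not the real BlobStore: the method names, the hex-encoded filenames, and the `.blob` suffix are all assumptions made for the example.

```python
import hashlib
import os
import tempfile


class BlobStore:
    """Minimal content-addressable store: each blob is filed under its SHA-1."""

    def __init__(self, directory):
        self.directory = directory
        os.makedirs(directory, exist_ok=True)

    def _path(self, key):
        # Hypothetical naming scheme: hex digest plus a ".blob" suffix.
        return os.path.join(self.directory, key.hex() + ".blob")

    def store(self, data):
        """Write data (bytes) if not already present; return its 20-byte key."""
        key = hashlib.sha1(data).digest()
        path = self._path(key)
        if not os.path.exists(path):  # identical contents are stored only once
            with open(path, "wb") as f:
                f.write(data)
        return key

    def read(self, key):
        with open(self._path(key), "rb") as f:
            return f.read()

    def delete_except(self, live_keys):
        """Remove blobs whose keys are not in live_keys; return how many."""
        keep = {os.path.basename(self._path(k)) for k in live_keys}
        removed = 0
        for name in os.listdir(self.directory):
            if name.endswith(".blob") and name not in keep:
                os.remove(os.path.join(self.directory, name))
                removed += 1
        return removed
```

After compaction, the caller would collect the `key` values still present in the “attachments” table and pass them to `delete_except`. Because the key is derived from the contents, storing the same attachment for many revisions costs one file.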

DatabaseManager

The DatabaseManager object is fairly simple — it represents the collection of named databases owned by the server. It stores:

  • A reference to its root directory (which contains the database files)
  • A dictionary mapping names to Database objects
  • A ReplicatorManager

Server

The Server sits atop a DatabaseManager and provides thread-safety. It creates a single background thread for Couchbase Lite to run on, and its public API lets the client submit tasks that are queued to run one at a time on that background thread. (In the Objective-C implementation these are given as blocks.)
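
The single-thread task queue can be sketched like this. It is an illustration of the pattern only: the `submit` and `close` names, and the use of `concurrent.futures.Future` to hand results back, are assumptions rather than the actual Server API.

```python
import queue
import threading
from concurrent.futures import Future


class Server:
    """Runs every submitted task on one background thread, one at a time."""

    def __init__(self, manager):
        self._manager = manager  # stand-in for the DatabaseManager
        self._tasks = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        while True:
            item = self._tasks.get()
            if item is None:  # sentinel queued by close()
                break
            task, future = item
            try:
                future.set_result(task(self._manager))
            except Exception as exc:
                future.set_exception(exc)

    def submit(self, task):
        """Queue task(manager); returns a Future for its result."""
        future = Future()
        self._tasks.put((task, future))
        return future

    def close(self):
        """Finish the tasks already queued, then stop the background thread."""
        self._tasks.put(None)
        self._thread.join()
```

Since only the background thread ever touches the manager, the objects beneath it need no locking of their own; callers from any thread just wait on the returned future.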

Replicator

Replicator is an abstract class representing an active replication. Its concrete subclasses are Pusher and Puller. Its properties are:

  • the local database object
  • the remote database URL
  • a flag indicating whether the replication is continuous
  • the last revision sequence number/ID transferred (persisted in both the local and remote databases)

ReplicatorManager

The ReplicatorManager is a (per-server) singleton that manages persistent replications. It watches the special database named `_replicator` and maps every document in it to a runtime instance of Replicator. As documents are created and updated it updates the replicators, and as replicator state changes it updates the documents.

Router

Router implements the REST API. An instance is responsible for handling a single request — it’s given the details of an HTTP request as a platform-specific object (e.g. a Cocoa NSURLRequest), interprets the method and path to determine what operation to perform, and then calls a method to perform that operation. The end result is a Response object containing the HTTP status code, headers and body.

Router doesn’t implement an HTTP server. It’s more like a servlet, taking a pre-parsed request and interpreting it. There are platform-specific higher layers (such as `CBLURLProtocol` for Cocoa) that glue Router instances into HTTP infrastructure.
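
The servlet-like shape can be sketched as below. Everything here is hypothetical scaffolding: `Request` and `Response` are simple stand-ins for the platform-specific objects, the databases are plain dictionaries, and the handler names, routes, and status codes are chosen for illustration only.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Request:  # stand-in for a platform-specific, pre-parsed request
    method: str
    path: str
    body: bytes = b""


@dataclass
class Response:
    status: int
    headers: dict = field(default_factory=dict)
    body: bytes = b""


class Router:
    """Handles exactly one request: picks an operation from method and path."""

    KINDS = ("root", "database", "document")  # path depth 0, 1, 2

    def __init__(self, databases, request):
        self.databases = databases  # hypothetical: {db_name: {docid: doc}}
        self.request = request

    def run(self):
        parts = [p for p in self.request.path.split("/") if p]
        if len(parts) >= len(self.KINDS):
            return self._json(404, {"error": "not_found"})
        kind = self.KINDS[len(parts)]
        handler = getattr(self, f"do_{self.request.method.lower()}_{kind}", None)
        if handler is None:
            return self._json(405, {"error": "method_not_allowed"})
        return handler(*parts)

    def _json(self, status, obj):
        return Response(status, {"Content-Type": "application/json"},
                        json.dumps(obj).encode("utf-8"))

    def do_get_root(self):
        return self._json(200, {"databases": sorted(self.databases)})

    def do_put_database(self, name):
        if name in self.databases:
            return self._json(412, {"error": "file_exists"})
        self.databases[name] = {}
        return self._json(201, {"ok": True})

    def do_get_document(self, name, docid):
        doc = self.databases.get(name, {}).get(docid)
        if doc is None:
            return self._json(404, {"error": "not_found"})
        return self._json(200, doc)
```

For example, `Router(dbs, Request("GET", "/db/doc1")).run()` returns a `Response` whose body is the JSON-encoded document. No sockets are involved, so a higher layer can feed in requests from whatever HTTP infrastructure the platform provides.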

Revision

A Revision is a passive value object that bundles together the data of a single revision of a document. It has an immutable document ID and revision ID. It can also have a sequence number and a JSON body, which can be set after the object is created.

The body is abstracted as a Body class, which internally maintains two different representations: raw JSON data, and a pre-parsed object hierarchy. It can be instantiated with either form, and will transparently provide either form when asked, doing the JSON parsing or generation on the fly. This can help avoid unnecessary conversions: for example, when a document’s body is fetched it comes out of the database as JSON data. If there’s no need to manipulate the document before returning it, it can be stuffed directly into the HTTP response as data without having to parse it. But if it does need to be translated to a dictionary (e.g. for a multiple-document request) the Body object will do it.
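
The dual representation can be sketched as a small class that converts lazily and caches the result. The attribute names `as_json` and `properties` are assumptions for this example, not the actual Body interface.

```python
import json


class Body:
    """Holds a document body as raw JSON, a parsed object, or both."""

    def __init__(self, json_data=None, properties=None):
        if json_data is None and properties is None:
            raise ValueError("need raw JSON data or a parsed object")
        self._json = json_data          # bytes, or None until requested
        self._properties = properties   # dict, or None until requested

    @property
    def as_json(self):
        """Raw UTF-8 JSON; generated on first use if only the object exists."""
        if self._json is None:
            self._json = json.dumps(self._properties).encode("utf-8")
        return self._json

    @property
    def properties(self):
        """Parsed object; parsed on first use if only the raw JSON exists."""
        if self._properties is None:
            self._properties = json.loads(self._json.decode("utf-8"))
        return self._properties
```

A body created from database bytes and returned unmodified is never parsed, since `as_json` hands back the very same bytes. One limitation of this sketch is that mutating the parsed object after both forms exist leaves the cached JSON stale, so callers should treat the parsed form as read-only.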
