This repository was archived by the owner on Oct 17, 2022. It is now read-only.

Commit b5d1cd7

RFC for Mango on FDB
1 parent feb33ff commit b5d1cd7

1 file changed, +154 -0 lines changed

rfcs/006-mango-fdb.md

Lines changed: 154 additions & 0 deletions
@@ -0,0 +1,154 @@
# Mango RFC

- - - -
name: Formal RFC
about: Submit a formal Request For Comments for consideration by the team.
title: 'Mango JSON indexes in FoundationDB'
labels: rfc, discussion
assignees: ''

- - - -

[NOTE]: # ( ^^ Provide a general summary of the RFC in the title above. ^^ )

# Introduction

This document describes the data model and index management for Mango JSON indexes in FoundationDB.

## Abstract

This document details the data model for storing Mango indexes. The basic model has one namespace for storing index definitions and a dedicated namespace per index for that index's key/value pairs. Indexes will be updated in the same transaction in which a document is written to FoundationDB. When an index is created on an existing database, a background task will build the index up to the `Sequence` at which the index was created.

## Requirements Language

[NOTE]: # ( Do not alter the section below. Follow its instructions. )

The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”,
“SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this
document are to be interpreted as described in
[RFC 2119](https://www.rfc-editor.org/rfc/rfc2119.txt).

## Terminology

`Sequence`: a 13-byte value formed by combining the current `Incarnation` of the database and the `Versionstamp` of the transaction. Sequences are monotonically increasing even when a database is relocated across FoundationDB clusters. See RFC002 (LINK TBD) for a full explanation.
- - - -

# Detailed Description

Mango is a declarative JSON querying syntax that allows a user to retrieve documents based on a given selector. It supports defining indexes, which improve query performance. In CouchDB 2.x, Mango is a query layer built on top of Map/Reduce indexes. Each Mango query follows a two-step process. First, a subset of the selector is converted into a map query that is run against a predefined index, falling back to `_all_docs` if no index is available. Second, each document retrieved from the index is matched against the full query selector.

In a future release of CouchDB built on FoundationDB, the external behaviour of Mango will remain the same, but internally Mango will have its own indexes and index management. This allows Mango indexes to be updated in the same transaction as the write request (index on write). Later we can also look at adding Mango-specific functionality.
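
The two-step process can be illustrated with a minimal sketch. It is not CouchDB's implementation: the `matches` helper, the in-memory `age_index` and the three-operator subset are assumptions made for illustration only.

```python
import bisect

# Comparison operators supported by this sketch; Mango defines many more.
OPERATORS = {
    "$eq": lambda value, operand: value == operand,
    "$gt": lambda value, operand: value is not None and value > operand,
    "$lt": lambda value, operand: value is not None and value < operand,
}


def matches(doc, selector):
    """Return True if the document satisfies every condition in the selector."""
    for field, condition in selector.items():
        value = doc.get(field)
        if not isinstance(condition, dict):
            condition = {"$eq": condition}  # {"city": "Paris"} is shorthand for $eq
        for op, operand in condition.items():
            if not OPERATORS[op](value, operand):
                return False
    return True


docs = {
    "1": {"_id": "1", "age": 25, "city": "Paris"},
    "2": {"_id": "2", "age": 35, "city": "Rome"},
    "3": {"_id": "3", "age": 40, "city": "Paris"},
}

# Hypothetical index on "age": sorted (age, _id) pairs.
age_index = sorted((doc["age"], doc["_id"]) for doc in docs.values())
selector = {"age": {"$gt": 30}, "city": "Paris"}

# Step 1: use the part of the selector that the index covers to pick a key range.
ages = [age for age, _ in age_index]
start = bisect.bisect_right(ages, selector["age"]["$gt"])
candidates = [docs[doc_id] for _, doc_id in age_index[start:]]

# Step 2: match every candidate document against the full selector.
result = [doc for doc in candidates if matches(doc, selector)]
print([doc["_id"] for doc in result])
```

Only the `age` condition narrows the index scan here; the `city` condition is checked per document in the second step.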

## Data Model

### Index Definitions

A Mango index is defined as:

```json
{
  "name": "view-name",
  "index": {
    "fields": ["fieldA", "fieldB"]
  },
  "partial_filter_selector": {}
}
```

`name` is optional and will be auto-generated if omitted. `index.fields` lists the fields to be indexed. `partial_filter_selector` is an optional filter that documents must pass before being added to the index.

The above index definition would be stored in FoundationDB as:

`(?DATABASE, ?INDEX_DEFINITIONS, <fieldname1>, …<rest of fields>) = (<index_name>, <partial_filter_selector>, build_status, sequence)`

`build_status` has two possible values: `active`, which indicates the index is ready to service queries, and `building`, which indicates the index is still being built. `sequence` is the sequence at which the index was created. Nested fields defined in the index would be stored as packed tuples.
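
As a rough illustration of this layout, the sketch below models the key and value as plain Python tuples rather than FoundationDB-encoded bytes. The `DATABASE` and `INDEX_DEFINITIONS` values, the `definition_entry` helper and the way a nested field is split into a nested tuple are all assumptions made for the example.

```python
# Hypothetical stand-ins for the ?DATABASE and ?INDEX_DEFINITIONS prefixes.
DATABASE = "db1"
INDEX_DEFINITIONS = "index_definitions"


def definition_entry(fields, name, partial_filter_selector, build_status, sequence):
    """Return the (key, value) pair that would describe one index definition."""
    if build_status not in ("active", "building"):
        raise ValueError("build_status must be 'active' or 'building'")
    # Assumption: a nested field such as "address.city" is stored as a nested tuple.
    packed = tuple(tuple(f.split(".")) if "." in f else f for f in fields)
    key = (DATABASE, INDEX_DEFINITIONS) + packed
    value = (name, partial_filter_selector, build_status, sequence)
    return key, value


# A new index on an existing database starts in the `building` state.
# The 13 zero bytes are a placeholder for a real Sequence value.
key, value = definition_entry(
    ["fieldA", "address.city"], "view-name", {}, "building", b"\x00" * 13
)
print(key)
print(value)
```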

### Indexes

Each index in the index definitions would have its own key space, in which the database’s documents are indexed and sorted by the fields named in the index’s definition. The data model for each defined index would be:

`(?DATABASE, ?INDEXES, ?INDEX_NAME, <indexed_field>, …<other indexed fields>, _id) = null`

The `_id` is kept to avoid duplicate keys and to be used to retrieve the full document for a Mango query.
For now the value will be null. Later we can look at storing covering indexes, aggregate values or materialised views.
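
The sketch below shows how such keys could be derived from documents, again using plain tuples in a dictionary in place of FoundationDB. The `INDEXES` value and the `index_key` helper are hypothetical, and skipping documents that lack an indexed field is an assumption, not something this RFC specifies.

```python
INDEXES = "indexes"  # hypothetical stand-in for the ?INDEXES prefix


def index_key(db, index_name, fields, doc):
    """Build the index key for one document, or None if a field is missing."""
    if any(f not in doc for f in fields):
        return None  # assumption: documents lacking an indexed field are skipped
    values = tuple(doc[f] for f in fields)
    return (db, INDEXES, index_name) + values + (doc["_id"],)


docs = [
    {"_id": "doc3", "age": 40, "city": "Paris"},
    {"_id": "doc1", "age": 25, "city": "Rome"},
    {"_id": "doc2", "age": 25, "city": "Rome"},
    {"_id": "doc4", "city": "Oslo"},
]

store = {}
for doc in docs:
    key = index_key("db1", "by_age_city", ["age", "city"], doc)
    if key is not None:
        store[key] = None  # the value is null for now

# Reading the keys in sorted order yields the documents in index order.
ordered_ids = [key[-1] for key in sorted(store)]
print(ordered_ids)
```

`doc1` and `doc2` share the same indexed values, and the trailing `_id` is what keeps their keys distinct.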

### Key sorting

In CouchDB 2.x, ICU collation is used to sort string keys when they are added to the index’s b-tree. The current way of using ICU string collation won’t work with FoundationDB. To resolve this, strings will be converted to an ICU sort string before being stored in FDB. This adds performance overhead, but it is only incurred once, when a key is written into the index.

CouchDB has a defined [index collation specification](http://docs.couchdb.org/en/stable/ddocs/views/collation.html#collation-specification) that the new Mango design must adhere to. Each key added to a Mango index will be converted into a composite key or tuple, with the first value in the tuple representing the type of the key so that it is sorted correctly. Below is an example of the type keys to be used:

* `\x00` NULL
* `\x26` False
* `\x27` True
* `\x30` Numbers
* `\x40` Text converted into a sort string
* `\x50` Array
* `\x60` Objects

For example, the number key `1` would be stored as `(\x30, 1)`. Note that null and boolean values won’t need to be composite keys, because the type key itself is the value.
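
The effect of the type keys can be sketched as follows. This is an illustration under stated assumptions: `sort_string` is a crude stand-in for a real ICU sort string (it only normalises and case-folds), and null and boolean values are wrapped in one-element tuples purely so that Python can compare every key as a tuple.

```python
import unicodedata

# Type keys from the list above.
NULL, FALSE, TRUE, NUMBER, TEXT, ARRAY, OBJECT = 0x00, 0x26, 0x27, 0x30, 0x40, 0x50, 0x60


def sort_string(text):
    """Stand-in for an ICU sort string; NOT equivalent to ICU collation."""
    return unicodedata.normalize("NFKD", text).casefold()


def encode(value):
    """Convert a JSON value into a type-tagged tuple that sorts by type first."""
    if value is None:
        return (NULL,)
    if value is False:
        return (FALSE,)
    if value is True:
        return (TRUE,)
    if isinstance(value, (int, float)):
        return (NUMBER, value)
    if isinstance(value, str):
        return (TEXT, sort_string(value))
    if isinstance(value, list):
        return (ARRAY, tuple(encode(v) for v in value))
    if isinstance(value, dict):
        return (OBJECT, tuple((encode(k), encode(v)) for k, v in value.items()))
    raise TypeError(f"unsupported JSON value: {value!r}")


values = ["B", 3, None, True, [1], {"a": 1}, False, "a", 1.5]
ordered = sorted(values, key=encode)
print(ordered)
```

Because the type key is compared first, values of different types never need to be compared with each other directly.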

### Index Limits

This design has certain defined limits for it to work correctly:

* The index definition (name, fields and partial_filter_selector) cannot exceed the 100 KB FDB value limit.
* The sorted keys for an index cannot exceed the 10 KB key limit.
* To be able to update the index in the same transaction in which a document is updated, there will have to be a limit on the number of Mango indexes for a database, so that the transaction stays within the 10 MB transaction limit. This limit is still TBD based on testing.
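
A minimal sketch of how these limits might be checked before a write is shown below. The byte values are FoundationDB's documented limits; the helper names are hypothetical, and JSON is used as a stand-in for the real key and value encoding, so the computed sizes are only approximate.

```python
import json

# FoundationDB limits in bytes, matching the figures in the list above.
KEY_LIMIT = 10_000
VALUE_LIMIT = 100_000
TRANSACTION_LIMIT = 10_000_000


def encoded_size(obj):
    """Approximate the stored size; JSON stands in for the real encoding."""
    return len(json.dumps(obj, separators=(",", ":")).encode("utf-8"))


def check_definition(name, fields, partial_filter_selector):
    """Reject an index definition that would not fit in a single value."""
    size = encoded_size([name, fields, partial_filter_selector])
    if size > VALUE_LIMIT:
        raise ValueError(f"index definition is {size} bytes, limit is {VALUE_LIMIT}")
    return size


def check_index_key(key):
    """Reject an index key that would exceed the key limit."""
    size = encoded_size(key)
    if size > KEY_LIMIT:
        raise ValueError(f"index key is {size} bytes, limit is {KEY_LIMIT}")
    return size


def fits_in_transaction(doc_size, index_key_sizes):
    """Rough estimate: one document write plus one key per index."""
    return doc_size + sum(index_key_sizes) <= TRANSACTION_LIMIT


definition_size = check_definition("view-name", ["fieldA", "fieldB"], {})
key_size = check_index_key(["db1", "indexes", "view-name", "x", 1, "doc1"])

try:
    check_index_key(["x" * 20_000])
    too_large = False
except ValueError:
    too_large = True
print(definition_size, key_size, too_large)
```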

## Index building and management

When an index is created on an existing database, the index will need to be built for all existing documents in the database. The process for building a new index would be:

1. When a user defines a new index on an existing database, save the index definition along with the `sequence` the index was added at, and set the `build_status` to `building` so it won’t be used to service queries.
2. Any write request (document update) after that must read the new index definition and update the index. When updating the new index, the index writers should assume that previous versions of the document have already been indexed.
3. At the same time, a background process will start reading sections of the changes feed and building the index. This background process will keep processing changes until it reaches the sequence at which the index was saved. Once it reaches that point, the index is up to date, `build_status` will be marked as `active` and the index will be used to service queries.
4. There is some subtle behaviour around step 3 that is worth mentioning. The background process is subject to the 5 second transaction limit, so it will process the changes feed in smaller parts. This means it won’t have one consistent view of the changes feed throughout the index building process. A conflict can therefore occur when a background process transaction is adding a document to the index while a write request transaction is updating the same document. There are two possible outcomes. If the background process wins, the write request will get a conflict. The write request will then try to process the document again: it reads the old values for that document, removes them from the index and adds the new values to the index. If the write request wins and the background process gets a conflict, the background process can try again. By then the document will have been removed from its old position in the changes feed and moved to a later position, so the background process won’t see the document and will move on to the next one.
5. An index progress tracker will also be added. This will use `doc_count` for the database, together with a counter that the background workers increment by the number of documents they updated in each batch. The counter would also be updated on write requests while the index is in building mode.
6. Something to explore is splitting the building of the index across multiple workers. It should be possible to use the [`get_boundary_keys`](https://apple.github.io/foundationdb/api-python.html?highlight=boundary_keys#fdb.locality.fdb.locality.get_boundary_keys) API call on the changes feed to get the full list of changes feed keys grouped by partition boundaries, and then split that list among the workers.
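
The interaction between the background builder and a concurrent write can be sketched with a small in-memory simulation. It is a simplification under stated assumptions: the dictionaries stand in for FoundationDB key spaces, integers stand in for sequences, each `next()` call on the generator plays the role of one background transaction, and transaction conflicts are not modelled. All names are hypothetical.

```python
def index_entry(doc, field):
    return (doc[field], doc["_id"])


def write_doc(db, index, doc):
    """Write request: update the document, changes feed and index together."""
    old = db["docs"].get(doc["_id"])
    if old is not None:
        # Assume the previous version was already indexed and remove it.
        index["entries"].discard(index_entry(old, index["field"]))
        old_seq = next(s for s, d in db["changes"].items() if d == doc["_id"])
        del db["changes"][old_seq]  # the document moves to a later position
    db["seq"] += 1
    db["docs"][doc["_id"]] = doc
    db["changes"][db["seq"]] = doc["_id"]
    index["entries"].add(index_entry(doc, index["field"]))
    if index["build_status"] == "building":
        index["progress"] += 1


def build_index(db, index, batch_size=2):
    """Background builder: one batch of the changes feed per iteration."""
    last_seq = 0
    while True:
        batch = [
            (seq, doc_id)
            for seq, doc_id in sorted(db["changes"].items())
            if last_seq < seq <= index["sequence"]
        ][:batch_size]
        if not batch:
            break
        for _, doc_id in batch:
            index["entries"].add(index_entry(db["docs"][doc_id], index["field"]))
            index["progress"] += 1
        last_seq = batch[-1][0]
        yield last_seq
    index["build_status"] = "active"


db = {
    "seq": 3,
    "docs": {k: {"_id": k, "x": v} for k, v in [("a", 1), ("b", 2), ("c", 3)]},
    "changes": {1: "a", 2: "b", 3: "c"},
}
# Step 1: save the definition with the current sequence, in `building` state.
index = {"field": "x", "sequence": db["seq"], "build_status": "building",
         "entries": set(), "progress": 0}

builder = build_index(db, index)
next(builder)                                  # first batch indexes "a" and "b"
write_doc(db, index, {"_id": "c", "x": 30})    # concurrent update moves "c" to seq 4
next(builder, None)                            # nothing left up to seq 3, so mark active
print(index["build_status"], sorted(index["entries"]), index["progress"])
```

In this run the builder never sees `c`, because the write moved it past the sequence at which the index was created; the write request itself indexed the new version.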

## Advantages

* Indexes are kept up to date when documents are changed, meaning you can read your own writes.
* Makes Mango indexes first-class citizens and opens up the opportunity to create more Mango-specific functionality.

## Disadvantages

* FoundationDB currently does not allow CouchDB to do the document selector matching at the shard level. However, there is a discussion about this: [Feature Request: Predicate pushdown](https://forums.foundationdb.org/t/feature-request-predicate-pushdown/954)

## Key Changes

* Mango indexes will be stored separately from Map/Reduce indexes.
* Mango indexes will be updated when a document is updated.
* A background process will build a new Mango index on an existing database.
* There are specific index limits, listed in the Index Limits section.

Index limitations aside, this design preserves all of the existing API options
for working with CouchDB documents.

## Applications and Modules affected

TBD depending on exact code layout going forward.

## HTTP API additions

None.

## HTTP API deprecations

None.

# Security Considerations

None have been identified.

# References

[Original mailing list discussion](https://lists.apache.org/thread.html/b614d41b72d98c7418aa42e5aa8e3b56f9cf1061761f912cf67b738a@%3Cdev.couchdb.apache.org%3E)

# Acknowledgements

Thanks to the following for participating in the design discussion:

* @kocolosk
* @willholley
* @janl
* @alexmiller-apple
