# The CAP FAQ

Version 1.0, May 9th 2013
By: Henry Robinson / henry.robinson@gmail.com / @henryr
<pre><a href="http://the-paper-trail.org/">http://the-paper-trail.org/</a></pre>

## 0. What is this document?

No subject appears to be more controversial to distributed systems
engineers than the oft-quoted, oft-misunderstood CAP theorem. The
purpose of this FAQ is to explain what is known about CAP, so as to
help those new to the theorem get up to speed quickly, and to settle
some common misconceptions or points of disagreement.

Of course, there's every possibility I've made superficial or
completely thorough mistakes here. Corrections and comments are
welcome: <a href="mailto:henry.robinson+cap@gmail.com">let me have
them</a>.

There are some questions I still intend to answer. For example:

* *What's the relationship between CAP and performance?*
* *What does CAP mean to me as an engineer?*
* *What's the relationship between CAP and ACID?*

Please suggest more.

## 1. Where did the CAP Theorem come from?

Dr. Eric Brewer gave a keynote speech at the Principles of Distributed
Computing conference in 2000 called 'Towards Robust Distributed
Systems' [1]. In it he posed his 'CAP Theorem' - at the time unproven - which
illustrated the tensions between being correct and being always
available in distributed systems.

Two years later, Seth Gilbert and Professor Nancy Lynch - researchers
in distributed systems at MIT - formalised and proved the conjecture
in their paper “Brewer's conjecture and the feasibility of consistent,
available, partition-tolerant web services” [2].

[1] http://www.cs.berkeley.edu/~brewer/cs262b-2004/PODC-keynote.pdf
[2] http://lpd.epfl.ch/sgilbert/pubs/BrewersConjecture-SigAct.pdf

## 2. What does the CAP Theorem actually say?

* Consistency - will all executions of reads and writes seen by all nodes be _atomic_ or _linearizably_ consistent?
* Partition tolerance - the network is allowed to drop any messages.

The next few items define any unfamiliar terms.

More informally, the CAP theorem tells us that we can't build a
database that both responds to every request and returns the results
that you would expect every time. It's an _impossibility_ result - it
tells us that something we might want to do is actually provably out
of reach. It's important now because it is directly applicable to the
many, many distributed systems which have been and are being built in
the last few years, but it is not a death knell: it does _not_ mean
that we cannot build useful systems while working within these
constraints.

The devil is in the details, however. Before you start crying 'yes, but
what about...', make sure you understand the following about exactly
what the CAP theorem does and does not allow.

## 3. What is 'read-write storage'?

CAP specifically concerns itself with a theoretical
construct called a _register_. A register is a data structure with two
operations:

* set(X) sets the value of the register to X
* get() returns the last value set in the register

A key-value store can be modelled as a collection of registers. Even
though registers appear very simple, they capture the essence of what
many distributed systems want to do - write data and read it back.
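
As a rough illustration (the class names here are invented for this
sketch, and the code models a single machine only), a register and a
key-value store built from registers might look like this:

```python
class Register:
    """A read-write register: the object the CAP theorem reasons about."""

    def __init__(self):
        self._value = None

    def set(self, x):
        """set(X): store X in the register."""
        self._value = x

    def get(self):
        """get(): return the last value set."""
        return self._value


class KVStore:
    """A key-value store modelled as one register per key."""

    def __init__(self):
        self._registers = {}

    def set(self, key, value):
        self._registers.setdefault(key, Register()).set(value)

    def get(self, key):
        register = self._registers.get(key)
        return None if register is None else register.get()


store = KVStore()
store.set("x", 10)
assert store.get("x") == 10   # on one machine this behaviour is trivially correct
```

The hard part, of course, is preserving this simple behaviour once the
register is replicated across a network that may drop messages.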

## 4. What does _atomic_ (or _linearizable_) mean?

Atomic, or linearizable, consistency is a guarantee about what values
it's ok to return when a client performs get() operations. The idea is
that the register appears to all clients as though it ran on just one
machine, and responded to operations in the order they arrive.

Consider an execution consisting of the total set of operations
performed by all clients, potentially concurrently. The results of
those operations must, under atomic consistency, be equivalent to a
single serial (in order, one after the other) execution of all
operations.

This guarantee is very strong. It rules out, amongst other guarantees,
_eventual consistency_, which allows a delay before a write becomes
visible: under EC, a get() may legitimately return a stale value even
though a more recent set() has completed. Such an execution is invalid
under atomic consistency.

Atomic consistency also ensures that external communication about the
value of a register is respected. That is, if I read X and tell you
about it, you can go to the register and read X for yourself. It's
possible under slightly weaker guarantees (_serializability_ for
example) for that not to be true. In the following we write A:<set or
get> to mean that client A executes that operation.

B:set(5), A:set(10), A:get() = 10, B:get() = 10

This is an atomic history. But the following is not:

B:set(5), A:set(10), A:get() = 10, B:get() = 5

even though it is equivalent to the following serial history:

B:set(5), B:get() = 5, A:set(10), A:get() = 10

In the second example, if A tells B about the value of the register
(10) after it does its get(), B will falsely believe that some
third-party has written 5 between A:get() and B:get(). If external
communication isn't allowed, B cannot know about A:set, and so sees a
consistent view of the register state; it's as if B:get really did
happen before A:set.
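
If it helps to see the difference mechanically, the following sketch
(my own illustration, not part of the formal treatment) checks small
histories like those above. It assumes every operation completes before
the next one begins, so the linearizability check is simply "replay the
history in the order it happened", while the weaker,
serializability-style check may reorder operations as long as each
client's own order is preserved:

```python
from itertools import permutations

# Operations are (client, kind, value); histories are written in real-time
# order, with every operation finishing before the next one starts.
ATOMIC_HISTORY     = [("B", "set", 5), ("A", "set", 10), ("A", "get", 10), ("B", "get", 10)]
NON_ATOMIC_HISTORY = [("B", "set", 5), ("A", "set", 10), ("A", "get", 10), ("B", "get", 5)]


def legal(serial):
    """Is this serial execution valid for a single register?"""
    value = None
    for _, kind, v in serial:
        if kind == "set":
            value = v
        elif v != value:              # every get must return the last value set
            return False
    return True


def linearizable(history):
    """Non-overlapping operations must take effect in real-time order."""
    return legal(history)


def has_serial_explanation(history):
    """Weaker check: some reordering that keeps each client's own order is legal.
    Real-time order between different clients is ignored."""
    def keeps_client_order(candidate):
        clients = {client for client, _, _ in history}
        return all(
            [op for op in candidate if op[0] == c] == [op for op in history if op[0] == c]
            for c in clients
        )
    return any(legal(p) for p in permutations(history) if keeps_client_order(p))


assert linearizable(ATOMIC_HISTORY)
assert not linearizable(NON_ATOMIC_HISTORY)        # B reads 5 after A already read 10
assert has_serial_explanation(NON_ATOMIC_HISTORY)  # the 'equivalent serial history' above
```

The second history fails the atomic check but passes the weaker one,
which is exactly the gap that external communication exposes.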

Wikipedia [1] has more information. Maurice Herlihy and Jeannette
Wing's original paper from 1990 is available at [2].

[1] http://en.wikipedia.org/wiki/Linearizability
[2] http://cs.brown.edu/~mph/HerlihyW90/p463-herlihy.pdf

## 5. What does _asynchronous_ mean?

restrictive failure model. We'll call these kinds of partitions _total
partitions_.

The proof of CAP relied on a total partition. In practice,
these are arguably the most likely since all messages may flow through
one component; if that fails then message loss is usually total
between two nodes.

## 7. Why is CAP true?

The basic idea is that if a client writes to one side of a partition,
any reads that go to the other side of that partition can't possibly
know about the most recent write. Now you're faced with a choice: do
you respond to the reads with potentially stale information, or do you
wait (potentially forever) to hear from the other side of the
partition and compromise availability?

This is a proof by _construction_ - we demonstrate a single situation
where a system cannot be consistent and available. One reason that CAP
gets some press is that this constructed scenario is not completely
unrealistic. It is not uncommon for a total partition to occur if
networking equipment should fail.
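
The construction is easy to sketch in code. The toy `Replica` class
below is purely illustrative (no real system is this simple): two
replicas lose their link, a write lands on one side, and a read on the
other side must either answer with possibly stale data or refuse to
answer at all:

```python
class Replica:
    """One of two register replicas; a toy model of the CAP construction."""

    def __init__(self, name):
        self.name = name
        self.value = None
        self.peer = None
        self.partitioned = False      # when True, messages to the peer are lost

    def write(self, x):
        self.value = x
        if not self.partitioned:
            self.peer.value = x       # replication only succeeds if the link is up

    def read(self, prefer_availability):
        if self.partitioned and not prefer_availability:
            # Consistent choice: refuse to answer until the peer is reachable again.
            raise TimeoutError(f"{self.name}: cannot reach peer, giving up availability")
        # Available choice: answer now, possibly with a stale value, giving up consistency.
        return self.value


left, right = Replica("left"), Replica("right")
left.peer, right.peer = right, left

left.partitioned = right.partitioned = True    # the network partitions
left.write(42)                                 # a client writes on the left side

print(right.read(prefer_availability=True))    # prints None: available, but stale
# right.read(prefer_availability=False)        # would raise: consistent, but unavailable
```

Neither branch is wrong in itself; the point is that while the
partition lasts you cannot take both.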

## 8. When does a system have to give up C or A?

CAP only guarantees that there is _some_ circumstance in which a
system must give up either C or A. Let's call that circumstance a
_critical condition_. The theorem doesn't say anything about how
likely that critical condition is. Both C and A are strong guarantees:
they hold only if 100% of operations meet their requirements. A single
inconsistent read, or unavailable write, invalidates either C or
A. But until that critical condition is met, a system can be happily
consistent _and_ available and not contravene any known laws.

Since most distributed systems are long running, and may see millions
of requests in their lifetime, CAP tells us to be cautious: there's a
good chance that you'll realistically hit one of these critical
conditions, and it's prudent to understand how your system will fail
to meet either C or A.

## 9. Why do some people get annoyed when I characterise my system as CA?

Brewer's keynote, the Gilbert paper, and many other treatments place
C, A and P on an equal footing as desirable properties of an
implementation and effectively say 'choose two!'. However, this is
often considered to be a misleading presentation, since you cannot
build - or choose! - 'partition tolerance': your system either might
experience partitions or it won't.

CAP is better understood as describing the tradeoffs you
have to make when you are building a system that may suffer
partitions. In practice, this is every distributed system: there is no
100% reliable network. So (at least in the distributed context) there
is no way to avoid partitions entirely, and therefore you must at some
point compromise C or A.

Therefore it's arguably more instructive to rewrite the theorem as the
following:

<pre>Possibility of Partitions => Not (C and A)</pre>

i.e. if your system may experience partitions, you can not always be C
and A.

There are some systems that won't experience partitions - single-site
databases, for example. These systems aren't generally relevant to the
contexts in which CAP is most useful. If you describe your distributed
database as 'CA', you are misunderstanding something.

## 10. What about when messages don't get lost?

A perhaps surprising result from the Gilbert paper is that no
implementation of an atomic register in an asynchronous network can be
available at all times, and consistent only when no messages are lost.

This result depends upon the asynchronous network property: because it
is impossible to tell whether a message has been dropped, a node cannot
wait indefinitely for a response while still maintaining availability,
but if it responds too early it might be inconsistent.

## 11. Is my network really asynchronous?

Arguably, yes. Different networks have vastly differing
characteristics. If there is any risk that messages may be arbitrarily
delayed or lost, then your network may be considered _asynchronous_.

Gilbert and Lynch also proved that in a _partially-synchronous_
system, where nodes have shared but not synchronised clocks and there
is a bound on the processing time of every message, it is still
impossible to implement available atomic storage.

However, the result from #10 does _not_ hold in the
partially-synchronous model; it is possible to implement atomic
storage that is available all the time, and consistent when all
messages are delivered.

## 12. What, if any, is the relationship between FLP and CAP?

The Fischer, Lynch and Paterson theorem ('FLP') (see [1] for a link
to the paper and a proof explanation) is an extraordinary
impossibility result from nearly thirty years ago, which determined
that the problem of consensus - having all nodes agree on a common
value - is unsolvable in general in asynchronous networks where one
node might fail.

The FLP result is not directly related to CAP, although they are
similar in some respects. Both are impossibility results about
problems in distributed systems, but there are significant differences
that distinguish FLP from CAP; in particular:

* FLP deals with _consensus_, which is a similar but different problem
to _atomic storage_.

For a bit more on this topic, consult the blog post at [2].

[1] http://the-paper-trail.org/blog/a-brief-tour-of-flp-impossibility/
[2] http://the-paper-trail.org/blog/flp-and-cap-arent-the-same-thing/

## 13. Are C and A 'spectrums'?

It is possible to relax both consistency and availability guarantees
from the strong requirements that CAP imposes and get useful
systems. In fact, the whole point of CAP is that you
_must_ do this, and any system you have designed and built relaxes one
or both of these guarantees. The onus is on you to figure out when,
and how, this occurs.

Systems for whom consistency is of the utmost importance, like
ZooKeeper, will relax availability guarantees; other systems, like
Amazon's Dynamo, relax consistency in order to maintain high degrees
of availability.

Once you weaken any of the assumptions made in the statement or proof
of CAP, you have to start again when it comes to proving an
impossibility result.

## 14. Is a failed machine the same as a partitioned one?

It is possible to prove a similar result about the impossibility of
atomic storage in an asynchronous network when there are up to N-1
failures. This result has ramifications for the tradeoff between how
many nodes you write to (which is a performance concern) and how fault
tolerant you are (which is a reliability concern).
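
As a rough, back-of-the-envelope illustration (this quorum-style
accounting is mine, not the paper's): if every write must be
acknowledged by some number of replicas before it completes, waiting
for more acknowledgements buys fault tolerance at the cost of write
latency:

```python
def crashes_survived(acks_required):
    """A write acknowledged by `acks_required` replicas is durable as long as
    at least one acknowledging replica survives, so in the worst case it
    tolerates acks_required - 1 replica crashes."""
    return acks_required - 1


REPLICAS = 5
for acks in range(1, REPLICAS + 1):
    print(f"wait for {acks}/{REPLICAS} acks per write -> "
          f"survives {crashes_survived(acks)} crashed replicas, "
          f"but each write waits for the slowest of {acks} replies")
```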

## 15. Is a slow machine the same as a partitioned one?

No: messages eventually get delivered to a slow machine, but they
never get delivered to a totally partitioned one. However, slow
machines play a significant role in making it very hard to distinguish
between lost messages (or failed machines) and a slow machine. This
difficulty is right at the heart of why CAP, FLP and other results are
true.
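
To see that difficulty concretely, here is a toy sketch (entirely
illustrative; the delays and timeout are made up) in which a client
probes a node and uses a timeout to decide whether it is reachable. A
slow node and a partitioned node look exactly the same to it:

```python
import queue
import threading
import time


def node(inbox, outbox, delay):
    """Answer each request after `delay` seconds; delay=None models a partition."""
    while True:
        inbox.get()
        if delay is None:
            continue                  # partitioned: the reply is never sent
        time.sleep(delay)
        outbox.put("pong")


def probe(delay, timeout=0.1):
    """Ping a node and use a timeout to guess whether it is reachable."""
    inbox, outbox = queue.Queue(), queue.Queue()
    threading.Thread(target=node, args=(inbox, outbox, delay), daemon=True).start()
    inbox.put("ping")
    try:
        outbox.get(timeout=timeout)
        return "looks alive"
    except queue.Empty:
        return "looks partitioned (or dead)"


print(probe(delay=0.01))   # fast node: looks alive
print(probe(delay=0.5))    # slow node: wrongly looks partitioned (or dead)
print(probe(delay=None))   # partitioned node: indistinguishable from the slow one
```

Waiting longer before deciding makes misclassification less likely, but
costs responsiveness; that tension is what these impossibility results
exploit.
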
## 16. Have I 'got around' or 'beaten' the CAP theorem?