Commit cae71cd

Document Seq No powered optimistic concurrency control (#37284)
Add documentation to describe the new sequence number powered optimistic concurrency control Relates #36148 Relates #10708
1 parent 1eba1d1 commit cae71cd

5 files changed

+227
-83
lines changed

docs/reference/docs.asciidoc

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -50,3 +50,5 @@ include::docs/termvectors.asciidoc[]
5050
include::docs/multi-termvectors.asciidoc[]
5151

5252
include::docs/refresh.asciidoc[]
53+
54+
include::docs/concurrency-control.asciidoc[]

docs/reference/docs/bulk.asciidoc

Lines changed: 11 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -197,6 +197,17 @@ size for your particular workload.
197197
If using the HTTP API, make sure that the client does not send HTTP
198198
chunks, as this will slow things down.
199199

200+
[float]
201+
[[bulk-optimistic-concurrency-control]]
202+
=== Optimistic Concurrency Control
203+
204+
Each `index` and `delete` action within a bulk API call may include the
205+
`if_seq_no` and `if_primary_term` parameters in their respective action
206+
and meta data lines. The `if_seq_no` and `if_primary_term` parameters control
207+
how operations are executed, based on the last modification to existing
208+
documents. See <<optimistic-concurrency-control>> for more details.
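
As a rough illustration of where these parameters go, the following Python sketch builds a newline-delimited bulk body in which one `index` action and one `delete` action carry `if_seq_no` and `if_primary_term` on their metadata lines. The helper name `bulk_body`, the second document id `1568`, and its sequence number are made up for the example; depending on the Elasticsearch version, the metadata may also need a `_type` entry.

```python
import json


def bulk_body(actions):
    """Serialize (metadata, source) pairs into a newline-delimited bulk body.

    `source` is None for actions that have no source line, such as `delete`.
    """
    lines = []
    for meta, source in actions:
        lines.append(json.dumps(meta))
        if source is not None:
            lines.append(json.dumps(source))
    # The bulk format expects every line, including the last, to end in a newline.
    return "\n".join(lines) + "\n"


actions = [
    (
        # Only index if the last change to the document had these identifiers.
        {"index": {"_index": "products", "_id": "1567",
                   "if_seq_no": 362, "if_primary_term": 2}},
        {"product": "r2d2", "details": "A resourceful astromech droid"},
    ),
    (
        # Hypothetical second document, deleted under the same kind of condition.
        {"delete": {"_index": "products", "_id": "1568",
                    "if_seq_no": 363, "if_primary_term": 2}},
        None,
    ),
]

body = bulk_body(actions)
print(body, end="")
```

Each action is checked independently, so one action can fail with a conflict while the others in the same request succeed.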
209+
210+
200211
[float]
201212
[[bulk-versioning]]
202213
=== Versioning

docs/reference/docs/concurrency-control.asciidoc

Lines changed: 114 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,114 @@
1+
[[optimistic-concurrency-control]]
2+
== Optimistic concurrency control
3+
4+
Elasticsearch is distributed. When documents are created, updated, or deleted,
5+
the new version of the document has to be replicated to other nodes in the cluster.
6+
Elasticsearch is also asynchronous and concurrent, meaning that these replication
7+
requests are sent in parallel, and may arrive at their destination out of sequence.
8+
Elasticsearch needs a way of ensuring that an older version of a document never
9+
overwrites a newer version.
10+
11+
12+
To ensure an older version of a document doesn't overwrite a newer version, every
13+
operation performed on a document is assigned a sequence number by the primary
14+
shard that coordinates that change. The sequence number is increased with each
15+
operation and thus newer operations are guaranteed to have a higher sequence
16+
number than older operations. Elasticsearch can then use the sequence number of
17+
operations to make sure a newer document version is never
18+
overridden by a change that has a smaller sequence number assigned to it.
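
The idea can be sketched with a small in-memory model. This is not how Elasticsearch is implemented; `Operation` and `ReplicaCopy` are hypothetical names, and the sketch only shows how comparing sequence numbers lets a shard copy ignore an operation that arrives after a newer one.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Operation:
    doc_id: str
    seq_no: int  # assigned by the primary, higher for newer operations
    source: Optional[dict]  # None models a delete


class ReplicaCopy:
    """Toy stand-in for a shard copy; not Elasticsearch's real data structures."""

    def __init__(self) -> None:
        self.docs: Dict[str, Operation] = {}

    def apply(self, op: Operation) -> bool:
        """Apply op unless a newer operation on the same document was already applied."""
        current = self.docs.get(op.doc_id)
        if current is not None and current.seq_no >= op.seq_no:
            return False  # stale: a change with a higher sequence number is already stored
        self.docs[op.doc_id] = op
        return True


replica = ReplicaCopy()
newer = Operation("1567", seq_no=363, source={"product": "r2d2", "tags": ["droid"]})
older = Operation("1567", seq_no=362, source={"product": "r2d2"})

# The replication requests arrive out of order: the newer one first.
applied_newer = replica.apply(newer)
applied_older = replica.apply(older)
print(applied_newer, applied_older, replica.docs["1567"].seq_no)
```

In this model the late-arriving operation with sequence number 362 is discarded, so the copy keeps the state written by operation 363.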
19+
20+
For example, the following indexing command will create a document and assign it
21+
an initial sequence number and primary term:
22+
23+
[source,js]
24+
--------------------------------------------------
25+
PUT products/_doc/1567
26+
{
27+
"product" : "r2d2",
28+
"details" : "A resourceful astromech droid"
29+
}
30+
--------------------------------------------------
31+
// CONSOLE
32+
33+
You can see the assigned sequence number and primary term in the
34+
`_seq_no` and `_primary_term` fields of the response:
35+
36+
[source,js]
37+
--------------------------------------------------
38+
{
39+
"_shards" : {
40+
"total" : 2,
41+
"failed" : 0,
42+
"successful" : 1
43+
},
44+
"_index" : "products",
45+
"_type" : "_doc",
46+
"_id" : "1567",
47+
"_version" : 1,
48+
"_seq_no" : 362,
49+
"_primary_term" : 2,
50+
"result" : "created"
51+
}
52+
--------------------------------------------------
53+
// TESTRESPONSE[s/"_seq_no" : \d+/"_seq_no" : $body._seq_no/ s/"_primary_term" : 2/"_primary_term" : $body._primary_term/]
54+
55+
56+
Elasticsearch keeps track of the sequence number and primary term of the last
57+
operation to have changed each of the documents it stores. The sequence number
58+
and primary term are returned in the `_seq_no` and `_primary_term` fields in
59+
the response of the <<docs-get,GET API>>:
60+
61+
[source,js]
62+
--------------------------------------------------
63+
GET products/_doc/1567
64+
--------------------------------------------------
65+
// CONSOLE
66+
// TEST[continued]
67+
68+
returns:
69+
70+
[source,js]
71+
--------------------------------------------------
72+
{
73+
"_index" : "products",
74+
"_type" : "_doc",
75+
"_id" : "1567",
76+
"_version" : 1,
77+
"_seq_no" : 362,
78+
"_primary_term" : 2,
79+
"found": true,
80+
"_source" : {
81+
"product" : "r2d2",
82+
"details" : "A resourceful astromech droid"
83+
}
84+
}
85+
--------------------------------------------------
86+
// TESTRESPONSE[s/"_seq_no" : \d+/"_seq_no" : $body._seq_no/ s/"_primary_term" : 2/"_primary_term" : $body._primary_term/]
87+
88+
89+
Note: The <<search-search,Search API>> can return the `_seq_no` and `_primary_term`
90+
for each search hit by requesting the `_seq_no` and `_primary_term` <<search-request-docvalue-fields,Doc Value Fields>>.
91+
92+
The sequence number and the primary term uniquely identify a change. By noting down
93+
the sequence number and primary term returned, you can make sure to only change the
94+
document if no other change was made to it since you retrieved it. This
95+
is done by setting the `if_seq_no` and `if_primary_term` parameters of either the
96+
<<docs-index_,Index API>> or the <<docs-delete,Delete API>>.
97+
98+
For example, the following indexing call will make sure to add a tag to the
99+
document without losing any potential change to the description or an addition
100+
of another tag by another API:
101+
102+
[source,js]
103+
--------------------------------------------------
104+
PUT products/_doc/1567?if_seq_no=362&if_primary_term=2
105+
{
106+
"product" : "r2d2",
107+
"details" : "A resourceful astromech droid",
108+
"tags": ["droid"]
109+
}
110+
--------------------------------------------------
111+
// CONSOLE
112+
// TEST[continued]
113+
// TEST[catch: conflict]
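
The read-modify-write pattern behind this request can be sketched in Python against a toy in-memory store. Everything here is an assumption made for illustration: `ToyIndex`, `VersionConflict`, and `add_tag` are hypothetical names, sequence numbers start at 0, and the store only mimics the compare-then-write check described above. It is not a client for a real cluster.

```python
from dataclasses import dataclass
from typing import Dict


class VersionConflict(Exception):
    """Hypothetical stand-in for the 409 conflict response."""


@dataclass
class StoredDoc:
    source: dict
    seq_no: int
    primary_term: int


class ToyIndex:
    """In-memory model of conditional indexing; not a real Elasticsearch client."""

    def __init__(self, primary_term: int = 1) -> None:
        self.primary_term = primary_term
        self.next_seq_no = 0
        self.docs: Dict[str, StoredDoc] = {}

    def get(self, doc_id: str) -> StoredDoc:
        return self.docs[doc_id]

    def index(self, doc_id, source, if_seq_no=None, if_primary_term=None) -> StoredDoc:
        current = self.docs.get(doc_id)
        conditional = if_seq_no is not None or if_primary_term is not None
        if conditional and (
            current is None
            or (current.seq_no, current.primary_term) != (if_seq_no, if_primary_term)
        ):
            raise VersionConflict(f"document {doc_id} changed since it was read")
        stored = StoredDoc(dict(source), self.next_seq_no, self.primary_term)
        self.next_seq_no += 1
        self.docs[doc_id] = stored
        return stored


def add_tag(index: ToyIndex, doc_id: str, tag: str, max_attempts: int = 5) -> StoredDoc:
    """Re-read and retry until the conditional write succeeds."""
    for _ in range(max_attempts):
        current = index.get(doc_id)
        updated = dict(current.source)
        updated["tags"] = sorted(set(updated.get("tags", [])) | {tag})
        try:
            return index.index(doc_id, updated,
                               if_seq_no=current.seq_no,
                               if_primary_term=current.primary_term)
        except VersionConflict:
            continue  # someone else changed the document; read it again
    raise VersionConflict(f"gave up on {doc_id} after {max_attempts} attempts")


index = ToyIndex()
first = index.index("1567", {"product": "r2d2",
                             "details": "A resourceful astromech droid"})
# Another writer changes the description and adds its own tag.
index.index("1567", {"product": "r2d2", "details": "An astromech droid",
                     "tags": ["robot"]})

try:
    # A write based on the stale read is rejected instead of losing the other change.
    index.index("1567", {"product": "r2d2",
                         "details": "A resourceful astromech droid",
                         "tags": ["droid"]},
                if_seq_no=first.seq_no, if_primary_term=first.primary_term)
    stale_write_rejected = False
except VersionConflict:
    stale_write_rejected = True

final = add_tag(index, "1567", "droid")
print(stale_write_rejected, final.source)
```

The stale write fails, while `add_tag` re-reads the document first and therefore keeps both the other writer's description and its tag.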
114+

docs/reference/docs/delete.asciidoc

Lines changed: 10 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -35,6 +35,16 @@ The result of the above delete operation is:
3535
// TESTRESPONSE[s/"_primary_term" : 1/"_primary_term" : $body._primary_term/]
3636
// TESTRESPONSE[s/"_seq_no" : 5/"_seq_no" : $body._seq_no/]
3737

38+
[float]
39+
[[optimistic-concurrency-control-delete]]
40+
=== Optimistic concurrency control
41+
42+
Delete operations can be made conditional and only be performed if the last
43+
modification to the document was assigned the sequence number and primary
44+
term specified by the `if_seq_no` and `if_primary_term` parameters. If a
45+
mismatch is detected, the operation will result in a `VersionConflictException`
46+
and a status code of 409. See <<optimistic-concurrency-control>> for more details.
47+
3848
[float]
3949
[[delete-versioning]]
4050
=== Versioning

docs/reference/docs/index_.asciidoc

Lines changed: 90 additions & 83 deletions
Original file line numberDiff line numberDiff line change
@@ -79,89 +79,6 @@ Automatic index creation can include a pattern based white/black list,
7979
for example, set `action.auto_create_index` to `+aaa*,-bbb*,+ccc*,-*` (+
8080
meaning allowed, and - meaning disallowed).
8181

82-
[float]
83-
[[index-versioning]]
84-
=== Versioning
85-
86-
Each indexed document is given a version number. The associated
87-
`version` number is returned as part of the response to the index API
88-
request. The index API optionally allows for
89-
http://en.wikipedia.org/wiki/Optimistic_concurrency_control[optimistic
90-
concurrency control] when the `version` parameter is specified. This
91-
will control the version of the document the operation is intended to be
92-
executed against. A good example of a use case for versioning is
93-
performing a transactional read-then-update. Specifying a `version` from
94-
the document initially read ensures no changes have happened in the
95-
meantime. For example:
96-
97-
[source,js]
98-
--------------------------------------------------
99-
PUT twitter/_doc/1?version=2
100-
{
101-
"message" : "elasticsearch now has versioning support, double cool!"
102-
}
103-
--------------------------------------------------
104-
// CONSOLE
105-
// TEST[continued]
106-
// TEST[catch: conflict]
107-
108-
*NOTE:* versioning is completely real time, and is not affected by the
109-
near real time aspects of search operations. If no version is provided,
110-
then the operation is executed without any version checks.
111-
112-
By default, internal versioning is used that starts at 1 and increments
113-
with each update, deletes included. Optionally, the version number can be
114-
supplemented with an external value (for example, if maintained in a
115-
database). To enable this functionality, `version_type` should be set to
116-
`external`. The value provided must be a numeric, long value greater or equal to 0,
117-
and less than around 9.2e+18. When using the external version type, instead
118-
of checking for a matching version number, the system checks to see if
119-
the version number passed to the index request is greater than the
120-
version of the currently stored document. If true, the document will be
121-
indexed and the new version number used. If the value provided is less
122-
than or equal to the stored document's version number, a version
123-
conflict will occur and the index operation will fail.
124-
125-
WARNING: External versioning supports the value 0 as a valid version number.
126-
This allows the version to be in sync with an external versioning system
127-
where version numbers start from zero instead of one. It has the side effect
128-
that documents with version number equal to zero cannot neither be updated
129-
using the <<docs-update-by-query,Update-By-Query API>> nor be deleted
130-
using the <<docs-delete-by-query,Delete By Query API>> as long as their
131-
version number is equal to zero.
132-
133-
A nice side effect is that there is no need to maintain strict ordering
134-
of async indexing operations executed as a result of changes to a source
135-
database, as long as version numbers from the source database are used.
136-
Even the simple case of updating the Elasticsearch index using data from
137-
a database is simplified if external versioning is used, as only the
138-
latest version will be used if the index operations are out of order for
139-
whatever reason.
140-
141-
[float]
142-
==== Version types
143-
144-
Next to the `internal` & `external` version types explained above, Elasticsearch
145-
also supports other types for specific use cases. Here is an overview of
146-
the different version types and their semantics.
147-
148-
`internal`:: only index the document if the given version is identical to the version
149-
of the stored document.
150-
151-
`external` or `external_gt`:: only index the document if the given version is strictly higher
152-
than the version of the stored document *or* if there is no existing document. The given
153-
version will be used as the new version and will be stored with the new document. The supplied
154-
version must be a non-negative long number.
155-
156-
`external_gte`:: only index the document if the given version is *equal* or higher
157-
than the version of the stored document. If there is no existing document
158-
the operation will succeed as well. The given version will be used as the new version
159-
and will be stored with the new document. The supplied version must be a non-negative long number.
160-
161-
*NOTE*: The `external_gte` version type is meant for special use cases and
162-
should be used with care. If used incorrectly, it can result in loss of data.
163-
There is another option, `force`, which is deprecated because it can cause
164-
primary and replica shards to diverge.
16582

16683
[float]
16784
[[operation-type]]
@@ -238,6 +155,16 @@ The result of the above index operation is:
238155
--------------------------------------------------
239156
// TESTRESPONSE[s/W0tpsmIBdwcYyG50zbta/$body._id/ s/"successful" : 2/"successful" : 1/]
240157

158+
[float]
159+
[[optimistic-concurrency-control-index]]
160+
=== Optimistic concurrency control
161+
162+
Index operations can be made conditional and only be performed if the last
163+
modification to the document was assigned the sequence number and primary
164+
term specified by the `if_seq_no` and `if_primary_term` parameters. If a
165+
mismatch is detected, the operation will result in a `VersionConflictException`
166+
and a status code of 409. See <<optimistic-concurrency-control>> for more details.
167+
241168
[float]
242169
[[index-routing]]
243170
=== Routing
@@ -380,3 +307,83 @@ PUT twitter/_doc/1?timeout=5m
380307
}
381308
--------------------------------------------------
382309
// CONSOLE
310+
311+
[float]
312+
[[index-versioning]]
313+
=== Versioning
314+
315+
Each indexed document is given a version number. By default,
316+
internal versioning is used that starts at 1 and increments
317+
with each update, deletes included. Optionally, the version number can be
318+
set to an external value (for example, if maintained in a
319+
database). To enable this functionality, `version_type` should be set to
320+
`external`. The value provided must be a numeric, long value greater than or equal to 0,
321+
and less than around 9.2e+18.
322+
323+
When using the external version type, the system checks to see if
324+
the version number passed to the index request is greater than the
325+
version of the currently stored document. If true, the document will be
326+
indexed and the new version number used. If the value provided is less
327+
than or equal to the stored document's version number, a version
328+
conflict will occur and the index operation will fail. For example:
329+
330+
[source,js]
331+
--------------------------------------------------
332+
PUT twitter/_doc/1?version=2&version_type=external
333+
{
334+
"message" : "elasticsearch now has versioning support, double cool!"
335+
}
336+
--------------------------------------------------
337+
// CONSOLE
338+
// TEST[continued]
339+
340+
*NOTE:* versioning is completely real time, and is not affected by the
341+
near real time aspects of search operations. If no version is provided,
342+
then the operation is executed without any version checks.
343+
344+
The above will succeed since the supplied version of 2 is higher than
345+
the current document version of 1. If the document was already updated
346+
and its version was set to 2 or higher, the indexing command will fail
347+
and result in a conflict (409 HTTP status code).
348+
349+
WARNING: External versioning supports the value 0 as a valid version number.
350+
This allows the version to be in sync with an external versioning system
351+
where version numbers start from zero instead of one. It has the side effect
352+
that documents with version number equal to zero can neither be updated
353+
using the <<docs-update-by-query,Update-By-Query API>> nor be deleted
354+
using the <<docs-delete-by-query,Delete By Query API>> as long as their
355+
version number is equal to zero.
356+
357+
A nice side effect is that there is no need to maintain strict ordering
358+
of async indexing operations executed as a result of changes to a source
359+
database, as long as version numbers from the source database are used.
360+
Even the simple case of updating the Elasticsearch index using data from
361+
a database is simplified if external versioning is used, as only the
362+
latest version will be used if the index operations are out of order for
363+
whatever reason.
364+
365+
[float]
366+
==== Version types
367+
368+
In addition to the `external` version type explained above, Elasticsearch
369+
also supports other types for specific use cases. Here is an overview of
370+
the different version types and their semantics.
371+
372+
`internal`:: only index the document if the given version is identical to the version
373+
of the stored document.
374+
375+
`external` or `external_gt`:: only index the document if the given version is strictly higher
376+
than the version of the stored document *or* if there is no existing document. The given
377+
version will be used as the new version and will be stored with the new document. The supplied
378+
version must be a non-negative long number.
379+
380+
`external_gte`:: only index the document if the given version is *equal* or higher
381+
than the version of the stored document. If there is no existing document
382+
the operation will succeed as well. The given version will be used as the new version
383+
and will be stored with the new document. The supplied version must be a non-negative long number.
384+
385+
*NOTE*: The `external_gte` version type is meant for special use cases and
386+
should be used with care. If used incorrectly, it can result in loss of data.
387+
There is another option, `force`, which is deprecated because it can cause
388+
primary and replica shards to diverge.
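
The comparison rules listed above can be summarized in a short Python sketch. The function `version_check_passes` is a hypothetical helper that follows the descriptions in this section literally; it is not Elasticsearch code, and it leaves out the deprecated `force` type.

```python
from typing import Optional


def version_check_passes(version_type: str, given: int, stored: Optional[int]) -> bool:
    """Return True if a write with `given` version would be accepted.

    `stored` is the version of the existing document, or None if there is none.
    """
    if given < 0:
        raise ValueError("the supplied version must be a non-negative long number")
    if version_type == "internal":
        # Only accepted when the given version is identical to the stored one.
        return stored is not None and given == stored
    if version_type in ("external", "external_gt"):
        # Strictly higher than the stored version, or no existing document.
        return stored is None or given > stored
    if version_type == "external_gte":
        # Equal or higher than the stored version, or no existing document.
        return stored is None or given >= stored
    raise ValueError(f"unsupported version type: {version_type}")


checks = {
    "external, 2 over stored 1": version_check_passes("external", 2, 1),
    "external, 2 over stored 2": version_check_passes("external", 2, 2),
    "external_gte, 2 over stored 2": version_check_passes("external_gte", 2, 2),
    "external, 5 with no document": version_check_passes("external", 5, None),
    "internal, 1 over stored 1": version_check_passes("internal", 1, 1),
}
for label, accepted in checks.items():
    print(label, "->", "accepted" if accepted else "conflict")
```

The second case is the one that results in a conflict in the example earlier in this section: once the stored version is already 2, an `external` write with version 2 is rejected, while `external_gte` would accept it.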
389+
