[[search-aggregations]]
- == Aggregations
+ = Aggregations

+ [partintro]
+ --
The aggregations framework helps provide aggregated data based on a search query. It is based on simple building blocks
called aggregations, that can be composed in order to build complex summaries of the data.

@@ -11,16 +13,19 @@ query/filters of the search request).
There are many different types of aggregations, each with its own purpose and output. To better understand these types,
it is often easier to break them into two main families:

- _Bucketing_::
+ <<search-aggregations-bucket, _Bucketing_>> ::
A family of aggregations that build buckets, where each bucket is associated with a _key_ and a document
criterion. When the aggregation is executed, all the buckets criteria are evaluated on every document in
the context and when a criterion matches, the document is considered to "fall in" the relevant bucket.
By the end of the aggregation process, we'll end up with a list of buckets - each one with a set of
documents that "belong" to it.

- _Metric_::
+ <<search-aggregations-metrics, _Metric_>> ::
Aggregations that keep track and compute metrics over a set of documents.

+ <<search-aggregations-reducer, _Reducer_>>::
+ Aggregations that aggregate the output of other aggregations and their associated metrics
+
The interesting part comes next. Since each bucket effectively defines a document set (all documents belonging to
the bucket), one can potentially associate aggregations on the bucket level, and those will execute within the context
of that bucket. This is where the real power of aggregations kicks in: *aggregations can be nested!*
@@ -31,7 +36,7 @@ NOTE: Bucketing aggregations can have sub-aggregations (bucketing or metric). Th
another higher-level aggregation).

[float]
- === Structuring Aggregations
+ == Structuring Aggregations

The following snippet captures the basic structure of aggregations:

@@ -62,7 +67,7 @@ bucketing aggregation. For example, if you define a set of aggregations under th
sub-aggregations will be computed for the range buckets that are defined.

[float]
- ==== Values Source
+ === Values Source

Some aggregations work on values extracted from the aggregated documents. Typically, the values will be extracted from
a specific document field which is set using the `field` key for the aggregations. It is also possible to define a
@@ -89,146 +94,12 @@ perform optimizations when dealing with sorted values (for example, with the `mi
sorted, Elasticsearch will skip the iterations over all the values and rely on the first value in the list to be the
minimum value among all other values associated with the same document).

- [float]
- === Metrics Aggregations
-
- The aggregations in this family compute metrics based on values extracted in one way or another from the documents that
- are being aggregated. The values are typically extracted from the fields of the document (using the field data), but
- can also be generated using scripts.
-
- Numeric metrics aggregations are a special type of metrics aggregation which output numeric values. Some aggregations output
- a single numeric metric (e.g. `avg`) and are called `single-value numeric metrics aggregation`, others generate multiple
- metrics (e.g. `stats`) and are called `multi-value numeric metrics aggregation`. The distinction between single-value and
- multi-value numeric metrics aggregations plays a role when these aggregations serve as direct sub-aggregations of some
- bucket aggregations (some bucket aggregations enable you to sort the returned buckets based on the numeric metrics in each bucket).
-
-
- [float]
- === Bucket Aggregations
-
- Bucket aggregations don't calculate metrics over fields like the metrics aggregations do, but instead, they create
- buckets of documents. Each bucket is associated with a criterion (depending on the aggregation type) which determines
- whether or not a document in the current context "falls" into it. In other words, the buckets effectively define document
- sets. In addition to the buckets themselves, the `bucket` aggregations also compute and return the number of documents
- that "fell in" to each bucket.
-
- Bucket aggregations, as opposed to `metrics` aggregations, can hold sub-aggregations. These sub-aggregations will be
- aggregated for the buckets created by their "parent" bucket aggregation.
-
- There are different bucket aggregators, each with a different "bucketing" strategy. Some define a single bucket, some
- define fixed number of multiple buckets, and others dynamically create the buckets during the aggregation process.
-
- [float]
- === Reducer Aggregations
-
- coming[2.0.0]
-
- experimental[]
-
- Reducer aggregations work on the outputs produced from other aggregations rather than from document sets, adding
- information to the output tree. There are many different types of reducer, each computing different information from
- other aggregations, but these types can broken down into two families:
-
- _Parent_::
- A family of reducer aggregations that is provided with the output of its parent aggregation and is able
- to compute new buckets or new aggregations to add to existing buckets.
-
- _Sibling_::
- Reducer aggregations that are provided with the output of a sibling aggregation and are able to compute a
- new aggregation which will be at the same level as the sibling aggregation.
-
- Reducer aggregations can reference the aggregations they need to perform their computation by using the `buckets_paths`
- parameter to indicate the paths to the required metrics. The syntax for defining these paths can be found in the
- <<search-aggregations-bucket-terms-aggregation-order, terms aggregation order>> section.
-
- ?????? SHOULD THE SECTION ABOUT DEFINING AGGREGATION PATHS
- BE IN THIS PAGE AND REFERENCED FROM THE TERMS AGGREGATION DOCUMENTATION ???????
-
- Reducer aggregations cannot have sub-aggregations but depending on the type it can reference another reducer in the `buckets_path`
- allowing reducers to be chained.
-
- NOTE: Because reducer aggregations only add to the output, when chaining reducer aggregations the output of each reducer will be
- included in the final output.
-
- [float]
- === Caching heavy aggregations
-
- Frequently used aggregations (e.g. for display on the home page of a website)
- can be cached for faster responses. These cached results are the same results
- that would be returned by an uncached aggregation -- you will never get stale
- results.
-
- See <<index-modules-shard-query-cache>> for more details.
-
- [float]
- === Returning only aggregation results
-
- There are many occasions when aggregations are required but search hits are not. For these cases the hits can be ignored by
- setting `size=0`. For example:
-
- [source,js]
- --------------------------------------------------
- $ curl -XGET 'http://localhost:9200/twitter/tweet/_search' -d '{
-     "size": 0,
-     "aggregations": {
-         "my_agg": {
-             "terms": {
-                 "field": "text"
-             }
-         }
-     }
- }
- '
- --------------------------------------------------
-
- Setting `size` to `0` avoids executing the fetch phase of the search making the request more efficient.
-
- [float]
- === Metadata
-
- You can associate a piece of metadata with individual aggregations at request time that will be returned in place
- at response time.
-
- Consider this example where we want to associate the color blue with our `terms` aggregation.
-
- [source,js]
- --------------------------------------------------
- {
-     ...
-     aggs": {
-         "titles": {
-             "terms": {
-                 "field": "title"
-             },
-             "meta": {
-                 "color": "blue"
-             },
-         }
-     }
- }
- --------------------------------------------------
-
- Then that piece of metadata will be returned in place for our `titles` terms aggregation
-
- [source,js]
- --------------------------------------------------
- {
-     ...
-     "aggregations": {
-         "titles": {
-             "meta": {
-                 "color" : "blue"
-             },
-             "buckets": [
-             ]
-         }
-     }
- }
- --------------------------------------------------
+ --

include::aggregations/metrics.asciidoc[]

include::aggregations/bucket.asciidoc[]

include::aggregations/reducer.asciidoc[]

+ include::aggregations/misc.asciidoc[]
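
For readers following the relocated sections, here is a minimal sketch of the nesting the intro describes: a bucketing aggregation carrying a metric sub-aggregation, with `size` set to `0` and a `meta` entry attached. It only builds and prints the request body and does not contact a cluster. The helper name `build_aggs_request`, the aggregation names `by_term` and `avg_value`, and the `price` field are hypothetical and not part of the original text.

```python
import json


def build_aggs_request(bucket_field, metric_field):
    """Build a search body with a terms bucket aggregation and a nested avg metric.

    Hypothetical helper for illustration only; the aggregation names are arbitrary.
    """
    return {
        # size 0: only aggregation results are wanted, not search hits
        "size": 0,
        "aggs": {
            "by_term": {
                # bucketing aggregation: one bucket per distinct term
                "terms": {"field": bucket_field},
                # metadata attached to this aggregation at request time
                "meta": {"color": "blue"},
                # sub-aggregation: computed within the context of each bucket
                "aggs": {
                    "avg_value": {"avg": {"field": metric_field}}
                },
            }
        },
    }


body = build_aggs_request("title", "price")
print(json.dumps(body, indent=2))
```

The printed JSON could be sent as the body of a `_search` request, in the same way as the `curl` example in the removed "Returning only aggregation results" section.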