
Commit 09e9ee2

[DOCS] Fix broken images (#126648) (#126741)

1 parent: c9832dd

35 files changed: +53, -168 lines

docs/images/create-index-template.png
-86 KB (binary file not shown)

docs/images/hybrid-architecture.png
-145 KB (binary file not shown)

-107 KB (binary file not shown)

-729 KB (binary file not shown)

-181 KB (binary file not shown)

docs/images/token-graph-dns-invalid-ex.svg
Lines changed: 0 additions & 72 deletions (this file was deleted)

docs/images/token-graph-dns-synonym-ex.svg
Lines changed: 0 additions & 72 deletions (this file was deleted)

-157 KB (binary file not shown)

docs/reference/aggregations/_snippets/search-aggregations-metrics-cardinality-aggregation-explanation.md
Lines changed: 1 addition & 1 deletion

@@ -6,7 +6,7 @@ For a precision threshold of `c`, the implementation that we are using requires
 
 The following chart shows how the error varies before and after the threshold:
 
-![cardinality error](/images/cardinality_error.png "")
+![cardinality error](/reference/query-languages/images/cardinality_error.png "")
 
 For all 3 thresholds, counts have been accurate up to the configured threshold. Although not guaranteed,
 this is likely to be the case. Accuracy in practice depends on the dataset in question. In general,

docs/reference/aggregations/_snippets/search-aggregations-metrics-percentile-aggregation-approximate.md
Lines changed: 1 addition & 1 deletion

@@ -12,6 +12,6 @@ When using this metric, there are a few guidelines to keep in mind:
 
 The following chart shows the relative error on a uniform distribution depending on the number of collected values and the requested percentile:
 
-![percentiles error](/images/percentiles_error.png "")
+![percentiles error](/reference/query-languages/images/percentiles_error.png "")
 
 It shows how precision is better for extreme percentiles. The reason why error diminishes for large number of values is that the law of large numbers makes the distribution of values more and more uniform and the t-digest tree can do a better job at summarizing it. It would not be the case on more skewed distributions.

docs/reference/aggregations/search-aggregations-metrics-cardinality-aggregation.md
Lines changed: 16 additions & 2 deletions

@@ -65,9 +65,23 @@ Computing exact counts requires loading values into a hash set and returning its
 
 This `cardinality` aggregation is based on the [HyperLogLog++](https://static.googleusercontent.com/media/research.google.com/fr//pubs/archive/40671.pdf) algorithm, which counts based on the hashes of the values with some interesting properties:
 
-:::{include} _snippets/search-aggregations-metrics-cardinality-aggregation-explanation.md
-:::
+* configurable precision, which decides on how to trade memory for accuracy,
+* excellent accuracy on low-cardinality sets,
+* fixed memory usage: no matter if there are tens or billions of unique values, memory usage only depends on the configured precision.
 
+For a precision threshold of `c`, the implementation that we are using requires about `c * 8` bytes.
+
+The following chart shows how the error varies before and after the threshold:
+
+![cardinality error](/reference/aggregations/images/cardinality_error.png "")
+
+For all 3 thresholds, counts have been accurate up to the configured threshold. Although not guaranteed,
+this is likely to be the case. Accuracy in practice depends on the dataset in question. In general,
+most datasets show consistently good accuracy. Also note that even with a threshold as low as 100,
+the error remains very low (1-6% as seen in the above graph) even when counting millions of items.
+
+The HyperLogLog++ algorithm depends on the leading zeros of hashed values, the exact distributions of
+hashes in a dataset can affect the accuracy of the cardinality.
 
 ## Pre-computed hashes [_pre_computed_hashes]
 
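For context on the text inlined above: HyperLogLog++ routes each hash into one of a fixed number of registers and tracks the longest run of leading zeros seen there, which is why memory stays fixed and why hash distribution affects accuracy. A minimal sketch of that idea in Python follows; it is illustrative only, not Elasticsearch's implementation (which uses MurmurHash3 and additional bias corrections), and the hash choice and register count here are assumptions.

```python
import hashlib

def _hash64(value: str) -> int:
    # Stable 64-bit hash (a stand-in for the MurmurHash3 hashing that
    # Elasticsearch applies; any well-mixed hash works for the sketch).
    return int.from_bytes(
        hashlib.blake2b(value.encode(), digest_size=8).digest(), "big"
    )

def estimate_cardinality(values, p: int = 14) -> float:
    """Toy HyperLogLog estimate using 2**p fixed-size registers."""
    m = 1 << p
    registers = [0] * m
    for v in values:
        h = _hash64(str(v))
        idx = h >> (64 - p)                       # first p bits pick a register
        rest = h & ((1 << (64 - p)) - 1)          # remaining bits feed the rank
        rank = (64 - p) - rest.bit_length() + 1   # leading zeros + 1
        registers[idx] = max(registers[idx], rank)
    alpha = 0.7213 / (1 + 1.079 / m)              # standard HLL bias constant
    return alpha * m * m / sum(2.0 ** -r for r in registers)

# Memory is fixed no matter how many values stream through, matching the
# docs' "about c * 8 bytes" for a precision threshold c.
print(round(estimate_cardinality(f"user-{i}" for i in range(100_000))))
```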
docs/reference/aggregations/search-aggregations-metrics-percentile-aggregation.md
Lines changed: 17 additions & 2 deletions

@@ -175,8 +175,23 @@ GET latency/_search
 
 ## Percentiles are (usually) approximate [search-aggregations-metrics-percentile-aggregation-approximation]
 
-:::{include} /reference/aggregations/_snippets/search-aggregations-metrics-percentile-aggregation-approximate.md
-:::
+There are many different algorithms to calculate percentiles. The naive implementation simply stores all the values in a sorted array. To find the 50th percentile, you simply find the value that is at `my_array[count(my_array) * 0.5]`.
+
+Clearly, the naive implementation does not scale — the sorted array grows linearly with the number of values in your dataset. To calculate percentiles across potentially billions of values in an Elasticsearch cluster, *approximate* percentiles are calculated.
+
+The algorithm used by the `percentile` metric is called TDigest (introduced by Ted Dunning in [Computing Accurate Quantiles using T-Digests](https://github.com/tdunning/t-digest/blob/master/docs/t-digest-paper/histo.pdf)).
+
+When using this metric, there are a few guidelines to keep in mind:
+
+* Accuracy is proportional to `q(1-q)`. This means that extreme percentiles (e.g. 99%) are more accurate than less extreme percentiles, such as the median
+* For small sets of values, percentiles are highly accurate (and potentially 100% accurate if the data is small enough).
+* As the quantity of values in a bucket grows, the algorithm begins to approximate the percentiles. It is effectively trading accuracy for memory savings. The exact level of inaccuracy is difficult to generalize, since it depends on your data distribution and volume of data being aggregated
+
+The following chart shows the relative error on a uniform distribution depending on the number of collected values and the requested percentile:
+
+![percentiles error](images/percentiles_error.png "")
+
+It shows how precision is better for extreme percentiles. The reason why error diminishes for large number of values is that the law of large numbers makes the distribution of values more and more uniform and the t-digest tree can do a better job at summarizing it. It would not be the case on more skewed distributions.
 
 ::::{warning}
 Percentile aggregations are also [non-deterministic](https://en.wikipedia.org/wiki/Nondeterministic_algorithm). This means you can get slightly different results using the same data.
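A quick way to see the scaling point made in the inlined text: the naive approach keeps every value in memory, while an approximate digest keeps only a fixed-size summary. The sketch below (plain Python, not the TDigest library) shows the naive side and the `q(1-q)` intuition; the data is made up for illustration.

```python
import random

def naive_percentile(values, q: float) -> float:
    """Naive percentile from the docs: sort everything, index into it.
    Memory grows linearly with the number of values collected."""
    s = sorted(values)                          # O(n) memory
    return s[min(int(len(s) * q), len(s) - 1)]  # my_array[count * q]

data = [random.gauss(100, 15) for _ in range(1_000_000)]
# q(1-q) is largest at the median and smallest at the tails, which is
# why TDigest is most accurate for extreme percentiles:
for q in (0.50, 0.95, 0.99):
    print(f"p{int(q * 100)} = {naive_percentile(data, q):.2f}, "
          f"q(1-q) = {q * (1 - q):.4f}")
```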

docs/reference/query-languages/eql/eql-syntax.md
Lines changed: 3 additions & 3 deletions

@@ -788,7 +788,7 @@ You cannot use EQL to search the values of a [`nested`](/reference/elasticsearch
 * If two pending sequences are in the same state at the same time, the most recent sequence overwrites the older one.
 * If the query includes [`by` fields](#eql-by-keyword), the query uses a separate state machine for each unique `by` field value.
 
-:::::{dropdown} **Example**
+:::::{dropdown} Example
 A data set contains the following `process` events in ascending chronological order:
 
 ```js
@@ -831,13 +831,13 @@ The query’s event items correspond to the following states:
 * State B: `[process where process.name == "bash"]`
 * Complete: `[process where process.name == "cat"]`
 
-:::{image} /images/sequence-state-machine.svg
+:::{image} ../images/sequence-state-machine.svg
 :alt: sequence state machine
 :::
 
 To find matching sequences, the query uses separate state machines for each unique `user.name` value. Based on the data set, you can expect two state machines: one for the `root` user and one for `elkbee`.
 
-:::{image} /images/separate-state-machines.svg
+:::{image} ../images/separate-state-machines.svg
 :alt: separate state machines
 :::
 
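The context lines above describe EQL's per-key state machines in prose. A compact sketch of that dispatch, one machine per `by` value advancing through the docs' attrib -> bash -> cat stages, follows in Python. The event records and the restart-on-a-new-stage-A rule are simplifications for illustration, not EQL's actual engine.

```python
# Hypothetical events mirroring the docs' example: process.name moves
# through "attrib" -> "bash" -> "cat", keyed by user.name.
events = [
    {"user": "root",   "process": "attrib"},
    {"user": "elkbee", "process": "attrib"},
    {"user": "root",   "process": "bash"},
    {"user": "root",   "process": "cat"},    # completes root's sequence
    {"user": "elkbee", "process": "bash"},   # elkbee still pending
]

STAGES = ["attrib", "bash", "cat"]  # state A, state B, complete

def match_sequences(events):
    machines = {}   # one state machine per unique `by` value
    complete = []
    for event in events:
        pending = machines.setdefault(event["user"], [])
        stage = len(pending)  # the stage this machine is waiting for
        if stage < len(STAGES) and event["process"] == STAGES[stage]:
            pending.append(event)
            if len(pending) == len(STAGES):
                complete.append(machines.pop(event["user"]))
        elif event["process"] == STAGES[0]:
            # The most recent sequence overwrites an older pending one
            # that sits in the same state.
            machines[event["user"]] = [event]
    return complete

print(match_sequences(events))  # -> [root's attrib/bash/cat events]
```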


docs/reference/query-languages/query-dsl/query-dsl-function-score-query.md
Lines changed: 15 additions & 15 deletions

@@ -360,42 +360,42 @@ The `DECAY_FUNCTION` determines the shape of the decay:
 `gauss`
 : Normal decay, computed as:
 
-![Gaussian](/images/Gaussian.png "")
+![Gaussian](../images/Gaussian.png "")
 
-where ![sigma](/images/sigma.png "") is computed to assure that the score takes the value `decay` at distance `scale` from `origin`+-`offset`
+where ![sigma](../images/sigma.png "") is computed to assure that the score takes the value `decay` at distance `scale` from `origin`+-`offset`
 
-![sigma calc](/images/sigma_calc.png "")
+![sigma calc](../images/sigma_calc.png "")
 
 See [Normal decay, keyword `gauss`](#gauss-decay) for graphs demonstrating the curve generated by the `gauss` function.
 
 
 `exp`
 : Exponential decay, computed as:
 
-![Exponential](/images/Exponential.png "")
+![Exponential](../images/Exponential.png "")
 
-where again the parameter ![lambda](/images/lambda.png "") is computed to assure that the score takes the value `decay` at distance `scale` from `origin`+-`offset`
+where again the parameter ![lambda](../images/lambda.png "") is computed to assure that the score takes the value `decay` at distance `scale` from `origin`+-`offset`
 
-![lambda calc](/images/lambda_calc.png "")
+![lambda calc](../images/lambda_calc.png "")
 
 See [Exponential decay, keyword `exp`](#exp-decay) for graphs demonstrating the curve generated by the `exp` function.
 
 
 `linear`
 : Linear decay, computed as:
 
-![Linear](/images/Linear.png "").
+![Linear](../images/Linear.png "").
 
 where again the parameter `s` is computed to assure that the score takes the value `decay` at distance `scale` from `origin`+-`offset`
 
-![s calc](/images/s_calc.png "")
+![s calc](../images/s_calc.png "")
 
 In contrast to the normal and exponential decay, this function actually sets the score to 0 if the field value exceeds twice the user given scale value.
 
 
 For single functions the three decay functions together with their parameters can be visualized like this (the field in this example called "age"):
 
-![decay 2d](/images/decay_2d.png "")
+![decay 2d](../images/decay_2d.png "")
 
 
 ### Multi-values fields [_multi_values_fields]
@@ -510,10 +510,10 @@ Next, we show how the computed score looks like for each of the three possible d
 
 When choosing `gauss` as the decay function in the above example, the contour and surface plot of the multiplier looks like this:
 
-:::{image} /images/normal-decay-keyword-gauss-1.png
+:::{image} ../images/normal-decay-keyword-gauss-1.png
 :::
 
-:::{image} /images/normal-decay-keyword-gauss-2.png
+:::{image} ../images/normal-decay-keyword-gauss-2.png
 :::
 
 Suppose your original search results matches three hotels :
@@ -529,20 +529,20 @@ Suppose your original search results matches three hotels :
 
 When choosing `exp` as the decay function in the above example, the contour and surface plot of the multiplier looks like this:
 
-:::{image} /images/exponential-decay-keyword-exp-1.png
+:::{image} ../images/exponential-decay-keyword-exp-1.png
 :::
 
-:::{image} /images/exponential-decay-keyword-exp-2.png
+:::{image} ../images/exponential-decay-keyword-exp-2.png
 :::
 
 ### Linear decay, keyword `linear` [linear-decay]
 
 When choosing `linear` as the decay function in the above example, the contour and surface plot of the multiplier looks like this:
 
-:::{image} /images/linear-decay-keyword-linear-1.png
+:::{image} ../images/linear-decay-keyword-linear-1.png
 :::
 
-:::{image} /images/linear-decay-keyword-linear-2.png
+:::{image} ../images/linear-decay-keyword-linear-2.png
 :::
 
 ## Supported fields for decay functions [_supported_fields_for_decay_functions]
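For reference while the formula images above are being relocated: the three decay curves can be reproduced from the relations the images encode, with sigma, lambda, and `s` each derived so that the multiplier equals `decay` at distance `scale` from `origin`+-`offset`. The Python sketch below is a reading aid under that assumption, not Elasticsearch's scoring code.

```python
import math

def gauss(value, origin, scale, offset=0.0, decay=0.5):
    # sigma^2 = -scale^2 / (2 * ln(decay)), per the sigma_calc image
    sigma2 = -scale**2 / (2.0 * math.log(decay))
    dist = max(0.0, abs(value - origin) - offset)
    return math.exp(-dist**2 / (2.0 * sigma2))

def exp_decay(value, origin, scale, offset=0.0, decay=0.5):
    lam = math.log(decay) / scale    # per the lambda_calc image
    dist = max(0.0, abs(value - origin) - offset)
    return math.exp(lam * dist)

def linear(value, origin, scale, offset=0.0, decay=0.5):
    s = scale / (1.0 - decay)        # per the s_calc image
    dist = max(0.0, abs(value - origin) - offset)
    # score reaches 0 at dist == s (twice the scale for decay = 0.5)
    return max(0.0, (s - dist) / s)

# Using the docs' "age" example: at distance `scale` from `origin`,
# each multiplier equals `decay`.
for fn in (gauss, exp_decay, linear):
    print(fn.__name__, fn(40, origin=30, scale=10))  # -> 0.5 each
```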

Comments (0)