Commit f507381

Docs: Clarify constraints on scripted similarities. (#31076)
Scripted similarities provide a lot of flexibility but they still need to obey some rules to not confuse Lucene.
1 parent c7c0acc commit f507381

1 file changed, 12 additions and 2 deletions

docs/reference/index-modules/similarity.asciidoc

Lines changed: 12 additions & 2 deletions
@@ -341,7 +341,18 @@ Which yields:
 // TESTRESPONSE[s/"took": 12/"took" : $body.took/]
 // TESTRESPONSE[s/OzrdjxNtQGaqs4DmioFw9A/$body.hits.hits.0._node/]
 
-You might have noticed that a significant part of the script depends on
+WARNING: While scripted similarities provide a lot of flexibility, there is
+a set of rules that they need to satisfy. Failing to do so could make
+Elasticsearch silently return wrong top hits or fail with internal errors at
+search time:
+
+- Returned scores must be positive.
+- All other variables remaining equal, scores must not decrease when
+`doc.freq` increases.
+- All other variables remaining equal, scores must not increase when
+`doc.length` increases.
+
+You might have noticed that a significant part of the above script depends on
 statistics that are the same for every document. It is possible to make the
 above slightly more efficient by providing an `weight_script` which will
 compute the document-independent part of the score and will be available
@@ -506,7 +517,6 @@ GET /index/_search?explain=true
 
 ////////////////////
 
-
 Type name: `scripted`
 
 [float]
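
The three rules in the WARNING added by this commit can be checked numerically before a script is deployed. The following is a minimal sketch in Python, not part of the commit or of Elasticsearch. The names `tfidf_score` and `find_violations` are hypothetical, the TF-IDF formula is an assumed stand-in for a scripted similarity, and the index statistics are made-up constants. It evaluates a candidate scoring function over a grid of `doc.freq` and `doc.length` values and reports any point where a rule is broken.

```python
import math
from typing import Callable, List


def tfidf_score(freq: float, length: float) -> float:
    """Assumed TF-IDF style score, standing in for a scripted similarity."""
    # Made-up index statistics; a real script would read these from the index.
    doc_count, doc_freq, boost = 1000.0, 10.0, 1.0
    tf = math.sqrt(freq)                                    # grows with doc.freq
    idf = math.log((doc_count + 1.0) / (doc_freq + 1.0)) + 1.0
    norm = 1.0 / math.sqrt(length)                          # shrinks with doc.length
    return boost * tf * idf * norm


def find_violations(
    score: Callable[[float, float], float],
    max_freq: int = 50,
    max_length: int = 200,
) -> List[str]:
    """Return a description of every grid point that breaks one of the rules."""
    violations = []
    for length in range(1, max_length + 1):
        for freq in range(1, max_freq + 1):
            s = score(freq, length)
            # Rule 1: returned scores must be positive.
            if s <= 0:
                violations.append(f"non-positive score at freq={freq}, length={length}")
            # Rule 2: with length fixed, the score must not decrease as freq grows.
            if freq > 1 and s < score(freq - 1, length):
                violations.append(f"score decreased with freq at freq={freq}, length={length}")
            # Rule 3: with freq fixed, the score must not increase as length grows.
            if length > 1 and s > score(freq, length - 1):
                violations.append(f"score increased with length at freq={freq}, length={length}")
    return violations


if __name__ == "__main__":
    print(len(find_violations(tfidf_score)), "violations for the TF-IDF sketch")
    print(len(find_violations(lambda f, l: 1.0 / f)), "violations for 1/freq")
```

A grid check like this can only find counterexamples; it does not prove that a script is safe for every input, so it complements reasoning about the formula rather than replacing it.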
