Commit e5291a3

richardwei2008 authored and medcl committed
chapter14_part4: /110_Multi_Field_Search/15_Best_field.asciidoc (#90)
* chapter14_part4: /110_Multi_Field_Search/15_Best_field.asciidoc initial translation
* Revise "如果我们有个网站为用户允许博客内容搜索的功能" to "假设有个网站允许用户搜索博客的内容"
* file name tag
1 parent 96a87a3 commit e5291a3

1 file changed

+19
-33
lines changed

110_Multi_Field_Search/15_Best_field.asciidoc

Lines changed: 19 additions & 33 deletions
@@ -1,7 +1,7 @@
-=== Best Fields
+[[_best_fields]]
+=== 最佳字段
 
-Imagine that we have a website that allows ((("multifield search", "best fields queries")))((("best fields queries")))users to search blog posts, such
-as these two documents:
+假设有个网站允许用户搜索博客的内容,((("multifield search", "best fields queries")))((("best fields queries")))以下面两篇博客内容文档为例:
 
 [source,js]
 --------------------------------------------------
@@ -19,13 +19,9 @@ PUT /my_index/my_type/2
 --------------------------------------------------
 // SENSE: 110_Multi_Field_Search/15_Best_fields.json
 
-The user types in the words ``Brown fox'' and clicks Search. We don't
-know ahead of time if the user's search terms will be found in the `title` or
-the `body` field of the post, but it is likely that the user is searching for
-related words. To our eyes, document 2 appears to be the better match, as it
-contains both words that we are looking for.
+用户输入词组 “Brown fox” 然后点击搜索按钮。事先,我们并不知道用户的搜索项是会在 `title` 还是在 `body` 字段中被找到,但是,用户很有可能是想搜索相关的词组。用肉眼判断,文档 2 的匹配度更高,因为它同时包括要查找的两个词:
 
-Now we run the following `bool` query:
+现在运行以下 `bool` 查询:
 
 [source,js]
 --------------------------------------------------
@@ -42,7 +38,7 @@ Now we run the following `bool` query:
 --------------------------------------------------
 // SENSE: 110_Multi_Field_Search/15_Best_fields.json
 
-And we find that this query gives document 1 the higher score:
+但是我们发现查询的结果是文档 1 的评分更高:
 
 [source,js]
 --------------------------------------------------
@@ -68,34 +64,25 @@ And we find that this query gives document 1 the higher score:
 }
 --------------------------------------------------
 
-To understand why, think about how the `bool` query ((("bool query", "relevance score calculation")))((("relevance scores", "calculation in bool queries")))calculates its score:
+为了理解导致这样的原因,((("bool query", "relevance score calculation")))((("relevance scores", "calculation in bool queries")))需要回想一下 `bool` 是如何计算评分的:
 
-1. It runs both of the queries in the `should` clause.
-2. It adds their scores together.
-3. It multiplies the total by the number of matching clauses.
-4. It divides the result by the total number of clauses (two).
+1. 它会执行 `should` 语句中的两个查询。
+2. 加和两个查询的评分。
+3. 乘以匹配语句的总数。
+4. 除以所有语句总数(这里为:2)。
 
-Document 1 contains the word `brown` in both fields, so both `match` clauses
-are successful and have a score. Document 2 contains both `brown` and
-`fox` in the `body` field but neither word in the `title` field. The high
-score from the `body` query is added to the zero score from the `title` query,
-and multiplied by one-half, resulting in a lower overall score than for document 1.
+文档 1 的两个字段都包含 `brown` 这个词,所以两个 `match` 语句都能成功匹配并且有一个评分。文档 2 的 `body` 字段同时包含 `brown` 和 `fox` 这两个词,但 `title` 字段没有包含任何词。这样, `body` 查询结果中的高分,加上 `title` 查询中的 0 分,然后乘以二分之一,就得到比文档 1 更低的整体评分。
+
+在本例中, `title` 和 `body` 字段是相互竞争的关系,所以就需要找到单个 _最佳匹配_ 的字段。
+
+如果不是简单将每个字段的评分结果加在一起,而是将 _最佳匹配_ 字段的评分作为查询的整体评分,结果会怎样?这样返回的结果可能是: _同时_ 包含 `brown` 和 `fox` 的单个字段比反复出现相同词语的多个不同字段有更高的相关度。
 
-In this example, the `title` and `body` fields are competing with each other.
-We want to find the single _best-matching_ field.
 
-What if, instead of combining the scores from each field, we used the score
-from the _best-matching_ field as the overall score for the query? This would
-give preference to a single field that contains _both_ of the words we are
-looking for, rather than the same word repeated in different fields.
 
 [[dis-max-query]]
-==== dis_max Query
+==== dis_max 查询
 
-Instead of the `bool` query, we can use the `dis_max` or _Disjunction Max
-Query_. Disjunction means _or_((("dis_max (disjunction max) query"))) (while conjunction means _and_) so the
-Disjunction Max Query simply means _return documents that match any of these
-queries, and return the score of the best matching query_:
+不使用 `bool` 查询,可以使用 `dis_max` 即分离 _最大化查询(Disjunction Max Query)_ 。分离(Disjunction)的意思是 _或(or)_ ,这与可以把结合(conjunction)理解成 _与(and)_ 相对应。分离最大化查询(Disjunction Max Query)指的是: _将任何与任一查询匹配的文档作为结果返回,但只将最佳匹配的评分作为查询的评分结果返回_ :
 
 [source,js]
 --------------------------------------------------
@@ -112,7 +99,7 @@ queries, and return the score of the best matching query_:
 --------------------------------------------------
 // SENSE: 110_Multi_Field_Search/15_Best_fields.json
 
-This produces the results that we want:
+得到我们想要的结果为:
 
 [source,js]
 --------------------------------------------------
@@ -137,4 +124,3 @@ This produces the results that we want:
 ]
 }
 --------------------------------------------------
-
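Note on the scoring logic this diff translates: the four `bool` steps (run both `should` clauses, add their scores, multiply by the number of matching clauses, divide by the total number of clauses) and the `dis_max` alternative (take the best-matching clause's score) can be sketched in a few lines of Python. This is only an illustration of the arithmetic as the text describes it, not Elasticsearch's actual implementation. The function names and the per-clause scores are hypothetical, chosen so that document 1 matches weakly in both fields and document 2 matches strongly in `body` only.

```python
def bool_should_score(clause_scores):
    """Combine clause scores following the four steps in the text."""
    matching = [s for s in clause_scores if s > 0]  # clauses that matched
    if not matching:
        return 0.0
    # sum of scores * number of matching clauses / total number of clauses
    return sum(matching) * len(matching) / len(clause_scores)


def dis_max_score(clause_scores):
    """Use the score of the single best-matching clause."""
    return max(clause_scores, default=0.0)


# Hypothetical per-clause scores, ordered as [title, body].
doc1 = [0.5, 0.25]  # `brown` appears in both fields
doc2 = [0.0, 1.0]   # `brown` and `fox` both appear in `body` only

print(bool_should_score(doc1), bool_should_score(doc2))  # 0.75 0.5
print(dis_max_score(doc1), dis_max_score(doc2))          # 0.5 1.0
```

With these made-up numbers the `bool` combination ranks document 1 first, while taking the maximum ranks document 2 first, which is the behaviour the translated passage explains.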