A multi match cross_fields query omits field boosts if there is only one field in that group #37551

Closed
@reupen

Elasticsearch version (bin/elasticsearch --version):

Version: 6.5.4, Build: default/deb/d2ef93d/2018-12-17T21:17:40.758843Z, JVM: 1.8.0_191

Plugins installed: []

JVM version (java -version):

openjdk version "1.8.0_191"
OpenJDK Runtime Environment (build 1.8.0_191-8u191-b12-0ubuntu0.18.04.1-b12)
OpenJDK 64-Bit Server VM (build 25.191-b12, mixed mode)

OS version (uname -a if on a Unix-like system):

Linux xxx 4.15.0-43-generic #46-Ubuntu SMP Thu Dec 6 14:45:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

Description of the problem including expected versus actual behaviour:

A multi_match query of type cross_fields ignores a field's boost when that field is the only one in its analyser group.

Steps to reproduce:

Set up an index with a couple of documents:

curl -XPUT "localhost:9200/cross-fields-test?pretty" -H 'Content-Type: application/json' -d'
{
  "mappings": {
    "_doc": {
      "properties": {
        "name": {
          "type": "text"
        },
        "category": {
          "type": "keyword"
        },
        "grouping": {
          "type": "keyword"
        }
      }
    }
  }
}
'

curl -XPUT "localhost:9200/cross-fields-test/_doc/1?pretty" -H 'Content-Type: application/json' -d'
{
    "name" : "red red",
    "category" : "blue",
    "grouping" : "green"
}
'

curl -XPUT "localhost:9200/cross-fields-test/_doc/2?pretty" -H 'Content-Type: application/json' -d'
{
    "name" : "blue",
    "category" : "red",
    "grouping" : "green"
}
'
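
The searches only see these documents once the index has been refreshed. Rather than waiting for the periodic refresh (one second by default), a refresh can be forced through the refresh API. This is a minimal sketch, assuming the same local node and index name as above; the INDEX variable, the --max-time limit and the fallback message are additions for illustration only:

```shell
# Index name used throughout this report.
INDEX="cross-fields-test"

# Force a refresh so the two documents are searchable immediately.
# The fallback message only fires if curl itself fails (e.g. no node running).
curl -s --max-time 10 -XPOST "localhost:9200/$INDEX/_refresh?pretty" \
  || echo "refresh request failed (is Elasticsearch running on localhost:9200?)"
```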

Wait for the index to be refreshed, and run these two queries:

curl -XGET "localhost:9200/cross-fields-test/_search?pretty" -H 'Content-Type: application/json' -d'
{
    "query": {
        "multi_match": {
            "query": "red",
            "fields" : ["name", "category^100"],
            "type": "cross_fields",
            "operator": "and"
        }
    }
}
'


curl -XGET "localhost:9200/cross-fields-test/_search?pretty" -H 'Content-Type: application/json' -d'
{
    "query": {
        "multi_match": {
            "query": "red",
            "fields" : ["name", "grouping", "category^100"],
            "type": "cross_fields",
            "operator": "and"
        }
    }
}
'

In the first case, the result is similar to the following:

{
  "took" : 9,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 0.39556286,
    "hits" : [
      {
        "_index" : "cross-fields-test",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.39556286,
        "_source" : {
          "name" : "red red",
          "category" : "blue",
          "grouping" : "green"
        }
      },
      {
        "_index" : "cross-fields-test",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 0.2876821,
        "_source" : {
          "name" : "blue",
          "category" : "red",
          "grouping" : "green"
        }
      }
    ]
  }
}

And in the second case:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 28.76821,
    "hits" : [
      {
        "_index" : "cross-fields-test",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 28.76821,
        "_source" : {
          "name" : "blue",
          "category" : "red",
          "grouping" : "green"
        }
      },
      {
        "_index" : "cross-fields-test",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.39556286,
        "_source" : {
          "name" : "red red",
          "category" : "blue",
          "grouping" : "green"
        }
      }
    ]
  }
}
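
For reference, document 1's score is identical in both responses (0.39556286), while document 2's score differs by a factor that matches the ^100 boost on category. The ratio can be checked directly from the two scores above; this is only arithmetic on the reported numbers, not an explanation of how the scores are computed:

```shell
# Ratio of document 2's score in the second response to its score in the first.
ratio=$(awk 'BEGIN { printf "%.2f", 28.76821 / 0.2876821 }')
echo "$ratio"   # prints 100.00
```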

Adding a second keyword field (which does not contain the search term) changes the scores significantly, and the second document is now ranked first (as desired). Adding &explain=true to the URL of the first query appears to confirm that the boost is ignored in that case.
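
The explain check mentioned above can be reproduced roughly as follows. This is a sketch that reuses the first query unchanged and assumes the same local node; the QUERY variable, the --max-time limit and the fallback message are additions for illustration. With explain=true, each hit in the response carries an _explanation section describing how its score was computed:

```shell
# The first query from the steps to reproduce, unchanged.
QUERY='{
    "query": {
        "multi_match": {
            "query": "red",
            "fields" : ["name", "category^100"],
            "type": "cross_fields",
            "operator": "and"
        }
    }
}'

# Same search as before, with explain=true appended to the URL.
curl -s --max-time 10 -XGET "localhost:9200/cross-fields-test/_search?pretty&explain=true" \
  -H 'Content-Type: application/json' -d "$QUERY" \
  || echo "search request failed (is Elasticsearch running on localhost:9200?)"
```

Running the same command with "grouping" added to the fields list gives the explanation for the second query, so the two can be compared side by side.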

Perhaps the rationale is that a boost looks redundant when a group contains only one field? However, as the results above show, the boost would still affect the order of the results.
