Bug: When using graph synonym and stop token filter together #28838

@aslamy

Elasticsearch 6.2.0

Description:
When the stop and synonym_graph token filters are used together in the search analyzer, a document that should match is not returned, and the highlighter does not highlight as expected.

Steps to reproduce:

Mapping

{  
   "settings":{  
      "analysis":{  
         "analyzer":{  
            "english_analyzer":{  
               "type":"custom",
               "filter":[  
                  "lowercase",
                  "english_stopwords_tokenfilter"
               ],
               "tokenizer":"standard"
            },
            "english_search_analyzer":{  
               "type":"custom",
               "filter":[  
                  "lowercase",
                  "synonym_graph_tokenfilter",
                  "english_stopwords_tokenfilter"
               ],
               "tokenizer":"standard"
            }
         },
         "filter":{  
            "english_stopwords_tokenfilter":{  
               "type":"stop",
               "stopwords":"_english_"
            },
            "synonym_graph_tokenfilter":{  
               "type":"synonym_graph",
               "synonyms":[  
                  "world of war, wow"
               ]
            }
         }
      }
   },
   "mappings":{  
      "doc":{  
         "properties":{  
            "title":{  
               "type":"text",
               "analyzer":"english_analyzer",
               "search_analyzer":"english_search_analyzer"
            }
         }
      }
   }
}

Indexing 3 documents

{  "title":"world of war"}
{  "title":"wow"}
{  "title":"world of war. wow"}
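
To make the index side of the reproduction easier to follow, here is a rough Python sketch of what english_analyzer would emit for the three titles. It is only a model, not Elasticsearch code: the regex tokenizer stands in for the standard tokenizer, ENGLISH_STOPWORDS is a small hand-picked subset of the _english_ list, and the function name is made up for illustration. The point it shows is that a removed stop word still consumes a position, so "war" ends up at position 2.

```python
import re

# Assumption: a small subset of the _english_ stop word list, enough for this example.
ENGLISH_STOPWORDS = {"a", "an", "and", "of", "the", "to"}


def english_analyzer(text):
    """Rough model of english_analyzer: tokenize, lowercase, drop stop words.

    Returns (term, position) pairs. A removed stop word still consumes
    a position, which leaves a gap between the surrounding terms.
    """
    tokens = []
    for position, term in enumerate(re.findall(r"\w+", text.lower())):
        if term not in ENGLISH_STOPWORDS:
            tokens.append((term, position))
    return tokens


for title in ["world of war", "wow", "world of war. wow"]:
    print(title, "->", english_analyzer(title))
```

In this model, "world of war" is indexed as world at position 0 and war at position 2, with no token at position 1.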

Search

{  
   "query":{  
      "match":{  
         "title":"world of war"
      }
   },
   "highlight":{  
      "fields":{  
         "title":{  
            "fragment_size":0,
            "type":"unified"
         }
      }
   }
}

Search Result:

{  
   "took":1,
   "timed_out":false,
   "_shards":{  
      "total":5,
      "successful":5,
      "skipped":0,
      "failed":0
   },
   "hits":{  
      "total":2,
      "max_score":0.2876821,
      "hits":[  
         {  
            "_index":"test",
            "_type":"doc",
            "_id":"2",
            "_score":0.2876821,
            "_source":{  
               "title":"world of war. wow"
            },
            "highlight":{  
               "title":[  
                  "world of war. <em>wow</em>"
               ]
            }
         },
         {  
            "_index":"test",
            "_type":"doc",
            "_id":"1",
            "_score":0.2876821,
            "_source":{  
               "title":"wow"
            },
            "highlight":{  
               "title":[  
                  "<em>wow</em>"
               ]
            }
         }
      ]
   }
}

Problems:
Bug 1. The document { "title":"world of war"} does not match, although it should.
Bug 2. The highlighter does not highlight "world of war" in the document "world of war. wow"; only "wow" is highlighted.

I also tried placing synonym_graph_tokenfilter after english_stopwords_tokenfilter in the search analyzer, but then I get the following error:

{  
   "error":{  
      "root_cause":[  
         {  
            "type":"illegal_argument_exception",
            "reason":"failed to build synonyms"
         }
      ],
      "type":"illegal_argument_exception",
      "reason":"failed to build synonyms",
      "caused_by":{  
         "type":"parse_exception",
         "reason":"Invalid synonym rule at line 1",
         "caused_by":{  
            "type":"illegal_argument_exception",
            "reason":"term: world of war analyzed to a token (war) with position increment != 1 (got: 2)"
         }
      }
   },
   "status":400
}
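
The error message suggests that the synonym rule itself is run through the filters that precede the synonym filter, so the stop filter drops "of" from "world of war" and leaves a gap in front of "war". The Python sketch below models that under the same simplifying assumptions as any toy analyzer: a regex tokenizer, a hand-picked stop word subset, and a hypothetical check_synonym_term function that only imitates the message above, not the real parser.

```python
import re

# Assumption: a small subset of the _english_ stop word list.
ENGLISH_STOPWORDS = {"a", "an", "and", "of", "the", "to"}


def analyze_with_increments(text):
    """Return (term, position_increment) pairs after lowercase + stop filtering."""
    tokens = []
    increment = 1
    for term in re.findall(r"\w+", text.lower()):
        if term in ENGLISH_STOPWORDS:
            # A dropped token widens the gap to the next kept token.
            increment += 1
            continue
        tokens.append((term, increment))
        increment = 1
    return tokens


def check_synonym_term(text):
    """Hypothetical check imitating the error: reject any increment != 1."""
    for term, increment in analyze_with_increments(text):
        if increment != 1:
            raise ValueError(
                f"term: {text} analyzed to a token ({term}) "
                f"with position increment != 1 (got: {increment})"
            )


print(analyze_with_increments("world of war"))
try:
    check_synonym_term("world of war")
except ValueError as err:
    print(err)
```

With the stop filter in front, "war" carries a position increment of 2, which is the condition the error reports. A rule without stop words, such as "wow", passes the same check.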


Labels: :Search Relevance/Analysis, >bug, Team:Search Relevance, priority:normal
