
[Feature Request][Nori] Add custom dictionary terms in index settings #35842

Closed
@kimjmin

Describe the feature:

Currently, the only way to use a custom dictionary with Nori is to save the dictionary to a file and set its path under index/_settings/index/analysis.
https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-nori-tokenizer.html

PUT nori_sample
{
  "settings": {
    "index": {
      "analysis": {
        "tokenizer": {
          "nori_user_dict": {
            "type": "nori_tokenizer",
            "decompound_mode": "mixed",
            "user_dictionary": "userdict_ko.txt"
          }
        },
        "analyzer": {
          "my_analyzer": {
            "type": "custom",
            "tokenizer": "nori_user_dict"
          }
        }
      }
    }
  }
}

Unfortunately, users on Elastic Cloud or ECE have no way to add a custom dictionary to their Elasticsearch cluster. On Elastic Cloud, the custom plugin menu can be used to upload the file, but it takes effect only when the cluster is created, which means the dictionary cannot be refreshed on a production system.

The synonym token filter, by contrast, already lets users define their own entries directly in the index settings.
https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-synonym-tokenfilter.html

PUT /test_index
{
    "settings": {
        "index" : {
            "analysis" : {
                "analyzer" : {
                    "synonym" : {
                        "tokenizer" : "standard",
                        "filter" : ["my_stop", "synonym"]
                    }
                },
                "filter" : {
                    "my_stop": {
                        "type" : "stop",
                        "stopwords": ["bar"]
                    },
                    "synonym" : {
                        "type" : "synonym",
                        "lenient": true,
                        "synonyms" : ["foo, bar => baz"]
                    }
                }
            }
        }
    }
}

We need to add this kind of inline setting to Nori (and to other analysis plugins if needed) so that users on Elastic Cloud can use custom dictionaries in their search use cases.
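
To make the request concrete, here is a rough sketch of what such a request body might look like if the tokenizer accepted its dictionary entries inline, the way the synonym filter accepts `synonyms`. The setting name `user_dictionary_rules` is an assumption made for illustration (this issue does not define a name), as are the helper function and the sample entries; the rest mirrors the `nori_sample` settings above.

```python
import json


def nori_inline_dict_settings(terms, rules_key="user_dictionary_rules"):
    """Build an index-settings body that carries dictionary terms inline.

    `rules_key` is an ASSUMED setting name used only for illustration;
    the issue asks for such a setting but does not name it.
    """
    return {
        "settings": {
            "index": {
                "analysis": {
                    "tokenizer": {
                        "nori_user_dict": {
                            "type": "nori_tokenizer",
                            "decompound_mode": "mixed",
                            # Inline list replaces the file-based "user_dictionary" path.
                            rules_key: list(terms),
                        }
                    },
                    "analyzer": {
                        "my_analyzer": {
                            "type": "custom",
                            "tokenizer": "nori_user_dict",
                        }
                    },
                }
            }
        }
    }


# Sample entries (illustrative): one dictionary term per list item.
body = nori_inline_dict_settings(["c++", "세종시 세종 시"])

# Body that would be sent with `PUT nori_sample`.
print(json.dumps(body, ensure_ascii=False, indent=2))
```

Because the terms travel with the index settings, nothing has to be uploaded to the cluster's file system, which is the part that Elastic Cloud and ECE users cannot do today.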
