Word_delimiter and Word_delimiter_graph not working with numbers and spaces  #33710

Closed
@xDouglasx

Describe the feature:

Elasticsearch version (bin/elasticsearch --version): 6.4.0

Plugins installed: []

JVM version (java -version): 1.8.0_181

Windows, Docker 18.06.1

Description of the problem including expected versus actual behavior:

The index settings and mapping below work in version 5.6.6 but fail when run against version 6.4.0.

Hi, I'm upgrading Elasticsearch from 5.6.6 to 6.4.0, running as a Docker image.

When I ingest data into Elasticsearch, I run into a problem with word_delimiter.

According to ticket #28474, changing word_delimiter to word_delimiter_graph should fix the problem, but I made that change and the problem remains.

Analyzer:

"nameAnalyzer": {
  "type": "custom",
  "filter": [
    "lowercase",
    "trim",
    "word_delimiter_graph",
    "name_stopwords",
    "my_ascii_folding"
  ],
  "tokenizer": "nameNGram"
},

Data:

{"dateOfBirth":"1988-12-21","yearOfBirth":"1988","monthOfBirth":"12","dayOfBirth":"21","lastName":"Juan Humara","firstName":"Romeo","middleName":"H","suffix":""}

Logs:

{
  "took": 14,
  "errors": true,
  "items": [
    {
      "index": {
        "_index": "my-index",
        "_type": "my_files",
        "_id": "9PyI12UB7zhqwOMdQng1",
        "status": 400,
        "error": {
          "type": "illegal_argument_exception",
          "reason": "startOffset must be non-negative, and endOffset must be >= startOffset, and offsets must not go backwards startOffset=4,endOffset=7,lastStartOffset=5 for field 'lastName'"
        }
      }
    }
  ]
}
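
For readers unfamiliar with the rule quoted in the error reason, here is a minimal Python sketch of the three conditions it names. The function name `first_offset_violation` is hypothetical and this is not the Elasticsearch or Lucene code. Only the last pair of offsets (4, 7) and the previous start offset of 5 come from the log above; the earlier pairs are made up for illustration.

```python
def first_offset_violation(offsets):
    """Return the first (start, end, last_start) that breaks the rule, or None.

    `offsets` is a list of (start_offset, end_offset) pairs in the order
    the tokens are emitted by the analysis chain.
    """
    last_start = 0
    for start, end in offsets:
        # The three conditions named in the error reason:
        # non-negative start, end >= start, start never moves backwards.
        if start < 0 or end < start or start < last_start:
            return (start, end, last_start)
        last_start = start
    return None


# (3, 4) and (5, 6) are illustrative; (4, 7) after a start of 5 is from the log.
print(first_offset_violation([(3, 4), (5, 6), (4, 7)]))  # (4, 7, 5)
```

The returned tuple matches the values in the log: startOffset=4, endOffset=7, lastStartOffset=5.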

This happens because the last name consists of two names, "Juan Humara", with a space between them.
If I remove the space and use "JHumara", it works perfectly.
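
One plausible way the space could lead to backward offsets is sketched below in Python. This is a simplified model, not Lucene's implementation, and it rests on assumptions that the issue does not confirm: that `nameNGram` emits 3-character grams (its definition is not shown), that a trimmed token keeps the offsets of the untrimmed gram, and that a gram with an inner space is split into parts with their own offsets. All function names are hypothetical.

```python
def char_ngrams(text, n):
    """Yield (term, start, end) for every n-character window, left to right."""
    for start in range(len(text) - n + 1):
        yield text[start:start + n], start, start + n


def trim_then_split(term, start, end):
    """Simplified stand-in for the trim and word-delimiter steps."""
    trimmed = term.strip()
    if trimmed != term:
        # Assumption: trimming shortens the term but keeps the gram's offsets.
        return [(trimmed, start, end)] if trimmed else []
    parts, pos = [], 0
    for piece in term.split(" "):
        if piece:
            # Assumption: each part of a split gram gets its own offsets.
            parts.append((piece, start + pos, start + pos + len(piece)))
        pos += len(piece) + 1
    return parts


def first_backward_offset(text, n=3):
    """Return (token, start, end, last_start) for the first backward step."""
    last_start = 0
    for term, start, end in char_ngrams(text.lower(), n):
        for token, tok_start, tok_end in trim_then_split(term, start, end):
            if tok_start < last_start:
                return token, tok_start, tok_end, last_start
            last_start = tok_start
    return None


print(first_backward_offset("Juan Humara"))  # ('hu', 4, 7, 5)
print(first_backward_offset("JHumara"))      # None
```

Under these assumptions the gram "n h" is split into "n" (3 to 4) and "h" (5 to 6), and the next gram " hu" is trimmed to "hu" but still starts at 4, which is behind 5. The numbers happen to agree with the log, but the real tokenizer settings would be needed to confirm the cause.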
