Unexpected behavior of flattened field with changed ignore_above #48907

Closed
@kemics

Elasticsearch version (bin/elasticsearch --version):

docker.elastic.co/elasticsearch/elasticsearch:7.4.2

Plugins installed: []

JVM version (java -version):

docker.elastic.co/elasticsearch/elasticsearch:7.4.2

OS version (uname -a if on a Unix-like system):

docker.elastic.co/elasticsearch/elasticsearch:7.4.2

Description of the problem including expected versus actual behavior:

Steps to reproduce:

  1. Create an index with a flattened field.
PUT /test_index2
{
    "aliases": {},
    "mappings": {
        "properties": {
            "f": {
                "type": "flattened"
            }
        }
    }
}
  2. Change ignore_above from its default value.
PUT /test_index2/_mapping/
{
    "properties": {
        "f": {
            "type": "flattened",
            "ignore_above": 100
        }
    }
}
  3. Try to index a document whose field value is longer than 32766 bytes. I expect the ignore_above setting to prevent this field value from being indexed.

This request is attached (because it is too long): step_3.txt
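
Since the attached request is not reproduced here, a body of the same shape could be generated with a short script. This is only a sketch: the inner key name `key` and the repeated-digit content are assumptions and not the contents of step_3.txt. Only the 63648-byte length and the 32766-byte limit are taken from the error message.

```python
import json

MAX_TERM_BYTES = 32766  # per-term byte limit quoted in the error message
VALUE_LENGTH = 63648    # value length reported in the error message


def build_long_document(length=VALUE_LENGTH, key="key"):
    """Return a document whose flattened field "f" holds one oversized leaf value.

    The key name and the digit pattern are hypothetical placeholders.
    """
    value = ("1234567890" * (length // 10 + 1))[:length]
    return {"f": {key: value}}


doc = build_long_document()
body = json.dumps(doc)  # request body for the indexing call
value_bytes = len(doc["f"]["key"].encode("utf-8"))
print(value_bytes)                   # 63648
print(value_bytes > MAX_TERM_BYTES)  # True
```

The resulting `body` could then be sent as the payload of a request such as PUT /test_index2/_doc/1, where the document ID 1 is also an assumption.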

Provide logs (if relevant):

{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "Document contains at least one immense term in field=\"f\" (whose UTF8 encoding is longer than the max length 32766), all of which were skipped.  Please correct the analyzer to not produce such terms.  The prefix of the first immense term is: '[49, 50, 51, 52, 53, 54, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 49, 50, 51, 52]...', original message: bytes can be at most 32766 in length; got 63648"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "Document contains at least one immense term in field=\"f\" (whose UTF8 encoding is longer than the max length 32766), all of which were skipped.  Please correct the analyzer to not produce such terms.  The prefix of the first immense term is: '[49, 50, 51, 52, 53, 54, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 49, 50, 51, 52]...', original message: bytes can be at most 32766 in length; got 63648",
    "caused_by": {
      "type": "max_bytes_length_exceeded_exception",
      "reason": "bytes can be at most 32766 in length; got 63648"
    }
  },
  "status": 400
}

So I expected the changed ignore_above value to prevent this error, but the error still occurs. By contrast, these examples work as expected:
example_with_keyword.txt
example_with_set_ignore_above.txt
example_with_smaller_len.txt
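
The expectation behind this report can be modelled as a small filter. This is a sketch of the behaviour the report expects, not Elasticsearch's actual implementation. The function name `indexable_leaf_values` is hypothetical, and the sketch assumes that ignore_above is compared against the character length of each leaf value.

```python
MAX_TERM_BYTES = 32766  # per-term byte limit quoted in the error message


def indexable_leaf_values(leaves, ignore_above):
    """Keep only the leaf values that are short enough to index.

    Assumption: values longer than ignore_above characters are skipped
    silently instead of being passed on to the index writer.
    """
    return [v for v in leaves if len(v) <= ignore_above]


leaves = ["short value", "x" * 63648]
kept = indexable_leaf_values(leaves, ignore_above=100)
print(kept)  # ['short value']

# With ignore_above=100, every kept value is far below the byte limit,
# so under this model the oversized value should never trigger the error.
print(all(len(v.encode("utf-8")) <= MAX_TERM_BYTES for v in kept))  # True
```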

Labels

:Search Foundations/Mapping (Index mappings, including merging and defining field types)
>bug
Team:Search Foundations (Meta label for the Search Foundations team in Elasticsearch)
